CC BY

Dabyltayeva Raikhan Esbergenovna, Candidate of pedagogical sciences
Sadykova Aida Kenesbekovna, Candidate of pedagogical sciences
Aushakhman Assel Talgatovna, MA in Translation studies
University of Foreign Languages and Business Career, Kazakh National University named after Al-Farabi
E-mail: assel.sadyk@gmail.com

Corpus-based translation tools: a new paradigm in translation

Abstract: The article analyzes how the translation process can be optimized with the help of language corpora. Internet-based corpus technologies can be used effectively in teaching translation, as well as in the educational and methodological work of teachers at higher education institutions. Language corpora give access to the meaning of a word in its contextual use and are therefore seen as a necessary addition to the translator's toolkit for building and developing translation competence. The article also considers the advantages of concordancers and translation memory programs, which are intended to raise the translator's productivity.

Keywords: language corpora, concordance, context, translation, translation memory, teaching translation, TM-program.

Introduction

Computer technology plays an important role in modern society: it penetrates every sphere of human activity and forms a global information space. Computerizing the translation process has been one of the most important problems since information technology was first applied in science, and the dream of fully automatic machine translation has accompanied researchers from the very beginning. Translation, however, was originally centred entirely on the human translator and on his or her ability to select the appropriate variant on the basis of experience and a sense of style, so introducing computer tools into the process requires special attention to detail and technology. Success here depends primarily on progress in the study of human thought and verbal communication and on the engineering-linguistic modeling of these processes. Today it is impossible to imagine a translator working without a personal computer, which is used both for the translation itself and for related tasks such as searching for background information and learning the terminology.

The capabilities of corpus in teaching translation

Resources such as electronic dictionaries, encyclopedias and reference books have replaced their paper-based counterparts. The main advantage of digitized resources is that they are simpler and faster to work with, which in turn encourages the rapid development of resources for linguists, and for translators in particular.

Nowadays many studies of the translation process are carried out within corpus linguistics. Prominent in this area are the works of M. Wilkinson, V. N. Shevchuk, N. V. Vladimov and R. K. Koshkin, in which the corpus of electronic texts is seen as a means of identifying and addressing the factors behind the "negative effect" associated with the "inauthenticity" of translated discourse.

"Corpora made up of specialised texts can be a useful source of terminology and content information. In the classroom, comparable corpora can be used to confirm translation hypotheses and to suggest possible solutions to actual translation problems related to a specific text. They can also provide a means to investigate similar domains or subdomains across languages. A specialised comparable corpus can offer information about terminology and concepts, and about the attested-ness of expressions within a certain context" [1].

A corpus of texts is characterized by four basic parameters: 1) it must be sufficiently large; 2) it should be structured or marked up; 3) the texts that make up the corpus should be in electronic form; 4) the notion of an "electronic corpus" includes, as a rule, special software for working with the corpus. The value of the corpus lies in the following:

• a corpus shows language data in their real environment, which makes it possible to explore the lexical and grammatical structure of the language, as well as ongoing language change over a certain period of time;

• a corpus has a representative, or balanced, composition of texts, so it can be used to test search engines, morphological analyzers and translation systems and to support various linguistic studies;

• a corpus is essential to the teaching of translation, since it allows the use of an unfamiliar word or grammatical form to be checked quickly and efficiently.

The effectiveness of information retrieval in a corpus depends on special software, the so-called corpus managers, or concordancers. Electronic text corpora have enabled the effective use of electronic concordances, which open up prospects for modeling the linguistic picture of the world. A concordancer is a specialized language application program that automatically extracts a sample of given language units from electronic texts [2, 77]. Thus, a concordance to a corpus is a list of the word tokens that occur in the corpus, with references to all of their contexts. Dictionaries and concordances differ in representativeness and in their orientation towards the invariant and towards semantic or grammatical analysis.

The function of a concordancer can be compared with the search function of a text editor, but its capabilities are much greater: it analyzes not one text but several texts, or a whole corpus of electronic texts, at a time. Depending on its technical capabilities, a concordancer can provide information about the frequency of use and the compatibility (collocations) of a given language unit, and it also gives access to the specific text in which an example was found.

In other words, a concordancer is a "program that allows analyzing large amounts of texts in order to detect patterns of use of words and expressions in the language. The concordance program searches for the requested word in the corpus and produces a new window with a few fragments of sentences from different texts where the word or phrase is used. From the results of the corpus search, the meaning of the word can be inferred from context and an analysis of its use in the language can be obtained. The search results can be used to clarify usage and to derive rules for the use of certain words and phrases in the language, as well as for the study of the grammatical structure of the language" [3, 12-13].
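The behaviour described above can be illustrated with a minimal keyword-in-context (KWIC) sketch in Python. It is not the implementation of any particular concordancer; the function names `kwic` and `format_kwic`, the `width` parameter and the sample texts are assumptions introduced only for this illustration.

```python
import re

def kwic(texts, keyword, width=30):
    """Return keyword-in-context tuples for every hit of `keyword` in `texts`.

    `texts` maps a (hypothetical) source name to its raw text; each result is
    (source, left context, matched word, right context).
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for source, text in texts.items():
        flat = " ".join(text.split())  # collapse line breaks and repeated spaces
        for match in pattern.finditer(flat):
            left = flat[max(0, match.start() - width):match.start()]
            right = flat[match.end():match.end() + width]
            lines.append((source, left, match.group(0), right))
    return lines

def format_kwic(lines, width=30):
    """Align the keyword in one column and append the source of each fragment."""
    return [f"{left:>{width}} [{word}] {right:<{width}} ({source})"
            for source, left, word, right in lines]

# Invented sample texts, for illustration only.
sample_texts = {
    "text_a": "My heart is heavy. The heart of the matter is simple.",
    "text_b": "She spoke from the heart, and her hearty laugh followed.",
}

hits = kwic(sample_texts, "heart")
for line in format_kwic(hits):
    print(line)
```

Each printed line shows one fragment with the search word in a fixed column and the name of the text it came from, which is the kind of reference back to the source that a text-editor search does not give across several texts at once.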

Picture 1 shows the search results window of the program Concordance 3.3 for the word heart.

Pic. 1. Search results for the word heart in Concordance 3.3 (screenshot: an alphabetical headword list with frequency counts on the left; context lines and their references on the right).

A translator who has access to a corpus can see all the examples of a word or phrase in millions of words of text within a few seconds. Open-ended (constantly growing) monitor corpora play a huge role in dictionary making, as they make it possible to track new words entering the language and existing words that change their meaning or shift their use across styles. The presentation and storage of corpus texts rely on modern computer technology for data storage and processing.

A corpus of texts can be considered one of those tools whose use is, in certain cases, a necessary condition for the analysis of linguistic phenomena. For example, modern electronic corpora contain hundreds of millions of word tokens, which allows us to speak of their viability as a reflection of language competence.

N. V. Vladimov notes that the "basic procedures that are available to the researcher in the analysis of corpus include:

• search for the specified words, phrases in the corpus;

• display of search results, taking into account the features of a specific field;

• counting the number of examples of use of the word in the corpus;

• sorting of search results based on the required parameters.

All these procedures are carried out quickly and accurately by a computer program of concordance compilation (searching for equivalents), which allows researchers to quickly and accurately find what they need" [4, 26].
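The four procedures listed above (search, field-aware display, counting and sorting) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout with a `field` key, the function names and the mini-corpus are invented here and do not come from the cited dissertation or from any concordance program.

```python
import re
from collections import Counter

# Invented mini-corpus: each record carries a subject field and a text.
CORPUS = [
    {"field": "poetry", "text": "The heart knows. Two hearts beat; a hearth glows."},
    {"field": "medicine", "text": "Heart rate rose. The heart was examined."},
]

def search(corpus, prefix, field=None):
    """Find word forms starting with `prefix`, optionally within one subject field."""
    pattern = re.compile(r"\b" + re.escape(prefix) + r"\w*", re.IGNORECASE)
    hits = []
    for record in corpus:
        if field is not None and record["field"] != field:
            continue  # skip records outside the requested field
        for match in pattern.finditer(record["text"]):
            hits.append((match.group(0).lower(), record["field"]))
    return hits

def count_forms(hits):
    """Count how many times each word form was found."""
    return Counter(form for form, _ in hits)

def sort_by_frequency(counts):
    """Sort word forms by descending frequency, then alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

all_hits = search(CORPUS, "heart")
print(sort_by_frequency(count_forms(all_hits)))
print(search(CORPUS, "heart", field="medicine"))
```

The first call produces a headword list with frequencies, and the second restricts the same search to one subject field; real corpus managers offer far richer query languages and sort keys.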

The possibility of using the Internet itself as a corpus of texts is now gaining wide popularity. This became possible with the advent of online web concordancers, the so-called "Web as Corpus" approach. This resource has a number of disadvantages, and it is less effective than thematic corpora that work offline. However, when time is short and no specialized corpus is available, ready-made online concordancers can serve as a source of reliable linguistic information.

A virtual (specialized) corpus is a large selection of texts on a specific topic, compiled specially so that translators can find certain linguistic information. The texts are taken from various sources (periodicals, encyclopedias, the Internet) within a strictly defined category and are always presented in electronic form. Virtual corpora compiled by a translator on specific topics can help him or her in the following cases:

• to determine the lexical and grammatical compatibility of words;

• to choose among several lexical equivalents of the source word offered by different dictionaries or the Internet;

• to validate decisions that the translator has made intuitively;

• to find additional encyclopedic information on the subject;

• to find terminological doublets, antonyms, holonyms and meronyms, and to identify the names and definitions of terms [5, 52-57].
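One of the uses listed above, choosing among several candidate equivalents, can be approximated by checking how often each candidate is attested in a specialized corpus. The Python sketch below assumes a tiny invented corpus and candidate list; the function names `phrase_frequency` and `rank_candidates` are hypothetical, and raw frequency is only one possible criterion.

```python
import re

def phrase_frequency(corpus_texts, phrase):
    """Count whole-phrase occurrences of `phrase` across all corpus texts."""
    words = [re.escape(word) for word in phrase.split()]
    # Allow any amount of whitespace between the words of the phrase.
    pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(text)) for text in corpus_texts)

def rank_candidates(corpus_texts, candidates):
    """Rank candidate equivalents by attested frequency, most frequent first."""
    counts = {c: phrase_frequency(corpus_texts, c) for c in candidates}
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Invented specialized mini-corpus and candidate list, for illustration only.
specialised_corpus = [
    "A translation memory stores segments. The translation memory is reused.",
    "Each translation   memory entry pairs a source and a target segment.",
]
candidates = ["translation memory", "memory of translation", "translating memory"]

for candidate, frequency in rank_candidates(specialised_corpus, candidates):
    print(f"{candidate}: {frequency}")
```

A candidate that never occurs in a sufficiently large specialized corpus is probably not an established term, although the translator still has to inspect the contexts before deciding.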

Picture 2 shows an example of search results for the word translation in the web-based version of the British National Corpus.

Pic. 2.

Many researchers in corpus linguistics point out that a corpus gives the translator the opportunity to find their bearings in the language as it is actually used and to solve a number of linguistic and extra-linguistic problems in the translation process. V. N. Shevchuk notes: «This powerful and reliable electronic resource in practice replaces a real native speaker and becomes a so-called "virtual native speaker", the use of which undoubtedly contributes to the quality of translation. The corpus gives a clear idea of the lexical, grammatical, stylistic, spelling and punctuation rules operating in a modern language» [5].

Conclusion

For the purposes of teaching translation, a corpus of texts can be seen as a source of information that also provides samples of professional translation for the study of translation methods and techniques. The analysis of text corpora and the methods and achievements of corpus linguistics are a promising direction in the teaching of foreign languages and translation, and world practice in this field proves their effectiveness. The toolkit that a corpus offers for working with words and expressions saves the translator's time in finding equivalents and contributes to the accumulation of knowledge of a communicative-heuristic character, allowing the translator to learn the language comprehensively and in context, which is an effective strategy for addressing translation problems.

We believe that a corpus is a more convenient and reliable tool than a dictionary for several reasons. First, a corpus of texts is not a fixed structure like a traditional dictionary but a constantly updated database; by analyzing it, the translator can keep up with the latest trends in the development of the language. Secondly, a corpus is a much richer source of word tokens than a dictionary because it is much larger in volume, and the information about a word that can be obtained from it is more objective and accurate. Thirdly, working with a corpus is much easier than working with a dictionary. The corpus is stored on a computer, which is now part of the translator's workplace, and with a few simple operations the translator can obtain the required linguistic information within seconds.

References:

1. Pearson J. Teaching Terminology using Electronic Resources. S. Botley, J. Glass, T. McEnery and A. Wilson (Eds). Proceedings of Teaching and Language Corpora. - UCREL technical Papers, Lancaster: UCREL, 1996. - P. 203-216.

2. Bovtenko M. A. Computer Linguodidactics: Textbook / M. A. Bovtenko. - M.: Flinta Science, 2005. - 216 p.

3. Sysoev P. V. Foreign languages at school. - M.: LLP "Methodological mosaic", 2010. - Vol. 4. - P. 12-13.

4. Vladimov N. V. Corpus approach to solving the problems of translation: on the material of translation from Russian into English: dissertation ... candidate of philological sciences: 10.02.19. - Moscow, 2005. - P. 26.

5. Shevchuk V. Electronic resources of an interpreter: Help for a beginner translator. - M.: Librayt, 2010. - P. 45-57.
