Lichargin Dmitrii Viktorovich, Lade Aleksandra Vadimovna, Safonov Konstantin Vladimirovich
APPLYING CONLANGS TO THE NATURAL LANGUAGES GENERATION BASED ON GENERATIVE GRAMMARS
The article analyses tendencies in the use of planned languages for various purposes: as an intermediary language between different peoples, as a language for modeling natural languages, in connection with the exploration of a new linguistic reality, for club and internet communication, and for developing abilities and world perception. Special attention is paid to the syntagmatic structure of planned languages as applied to the analysis of natural language structure. Based on an analysis of the classes of planned languages and on the interlinguistic approach, which concentrates on the study of international artificial languages as one means of overcoming the language barrier, it is concluded that planned languages of international communication can be used in art, as a hobby, as a means of modeling and analysing natural languages and, most importantly, for generating meaningful speech and performing language transformations. Article address: www.gramota.net/materials/2/2016/12-3/35.html
Source
Philological Sciences. Issues of Theory and Practice
Tambov: Gramota, 2016. № 12(66): in 4 parts. Part 3. P. 128-133. ISSN 1997-2911.
Journal address: www.gramota.net/editions/2.html
Contents of this issue: www.gramota.net/materials/2/2016/12-3/
© Gramota Publishing House
UDC 811.92
The article is devoted to the analysis of tendencies in the use of planned languages in various roles: as interlinguas between different nationalities, as languages for modeling natural languages, as a way of learning a new linguistic reality, for club and internet communication, and for the development of skills and of one's attitude to the world. Special attention is drawn to the analysis of the syntagmatic order of planned languages as applied to natural language structure analysis. Based on the analysis of planned language classes and on the interlinguistic approach, which focuses on the study of international artificial languages as a means of overcoming the language barrier, the conclusion is drawn that planned international languages can be used in art, as a hobby, as tools for modeling and analysing natural languages and, most importantly, for generating meaningful speech and performing linguistic transformations.
Key words and phrases: interlinguistics; planned languages; philosophical languages; natural languages modeling; generative grammars.
Lichargin Dmitrii Viktorovich, Ph. D. in Technical Sciences
Lade Aleksandra Vadimovna
The Institute of Space and Information Technologies, Siberian Federal University [email protected]; [email protected]
Safonov Konstantin Vladimirovich, Doctor in Physics and Mathematical Sciences
Siberian State Aerospace University named after M. F. Reshetnev [email protected]
APPLYING CONLANGS TO THE NATURAL LANGUAGES GENERATION BASED ON GENERATIVE GRAMMARS
I. INTRODUCTION
This work considers the problem of creating and applying constructed languages in order to generate, simulate and analyse natural languages.
Nowadays, systems such as machine translators, expert systems that support dialogue with the user, automatic text summarizers, and tools for extracting information from natural language texts are widespread and under active development.
The problem of creating and applying constructed languages is solved at the intersection of such disciplines as linguistics, interlinguistics, computational linguistics, logic, philosophy and psychology.
The construction and application of constructed languages has long been widely studied by different authors, in particular by E. A. Bokarev, Z. N. Verdiyeva, P. N. Denisov, A. D. Dulichenko, M. I. Isayev, A. M. Kondratov, F. A. Litvin, O. N. Seliverstova [2-8; 14].
However, applying more effective planned languages, in particular to natural language modeling, requires further research in the theory of classification, in the vectorization of class hierarchies, and in applying the law of the excluded middle when constructing a semantic classification.
The purpose of this paper is to construct a comparative classification and description of certain planned languages, to describe their advantages and spheres of application, and to apply them to the problem of generating meaningful subsets of natural language.
The basic idea is to compare some classes of planned languages and to formulate a principle for applying them as a tool for generating meaningful subsets of natural languages such as English, Russian and others.
The novelty of the paper is the use of the model presented in [9] as a principle for creating a constructed language that clearly defines the semantic classification of words and can therefore serve as the language of the non-terminal symbols of a Chomsky generative grammar.
II. THE PROBLEM OF CREATION AND APPLICATION OF CONSTRUCTED LANGUAGES
People have always sought ways to facilitate and simplify communication between the peoples of the planet and to overcome the barriers between nations, such as linguistic and cultural ones. Nowadays various models of languages are widespread and under development. A great number of conlangers (creators of artificial languages) are working now and have worked in the past. Each conlanger reflects a philosophy, views, ideas and projects in his or her own language. The most famous and important planned (artificial) languages, such as Esperanto, Toki Pona, Pandunia, Interlingua and Lojban, are mentioned in this work. The process of creating artificial languages has been studied by different authors, in particular Jeff Burke, Sally Caves, John E. Clifford and A. Wierzbicka.
Their work addresses the most comprehensible forms of international communication and underlines the importance of a simple universal tool of international communication, considered from the point of view of modeling the linguistic capacities of natural languages. Constructed languages are disseminated by individual enthusiasts, and this process is an interesting object of study for interlinguistics. However, the classification and systematization of the various types of constructed languages requires further research in the theory of classification, logic, semantics, semiotics, philosophy and other disciplines.
The idea of creating interlingua systems, which serve as intermediate languages for machine translation, has recently dominated the field of machine translation. It means that every existing language should be translated into the interlingua and back from it. As a result, a translation from any language A to any language B can be obtained through the chain A-I-B, where I is the interlingua. One of the most famous interlingua projects, the Distributed Language Translation system, was developed in Holland from 1979 and was closed in 1992. As its continuation, a new interlingua project, UNL (Universal Network Language), initiated in Japan, was launched in 1997. This project is directly related to the task of online document translation. It assigned to words the attributes that distinguish their different lexical-semantic variants. Promt and Google apply these ideas nowadays.
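The A-I-B chain can be illustrated with a minimal Python sketch. The concept labels (HOUSE, WATER), the three tiny lexicons and the function names are hypothetical and serve only as an illustration; real interlingua systems such as those named above are far richer. The sketch also shows why a pivot is attractive: n languages need only 2n translators to and from the interlingua instead of n(n-1) direct ones.

```python
# Toy illustration of pivot translation through an interlingua: A -> I -> B.
# The concept labels and lexicons below are hypothetical examples.

# Each language only needs a mapping from its words to interlingua concepts.
TO_INTERLINGUA = {
    "en": {"house": "HOUSE", "water": "WATER"},
    "ru": {"дом": "HOUSE", "вода": "WATER"},
    "de": {"Haus": "HOUSE", "Wasser": "WATER"},
}

# The reverse direction (interlingua -> language) is derived automatically.
FROM_INTERLINGUA = {
    lang: {concept: word for word, concept in lexicon.items()}
    for lang, lexicon in TO_INTERLINGUA.items()
}


def translate(word, source, target):
    """Translate one word along the chain source -> interlingua -> target."""
    concept = TO_INTERLINGUA[source][word]
    return FROM_INTERLINGUA[target][concept]


def translator_counts(n):
    """Return (direct pairwise translators, translators via a pivot) for n languages."""
    return n * (n - 1), 2 * n
```

For instance, `translate("дом", "ru", "de")` returns `"Haus"`, and `translator_counts(10)` returns `(90, 20)`.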
The application of a simple and clearly structured artificial language as a language for modeling natural languages such as English, German, French, Russian, Chinese and so on is often discussed. The principles of such models have been discussed by linguists specializing in formal models of natural languages.
The question of designing an international artificial language free of polysemy and conceptual conventions was especially urgent in the late 19th century. In total there are now more than 1,000 projects of such languages. The only language of this kind that has spread relatively widely among enthusiasts is Esperanto. Today it consists of about 1900 roots (mainly of Romance origin). Esperanto was based on the similarities of many European languages, and as a result it contains many words that resemble words of those languages. The grammar of Esperanto is characterized by logic and simplicity, and derivation is extensive. There are other artificial languages, such as Basic English, which contains 1,000 words (originally 850) and describes the entire vocabulary with phrases of modified English. Today one of the Wikipedia editions is written in it.
Apart from their traditional purpose, artificial languages of international communication can be applied in the following spheres:
1) for the analysis of language;
2) for the communication with the computer programs and their development.
III. THE CLASSIFICATION OF ARTIFICIAL LANGUAGES
To understand the long history of the idea of an auxiliary language and a century of experience with its use, it is necessary to give an adequate philosophical and linguistic-sociological analysis of this problem. One possible classification is given below; it demonstrates the diversity of artificial languages and supports the analysis of a particular class of planned languages.
Figure 1. The classification of different types of languages
1.2. Information languages: programming, mark-up, file structure languages, etc.;
2.3.1.1. Philosophical and logical languages are languages that have an exact structure of word-formation and syntax:
2.3.1.2. «The language of minimalism ideals» is a development of the philosophical language:
2.3.1.3. A language may also be created for an experiment, for example to test the Sapir-Whorf hypothesis (that the language spoken by a people limits their consciousness and confines it within certain bounds):
2.3.1.6. Artistic or aesthetic languages are created for aesthetic and creative enjoyment
3. Picture languages are methods and patterns of matching pictures in order to carry a verbal message:
4. The languages of linguistic-historical hypotheses, which model possible historical trajectories of the formation of different natural languages on the basis of natural language modeling systems.
IV. VARIOUS FUNCTIONS OF CONLANGS IN GENERAL
Most conlangs circulate only on specialized pages of social networks and are simply the activity of individual enthusiasts; they often find no practical application.
Table I shows the semes of one of the conlangs, which are needed to describe the lexical and grammatical structure of any natural language in some approximation. The examples in the tables below were developed by the authors to solve the problem of meaningful speech generation (see Tables I, II) [Ibid., p. 83].
From the point of view of some researchers, globalization requires a universal means of communication between cultures. Here there is a risk of reducing cultural diversity through discrimination against national languages in favour of one universal language of mankind, for example English. From this perspective, the development of conlangs such as Esperanto is meant to preserve national cultures and to eliminate the discrimination caused by the dominance of English.
However, this can lead to the depersonalization of local cultures and to the creation of cosmopolitan values modeled not on any single country but on anonymity and a lack of connection with historical and cultural values. It can also lead to the artificial formation of a conlang based on averaging the culture-specific concepts of other languages, or on dull and soulless logical schemes and structures.
On the other hand, planned languages simulate the behavior of many possible language patterns, in terms of both the universals and the individual characteristics of languages, as examples of the possible states of the space of languages with regard to grammar, semantics, phonetic systems and other aspects of language.
History confirms the importance of precisely the pragmatic aspect of conlangs, as the example of the most popular conlang, Esperanto, shows. To some listeners, Esperanto words do not sound euphonious, and they are not motivated in most languages. Some people even consider Esperanto a set of random words from different languages; it is not clear why one word is closer to French and another to Russian. Modern computer systems can select international vocabulary with much better quality than was proposed in Esperanto by its creator, L. L. Zamenhof. Many conlangs have clear advantages over Esperanto; for example, an average person who knows a Romance language can easily read a text in Interlingua (IALA) without a dictionary.
Another function of Esperanto is «travel clubs»: a relatively large number of people travel to conferences and festivals where Esperanto speakers meet old friends and make new ones. Many Esperanto speakers have correspondents around the world and are often willing to provide accommodation and food for Esperanto-speaking tourists.
On the other hand, such areas of conlang development as hobbies, the film industry, literature and art have great prospects. For example, J. R. R. Tolkien created a series of artificial languages that he used in his literary works to describe a fictional "universe". Artificial languages are also spoken by the characters of science-fiction sagas; the best known is Klingon from «Star Trek». An institute for the study of this language has been founded, and the Klingon language receives serious funding. A rock band sings songs in it, and the Google search system has created a Klingon interface. In 2010 a Klingon-language opera named «U», which is translated from Klingon as "the Universe", was shown in The Hague. The opera was staged by the Zeebelt theatre.
One more conlang, Pandunia, tries to select its words from the major language groups of the world without neglecting important languages such as Swahili and Hindi; its sentences are very concise.
V. LANGUAGES DESCRIBING OTHER NATURAL LANGUAGES
To illustrate the basic linguistic units of one artificial philosophical language (the conlang oGiro, version 3; see Table II) and to explain the principle of describing natural language on the basis of the seme, the atom of meaning, examples from this planned philosophical language, which is intended for describing the terminals of generative grammars, are given below [Ibid., p. 103].
In this regard, some words of the conlang, with full and concise definitions expressed through meaning elements, semes (atoms of sense), can be seen below.
Table I. Semes of one of the conlangs (oGiro), where ee, w, e and other letter combinations are the short notation for the semes, expressed in the oGiro language with the minimal symbols of its alphabet
" - to use | f = implication, reason | ee = appearance | C = essence | i = idea | bb = body | w = all | THE SECOND STEM | i = verb | I = predicate | (i)z = present
- to do | s = and, conjunction | aa = to continue | J = aspect | e = place | pp = part | l = a lot | ii = verb requiring infinitive/gerund | II = modality, complement | (i)v = future
iii = doer | x = sending, result | oo = to disappear | P = characteristic | o = object | dd = food | r = enough | e = adverb | E = adverbial modifier | (i)q = past
eee = recipient | c = process | ii = exist | F = link | a = abstraction | tt = filling | m = few/little | ee = parenthesis | EE = parenthetic phrase | (i)f = perfect
aaa = object | v = future | uu = not to exist | D = action | u = attitude, action | gg = clothes | n = minimum, nothing | o = noun | U = preposition place, link | (i)c = continuous
ooo = tool | z = present | yy = to avoid | T = getting, taking | O = consciousness | kk = cover, lid | W = excellent | oo = pronoun | OD = subject | (i)h = indefinite
uuu = self | q = past | EE = possible | Z = connection | A = being | vv = group of beings, organisation | L = good | a = possessive pronoun, ordinal numeral | Of = object | (i)p = perfect continuous
yyy = process | j = time/tense | OO = necessary | G = delivery to smb | E = information | ff = stack, pile, collection | R = normal | ao = cardinal noun | O = noun phrase | (i)j = infinitive
III = control | b = part, to join in | II = impossible | K = getting from smb | I = intellect | zz = building, house, quarters | M = bad | aa = adjective | AA = attribute | (i)vf = future perfect
EEE = chain | d = to be, identity | UU = needlessly or random | X = change | U = incomprehensible | ss = container, package | N = terrible | u = conjunction, preposition between parts of speech | A = qualifier | (o)t = singular
AAA = mutually | g = the whole, to include | YY = delay | 0 = request | Y = irrational | qq = sitting | ww = hard, solid | uu = conjunction, preposition between sentences | (o)h = plural
OOO = parallel | dj = unification or multiplying | AA = renew | y = concrete, given, comprehensible | xx = support | ll = soft | yy = word sentence, interjection, politeness words | (o)tc = conjugate
UUU = cycle | p = less | ij = device | rr = liquid | ah = particle | (aa)r = purely
YYY = hierarchy | t = equally | djj = device | mm = gas | aah = adverb of degree | (aa)l = more
AAAH = single | k = more | cc = tool | nn = fire, plasma | y = question word | (aa)w = the most
tc = different from | tcc = instrument | WW = perfect | io = verbal noun | (aa)m = less
Let us consider the predicate structure of the Lojban language as applied to the analysis of natural language structure for translation transformations.
dunda - x1 gives x2 (a present) to x3 (to smb). Here x1, x2 and x3 are the arguments of this predicate. This notation in Lojban means that the subject x1, called the first argument, gives the object x2 (the gift), called the second argument, to the recipient x3, called the third argument.
Table II. Examples of word definitions in the language-generation conlang developed by the authors
Word (semantic definition) | Abbreviated word | Definition | Definition elements (literal translation)
Gi'i | Gi | To inform | To 'give' an idea to smb
Gi-La'i | GiL | To praise | To 'give' a positive idea of smth
Go-X'i | GX | To sell | To give an object for money/exchange
Te'i | Te | To come | To 'get' a location
Ge'i | Ge | To accept a guest | To 'present' a location/place
zz-Go-X'o | zzG, zG | Shop | A building for selling
zz-Todd'o | zzdd, zd | Canteen | A building for food consumption
zz-Dogg-ooZoHkkN 'o | zzggkk, zgk | Laundry | A building for an action with clothes for the removal of the connection of an undesirable (negative) substance with the coating
gg-Duvv-FNevv-ooDA'o | ggoo, goo | Uniform | Clothes for work in the conflict between countries connected with destruction of people
Gni-ii'i | Gn | To hide | Not to express an idea of the real
Gir-ooEE-tni'i | Gioo | To argue against | To give an idea which potentially 'destroys' the other one
Gir-aaEE-tni'i | Giaa | To agree with | To give an idea which potentially supports the other one
DEkk'i | DE | To write | To create information on a surface of the carrier
eeEEl-iiZi-i'i | EEZi, EZi | To persuade | To provide an opportunity of true interpretation of an idea
KaX'i | KX | To buy | To take for money
To-e'i | Toe | To receive | To get an object from some place
Za-kLL'i | ZL | To prefer | To compare smth with smth as a better variant
EEGar'i | EEG, EG | To offer | To give smth ('potentially')
Go-e-A-Te'i | Ge | To send | To 'give' a thing from a place to a living being who gets it
There is no need to memorize cases and prepositions. All the components of the same phenomenon are included in the definition of the word dunda, which is a word-predicate.
'mi dunda ti do' - I give it to you.
'mi dunda ti do' = 'fa mi dunda fe ti fi do' = 'fi do fe ti dunda fa mi' = To you + this + is given + by me.
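The equivalence of these three word orders can be checked with a small Python sketch. It assumes a flat sentence with a single known predicate and only the place tags fa, fe and fi (first, second and third argument); the function names `parse_places` and `gloss` are hypothetical, and the sketch is not a Lojban parser.

```python
# Minimal sketch: resolve the argument places of one Lojban predicate.
# Assumption: a flat sentence, one known predicate, tags fa/fe/fi only.
PLACE_TAGS = {"fa": 1, "fe": 2, "fi": 3}


def parse_places(sentence, predicate="dunda"):
    """Return {place number: argument} for a sentence with one predicate."""
    places = {}
    next_place = 1   # place that the next untagged argument would fill
    tagged = None    # place announced by a preceding fa/fe/fi tag, if any
    for token in sentence.split():
        if token == predicate:
            # Untagged arguments after the predicate start at x2 at the earliest.
            next_place = max(next_place, 2)
            continue
        if token in PLACE_TAGS:
            tagged = PLACE_TAGS[token]
            continue
        place = tagged if tagged is not None else next_place
        places[place] = token
        next_place = place + 1
        tagged = None
    return places


def gloss(places):
    """Render the place structure of dunda: x1 gives x2 to x3."""
    return f"{places[1]} gives {places[2]} to {places[3]}"
```

All three variants, `'mi dunda ti do'`, `'fa mi dunda fe ti fi do'` and `'fi do fe ti dunda fa mi'`, yield the same mapping `{1: 'mi', 2: 'ti', 3: 'do'}`, so word order carries no grammatical role once the places are marked.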
A word can be turned from a predicate into an argument by placing the words lo ... ku around it: 'lo dunda ku' - a giver.
There is no distinction between verbs and nouns, as Lojban conveys the meaning, not the words. Let us consider some of the principles of its construction.
The Lojban language is based on the idea of representing natural languages and translating them into the explicit language of predicate logic. Any word of the Lojban language is a predicate with a certain number of arguments (positions), where each position is marked by a special function word; prepositions therefore become unnecessary.
The creators of Lojban claim that their language has been checked carefully to prevent contradictions, something that, according to them, no other conlang has achieved.
In this respect a number of conlangs are promising as a tool for modeling other languages and as a solution to problems of artificial intelligence.
Let us now consider the application of conlangs as a tool for simulating natural languages. For example, effective parsing programs exist for Lojban, because Lojban is a language with a fully described syntax. Lojban can also be used in expert systems for transforming expressions of the form: I am running, my race, my jogging, I do smth, running in..., etc.
All these phrases are presented as one type of a phrase in the Lojban language.
Today the conlang oGiro, version 2, one of the projects for the formalization of semantics, is used as a tool for generating meaningful phrases of a language and as a means of describing vectors of the semantic and grammatical features of words. Let us consider the problem of describing an abstract English text in order to generate instructional materials using the oGiro language, version 2. Instead of a bulky vector notation built from English words, a more convenient notation in this conlang is offered below. The notation in the form of vectors of semes looks as follows:
«My name is» + word[Essence, Person, General, Small, sex/male\female\{01}] + word[Essence, Person, General, Small, sex/male\female\{01}] + («I» + word[Action, Anything, ..., Positive] + word[Essence, Idea, Audio, ..., Positive] \ «I» + word[Action, Anything, ..., Positive] + word[Essence, Abstraction; Presentation, Idea, Much] + «of» + word[Essence, Idea, ..., Necessary] \ word[Essence, Idea; Presenting, Idea] \ word[Action-Take, Idea, Surface, ..., Positive]).
The symbol \{01} means that the items marked with the number 1 are correlated, i.e. each of them occurs only if all the other elements with the same mark are selected; in this case a female name corresponds only to a female surname. The notation in the form of the conlang constructions looks as follows:
«My name is» Aam/nnn\lll\01 Aal/nnn\lll\01. («I» Da*L ittL («music»). \ «I» Da*L u-Gil «of» ia<OO\Gi\TikkL>) ... .
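How such a pattern with correlated alternatives might drive generation can be sketched in Python. This is a simplified illustration, not the authors' software: the mini-lexicon, the grouping of words by semantic class and the function name `generate` are assumptions, and the correlation mark \{01} is modeled by drawing the sex feature once and reusing it for both the first name and the surname.

```python
import random

# Hypothetical mini-lexicon grouped by semantic class (illustrative only).
FIRST_NAMES = {"female": ["Svetlana", "Anna"], "male": ["Peter", "Robert"]}
SURNAMES = {"female": ["Ivanova", "Petrova"], "male": ["Ivanov", "Petrov"]}
POSITIVE_ACTIONS = ["adore", "love", "like"]        # word[Action, ..., Positive]
MUSIC = ["rock music", "jazz", "classical music"]   # word[Essence, Idea, Audio, ...]
CLASSES = ["lectures", "seminars", "classes"]       # word[Essence, Abstraction, ...]
SUBJECTS = ["Chemistry", "Physics", "French"]       # word[Essence, Idea, ..., Necessary]


def generate(rng):
    """Generate one short self-introduction from the pattern."""
    # Correlated choice: one draw of the sex feature governs both name slots.
    sex = rng.choice(["female", "male"])
    first = rng.choice(FIRST_NAMES[sex])
    last = rng.choice(SURNAMES[sex])
    sentences = [f"My name is {first} {last}."]

    # Free alternatives, written with "\" in the vector notation.
    verb = rng.choice(POSITIVE_ACTIONS)
    if rng.random() < 0.5:
        sentences.append(f"I {verb} {rng.choice(MUSIC)}.")
    else:
        sentences.append(f"I {verb} {rng.choice(CLASSES)} of {rng.choice(SUBJECTS)}.")
    return " ".join(sentences)
```

Calling `generate(random.Random(0))` produces one introduction; whatever the seed, a female first name is never paired with a male surname, which is the effect the correlation mark is meant to guarantee.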
The result is the generation of phrases and texts such as «My name is Svetlana Ivanova. I adore rock music. I love lectures of Chemistry...», «My name is Peter Brown. I adore jazz. I love seminars of Physics...», «My name is Robert Smith. I love classes of French. I like classical music.». This input data has been tested in linguistic software in an experiment on generating meaningful phrases, as a trial of the application of conlangs. A generation program is currently being developed for producing educational tests for e-learning courses.
VI. CONCLUSION
Using a planned language as an intermediate language of machine translation, as a language of the vector components of word meanings, as a language for describing word valences (word valence theory), as a language of semantic classification, or as a language of the non-terminal symbols of generative grammars can be considered a promising and important problem. It is associated with the task of modeling, analysing, synthesizing and potentially generating (in some approximation) a complete set of sentences, narrations and texts in natural languages such as English and Russian.
The issue of identifying a language of international communication remains topical. As mentioned previously, most planned languages are built on the grammatical and lexical material of existing European languages, which is the main difficulty in mastering such languages for the population of the Eastern region. Accordingly, for speakers of many languages, mastering any of these planned languages is equivalent to studying an entirely new one.
Some authors agree that the purpose of an auxiliary artificial language is primarily to communicate with foreigners. P. N. Denisov expressed the following opinion on the formal description of natural language: «In this respect the metalanguage of the modern linguistics has at least two disadvantages: non-universalism and incompleteness» [4, p. 167]. This paper can be viewed as a contribution to overcoming these disadvantages.
Taking all these facts into consideration, it can be concluded that the future of international communication is connected with effective machine translation systems with voice support rather than with the adoption of a conlang as a basic language. Nevertheless, the study of conlangs and their application in art, as a hobby and as a means of modeling and analysing natural languages is important and necessary.
References
1. Alekseeva I. S. Text as the dominant of translation // Journal of Siberian Federal University. Krasnoyarsk: SibFU Publishing, 2011. Vol. 4. № 10. P. 1375-1384. (In Russian)
2. Bokarev E. A. Esperanto-Russian dictionary. About 26000 words. Moscow: Nauka, 1989. 488 p. (In Russian)
3. Verdiyeva Z. N. Semantic fields in modern English. Moscow: Vysshaya shkola, 1986. 120 p. (In Russian)
4. Denisov P. N. Principles of language modeling. Moscow: Nauka, 1965. 208 p. (In Russian)
5. Dulichenko A. D. Interlinguistics: essence and problems // Interlinguistica Tartuensis VI: General interlinguistics and planned languages. Scientific notes of Tartu State University. Tartu, 1989. Issue 858. P. 18-41. (In Russian)
6. Isayev M. I. Problems of an international auxiliary language. Moscow: Nauka, 1991. 263 p. (In Russian)
7. Kondratov A. M. Sounds and signs. Moscow: Znanie, 1966. 208 p. (In Russian)
8. Litvin F. A. Polysemy of the word in language and speech. Moscow: Vysshaya shkola, 1984. 119 p. (In Russian)
9. Lichargin D. V. Methods and means of generating semantic constructions of the natural-language interface of software systems: Ph. D. in Technical Sciences dissertation. Krasnoyarsk, 2004. 154 p. (In Russian)
10. Lichargin D. V., Bachurina E. P. A generalized hierarchical structure of an electronic training course and, on its basis, consideration of the electronic English language course of RIYa IKIT SFU // Informatization of Education and Science. 2012. № 3. P. 16-20. (In Russian)
11. Lichargin D. V., Taranchuk E. A. The hierarchical structure of an electronic training course and its variability for foreign language teaching // Distance and Virtual Learning. 2011. № 4. P. 56-75. (In Russian)
12. Lichargin D. V., Sumaneeva Ya. A., Yur'eva E. V. The method of substitution tables and its application in teaching Russian to foreigners // Bulletin of Surgut State Pedagogical University. 2012. № 6. P. 179-187. (In Russian)
13. Sdobnikov V. V. A new look at translation strategy: the communicative-functional approach // Journal of Siberian Federal University. Krasnoyarsk: SibFU Publishing, 2011. Vol. 4. № 10. P. 1444-1453. (In Russian)
14. Seliverstova O. N. Componential analysis of polysemous words. Moscow: Nauka, 1990. 240 p. (In Russian)
APPLYING CONLANGS TO THE NATURAL LANGUAGES GENERATION BASED ON GENERATIVE GRAMMARS
Lichargin Dmitrii Viktorovich, Ph. D. in Technical Sciences
Lade Aleksandra Vadimovna
The Institute of Space and Information Technologies, Siberian Federal University
orderist@yandex.ru; withlady@yandex.ru
Safonov Konstantin Vladimirovich, Doctor in Physics and Mathematical Sciences
Siberian State Aerospace University named after Academician M. F. Reshetnev
safonovkv@rambler.ru
Key words: interlinguistics; planned languages; philosophical languages; natural language modeling; generative grammars.