Generation of the state tree based on generative grammar over trees of strings. Text of a scientific article in the specialty «Mathematics»

CC BY
Keywords
NATURAL LANGUAGE GENERATION / GENERATIVE GRAMMARS / SEMANTICS

Abstract of the scientific article in mathematics; author of the work: Lichargin D. V.

The article considers the principle of generating state trees by means of generative grammars over trees of strings for such objects as natural language sentences and two- and three-dimensional images. The representation of an object as a forest is considered, including the trees of the object's different layouts, for the purpose of modeling complex systems.

Text of the scientific paper «Generation of the state tree based on generative grammar over trees of strings»

As the number of components T in the mixture of nonparametric probability density estimates grows, the values of the ratios R2 > 1 (figure, a) and Rj > 1 (figure, c) increase. The observed deterioration of the approximation properties of the mixture p (x) in comparison with the traditional nonparametric probability density estimate p (x) (12) points to the decrease in the sample sizes used when estimating the components p (x).

This is characteristic of small dimensions k of the random variables. When complicating the estimating probability density with efficiency k, the growth of nonparametric estimations p (x) also decreases p (x). The criteria corresponding to them, W2, W2' and Wj, Wj, become commensurable; this is evident from the decrease in the values of the ratios R2 and Rj.

The proposed mixture p (x) of probability density estimates has a smaller dispersion than the nonparametric estimate p (x), which follows from its structure, since the statistic p (x) is synthesized on the basis of an averaging operator (figure, b). With a quantity increase in T composing the mixture of nonparametric estimations p (x), the density probability and dimension k of random dimensions increases.

Based on an analysis of the asymptotic properties of mixtures of nonparametric probability density estimates for a multidimensional random variable, the possibility of decomposing the initial statistical data when synthesizing nonparametric statistics under large-sample conditions is justified. Compared with the traditional Rosenblatt-Parzen nonparametric estimate, the statistic studied has a considerably smaller dispersion and allows the use of parallel computing technologies.

© Lapko A. V., Lapko V. A., 2010

D. V. Lichargin

Siberian Federal University, Russia, Krasnoyarsk

GENERATION OF THE STATE TREE BASED ON GENERATIVE GRAMMAR OVER TREES OF STRINGS

The article considers the principle of generating state trees by means of generative grammars over trees of strings for such objects as natural language sentences and two- and three-dimensional images. The representation of an object as a forest is considered, including the trees of the object's different layouts, for the purpose of modeling complex systems.

Keywords: natural language generation, generative grammars, semantics.

The generation of natural language sentences is one of the key problems in computer science and formal grammar theory. The generation of meaningful speech belongs to the area of semantics and computer science [1-7]. The generation of state trees is well studied in computer science and in systems analysis. The generation of a tree of meaningful phrases is connected, first of all, with the method of sentence generation by means of Chomsky's generative grammars. Generative grammars are successfully applied in software such as machine translation systems, expert systems, spell checkers, etc.

The focus of the article is an analysis of the prospects for using generative grammars not over strings but over trees of strings. This approach makes it possible to generate grammatically and semantically meaningful speech more effectively and to increase the efficiency of various aspects of image analysis and synthesis.

The importance of effectively generating meaningful language constructions and two- or three-dimensional images is generally recognized and is connected with the demands of linguistic and other software.

The purpose of this research is to apply, where necessary, generative grammars over trees as a means of generating meaningful speech that depends on a larger, heterogeneous context.

The novelty of the work lies in the application of generative grammars not over strings but over trees of strings.

It is well known that a standard generative grammar over strings has the form of a quadruple G<S, T, N, R>, where S is the initial symbol of the generative grammar, T is a set of terminal symbols, N is a set of nonterminal symbols, and R is a set of rules for transforming one string into another.
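The quadruple above can be illustrated with a minimal Python sketch. The toy symbols and rules are hypothetical and are not taken from the article; the function `generate` is likewise an illustrative helper that enumerates terminal strings breadth-first by always rewriting the leftmost nonterminal.

```python
from collections import deque

def generate(start, rules, nonterminals, limit=10):
    """Return up to `limit` terminal strings derivable from `start` (breadth-first)."""
    results = []
    queue = deque([(start,)])
    seen = {(start,)}
    while queue and len(results) < limit:
        form = queue.popleft()
        # Position of the leftmost nonterminal in the current sentential form.
        idx = next((i for i, s in enumerate(form) if s in nonterminals), None)
        if idx is None:
            results.append(" ".join(form))  # only terminals left
            continue
        for rhs in rules.get(form[idx], []):
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            if new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return results

# Hypothetical toy grammar G<S, T, N, R>.
S = "S"
T = {"the", "person", "woman", "wants", "wishes"}
N = {"S", "Subject", "Modality"}
R = {
    "S": [["the", "Subject", "Modality"]],
    "Subject": [["person"], ["woman"]],
    "Modality": [["wants"], ["wishes"]],
}

sentences = generate(S, R, N)
```

Here every rule rewrites a single nonterminal inside a flat string; the grammars over trees discussed next replace these flat strings with nested structures.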

For generative grammars over trees, the strings of symbols t and n are replaced by trees (or by a forest, that is, trees with equal nodes): t = t<t’, t’’, ..., tⁿ>, where t’ = t’<t₁, t₂, ..., tₘ>, etc.; n = n<n’, n’’, ..., nⁿ>, where n’ = n’<n₁, n₂, ..., nₘ>, etc.

One of the main features of any system is the presence of a hierarchy of elements. The hierarchical relations can sometimes be presented as a set of hierarchies in different layouts of consideration of the system. For example, the sum of three systems (a sentence as an extended narration, a sentence whose purpose is to order tea, and a sentence whose purpose is to maintain a polite dialogue) can result in a meaningful sentence of a natural language. To generate such complex systems with several purposes and layouts of consideration, it is necessary to use the more complex means of a generative grammar over trees of strings, which generates the tree of possible natural language sentences.

The generative grammar over trees of strings is composed in the following way. Let A<...B<...C1^C2...>, ..., B’<...C1’^C2’>...> be a rule of a generative grammar over trees, taken from a set of such rules whose terminal symbols T and nonterminal symbols N are trees of strings; «^» is the symbol for rewriting one string into another. S<> is the initial symbol of the generative grammar over trees.

Each stage of deepening the tree of states, that is, of passing to the next generated tree or forest of strings, reduces to multiplying the generated tree obtained so far by a rule of the generative grammar.

It is also possible to consider trees of heterogeneous information: A<B{B1, B2}, C{C1, C2}> = {A<B1, C1>, A<B1, C2>, A<B2, C1>, A<B2, C2>} = {A<B1, C1>, A<B1, C2>, A<B2, C{C1, C2}>}. In this way, the tree of states can be included in the tree of elements and vice versa.
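The expansion of A<B{B1, B2}, C{C1, C2}> into its four concrete trees can be sketched in Python as a Cartesian product over the alternatives. The encoding is an assumption of this sketch: a string is a leaf, a list is a set of alternatives, and a tuple (label, children) is an inner node; the intermediate labels B and C are dropped for brevity.

```python
from itertools import product

def expand(node):
    """Return every concrete tree described by `node`."""
    if isinstance(node, str):      # plain leaf
        return [node]
    if isinstance(node, list):     # alternatives: union of the options' expansions
        return [tree for option in node for tree in expand(option)]
    label, children = node         # inner node: combine the children's expansions
    return [(label, list(combo))
            for combo in product(*(expand(child) for child in children))]

def show(tree):
    """Print a concrete tree in the A<B1, C1> notation used in the text."""
    if isinstance(tree, str):
        return tree
    label, children = tree
    return f"{label}<{', '.join(show(child) for child in children)}>"

# A<B{B1, B2}, C{C1, C2}>
tree = ("A", [["B1", "B2"], ["C1", "C2"]])
states = [show(t) for t in expand(tree)]
```

Keeping the alternatives unexpanded inside a node, as in A<B2, C{C1, C2}>, corresponds to postponing the call to `expand` for that subtree.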

As a result, a sentence can be considered as the union (addition) of trees from different layouts of consideration over the whole space (tree) of natural language points [4-6].

Consider a tree A<B<B’<...>, B’’<...>, ..., B’’’<...>>, C<C’<...>, C’’<...>, ..., C’’’<...>>, ..., D<D’<...>, D’’<...>, ..., D’’’<...>>>, or briefly A<...B<...B’’...>...>. Then a forest of trees can be considered as a set of trees with equal nodes over a set of the trees’ nodes: F<A<...B<...B’’(= L1)...>...>, X<...Y<...Y’’(= L1)...>...>, ...>, where L1 is a node shared by the first two trees of the forest.

Let us consider an example: the tree of moves in a chess game, Board<Column[1]<Cell[1], Cell[2], ...>, ...>. Such a tree is formed by multiplying the position on the chess board by a set of rules for possible half-moves.

A half-move of a knight can have the following form: Board<...Column[X]<...Cell[Y]<Knight^Empty>...>, ..., Column[(X + 1) or (X - 1)]<...Cell[(Y + 2) or (Y - 2)]<Empty^Knight>...>...>. The knight moves that shift the column by two and the cell by one are described by an analogous rule.
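The knight rule above can be sketched in Python as a function that multiplies one board position by the rule and returns the child positions of the state tree. The board encoding (a dictionary from (column, cell) to a piece name, 1-based, 8 by 8) and the function name are assumptions of this sketch, and only the pattern written out in the rule (column X ± 1, cell Y ± 2) is covered.

```python
SIZE = 8  # assumed board size

def knight_half_moves(board, x, y):
    """Return the positions obtained by moving the knight from (x, y)
    to column x +/- 1, cell y +/- 2, if the target cell is empty."""
    if board.get((x, y)) != "Knight":
        return []
    successors = []
    for dx in (1, -1):
        for dy in (2, -2):
            tx, ty = x + dx, y + dy
            on_board = 1 <= tx <= SIZE and 1 <= ty <= SIZE
            if on_board and board.get((tx, ty), "Empty") == "Empty":
                child = dict(board)           # copy, so the parent state is kept
                child[(x, y)] = "Empty"       # Knight ^ Empty
                child[(tx, ty)] = "Knight"    # Empty ^ Knight
                successors.append(child)
    return successors

start = {(2, 1): "Knight"}
children = knight_half_moves(start, 2, 1)
targets = sorted(pos for c in children for pos, piece in c.items() if piece == "Knight")
```

Applying the function again to each child position deepens the state tree by one more half-move.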

The generation of a chair image, for example, can presuppose a potential image of a person on the chair: Chair<Seat, Legs, Back, Person(= L1)<Arms(= L2), Legs(= L3), Trunk(= L4), Head(= L5)>> + Gentleman(= L1)<Body<Arms(= L2), Legs(= L3), Trunk(= L4), Head(= L5)>, Clothes<Jacket<Trunk(= L4)>, Boots, Top Hat<Head(= L5)>>> = Image<Chair<...>, Gentleman<...>, ...>.
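The addition of the Chair and Gentleman trees can be sketched in Python as follows. The dictionary encoding of nodes and the helper names `node`, `collect_ids`, and `add_trees` are assumptions of this sketch; it only places both trees under one root and reports which nodes carry the same label L1 to L5, which is the information needed to declare them identical.

```python
def node(name, ident=None, children=()):
    """A tree node with an optional identity label such as "L1"."""
    return {"name": name, "id": ident, "children": list(children)}

def collect_ids(tree, path=(), index=None):
    """Map each identity label to the paths of the nodes that carry it."""
    index = {} if index is None else index
    here = path + (tree["name"],)
    if tree["id"] is not None:
        index.setdefault(tree["id"], []).append("/".join(here))
    for child in tree["children"]:
        collect_ids(child, here, index)
    return index

def add_trees(name, *trees):
    """Put the trees under one root and return the labels shared by several nodes."""
    root = node(name, children=trees)
    index = collect_ids(root)
    shared = {label: paths for label, paths in index.items() if len(paths) > 1}
    return root, shared

def body_parts():
    return [node("Arms", "L2"), node("Legs", "L3"), node("Trunk", "L4"), node("Head", "L5")]

chair = node("Chair", children=[
    node("Seat"), node("Legs"), node("Back"),
    node("Person", "L1", body_parts()),
])
gentleman = node("Gentleman", "L1", [
    node("Body", children=body_parts()),
    node("Clothes", children=[
        node("Jacket", children=[node("Trunk", "L4")]),
        node("Boots"),
        node("Top Hat", children=[node("Head", "L5")]),
    ]),
])

image, shared = add_trees("Image", chair, gentleman)
```

Note that the legs of the chair itself carry no label, so they are not identified with the legs of the person.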

The principle of reducing or adding images is as follows: semantically analogous elements (tree nodes) are declared identical; if several variants of reduction exist, an additional subtree of possible system states is built as a result of adding the trees of system elements or generating the trees of system states.

A natural language sentence can be presented in the form of a tree as well. For example, the tree of grammatical sentence analysis can be simplified as: Clause<Introductory Word, Modifier, Subject<Determiner, Attribute<Adverb of Degree, Adjective Group>, Nominal Group>, Predicate<Modality, Modifier, Verbal Part>, Object<Determiner, Attribute<Adverb of Degree, Adjective Group>, Nominal Group>, Modifier>.

This tree can be added to (reduced by) the tree of semantic analysis, for example: Topic “Building”<Relation-Creature-Building {enter, build}, Properties-Building {marble, multistoried}, Building {house, library}, Modifier 1<with/without {with, without}, Essence of - Building/Rooms {corridor, hall}>, Modifier 2<with/without {with, without}, Property-Thing (Essence of - Building/Architectural Element {large, beautiful}), Essence of - Building/Architectural Element {wall, corner}>>.

A tree of the following type can be used for the generation of natural language sentences:

1. Subject - Essence (the ... / person / man / woman).

2. Modality - Action over Relation (want / wish / love / adore).

3. Predicate - Action with Clothes (buy / get / try on / wear).

4. Object - Clothes (the ... / jeans / sweater / footballer).

The given tree can be multiplied by the following rules of the generative grammar, where 0 denotes an empty position:

1. 0 ^ the.

2. 0 ^ Attribute - Property of Clothes (stylish / fashionable / checked).

3. Object - Clothes (the ... ^ 0 / jeans / sweater / footballer).

As a result, a sentence such as “the person wants to get the fashionable sweater” or “the woman wishes to buy the checked footballer” is obtained.
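The multiplication of the sentence tree by the rules can be sketched in Python under simplifying assumptions. The slots are kept as a flat list, a rule of the form 0 ^ X is modeled as the insertion of a new slot, and the verb inflection and the particle "to" are handled by a naive helper; the names `insert_before`, `third_person`, and `render` are hypothetical and not part of the article.

```python
from itertools import product

# Slots of the sentence tree; the word lists are taken from the text above.
tree = [
    ("Subject", ["person", "man", "woman"]),
    ("Modality", ["want", "wish", "love", "adore"]),
    ("Predicate", ["buy", "get", "try on", "wear"]),
    ("Object", ["jeans", "sweater", "footballer"]),
]

def insert_before(slots, slot_name, new_slot):
    """Model a rule of the form 0 ^ X: put new_slot in front of slot_name."""
    result = []
    for name, options in slots:
        if name == slot_name:
            result.append(new_slot)
        result.append((name, options))
    return result

def third_person(verb):
    """Naive English third-person singular form (an assumption of this sketch)."""
    return verb + ("es" if verb.endswith(("s", "sh", "ch", "x")) else "s")

def render(names, words):
    parts = []
    for name, word in zip(names, words):
        # Assumption: the modality verb is inflected and followed by "to".
        parts.append(third_person(word) + " to" if name == "Modality" else word)
    return " ".join(parts)

tree = insert_before(tree, "Subject", ("Determiner", ["the"]))
tree = insert_before(tree, "Object", ("Attribute", ["stylish", "fashionable", "checked"]))
tree = insert_before(tree, "Attribute", ("Determiner", ["the"]))

names = [name for name, _ in tree]
sentences = [render(names, words) for words in product(*(opts for _, opts in tree))]
```

With these word lists the sketch enumerates 3 × 4 × 4 × 3 × 3 = 432 sentences, among them the two examples quoted above.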

It can be assumed that the problems of image recognition, natural language analysis, and a number of others can be solved effectively only when they are considered jointly. For example, to translate the word combination “up-link communication” into a language other than English as “communication with a satellite”, it is necessary to use a visual image of the facts discussed in the text. Thus, in a translation system, a semantic visual image of the narration should be built up while the text is being translated; without such an image, a translation close to a human one is impossible.

To implement the principles described above, it is necessary to begin developing a dictionary of semantic trees of heterogeneous data: images, sentence composition patterns, algorithms, and so on. The existing sentence generation dictionaries of the “Electronic Dictionary” software will need to be used as the basis of the system.

In conclusion, generative grammar over trees of strings is an effective means of generating state trees for systems such as natural language sentences and semantically loaded images. It is planned to apply generative grammars over trees of strings on the basis of the Semantic Trees’ Dictionary, which is a classification of heterogeneous semantic data.

References

1. Agamdjanova V. I. Contextual Redundancy of the Lexical Meaning of a Word. M. : Higher School, 1977.

2. Apresyan Yu. D. Ideas and Methods of Modern Structural Linguistics. M. : Science, 1966.

3. Verdieva Z. N. Semantic Fields in the Modern English Language. M. : Higher School, 1986.

4. Lichargin D. V. Operations over the Natural Language Words Semes in Machine Translation // Works of the Conf. of Young Scientists. Krasnoyarsk : ICM SB RAS, 2003. P. 23-31.

5. Lichargin D. V. Elimination of Semantic Noise as the Means of Adequate Translation // Questions of the Theory and Practice of Translation. Works of All-Russian Conference. Penza : Privolzhye Region Knowledge House, 2003. P. 90-92.

6. Lichargin D. V. Generation of the Natural Language Phrases within the Task of Creating Natural Language Interface with Software // Problems of the Territory Information Development : Materials of the Eighth All-Russian Conf. PTID. 2003. Vol. 2. Krasnoyarsk : IP Centre of Krasnoyarsk State Technical University, 2003. P. 152-156.

7. Nikitin M. V. Lexical Meaning of a Word. M. : Higher School, 1983.

© Lichargin D. V., 2010

P. K. Lopatin

Siberian State Aerospace University named after academician M. F. Reshetnev, Russia, Krasnoyarsk

AN ALGORITHM FOR AN OBJECT GRASPING BY A MANIPULATOR IN AN UNKNOWN STATIC ENVIRONMENT

An algorithm for controlling an n-link manipulating robot (MR) in an environment with unknown static obstacles is considered. A theorem is proved which states that, by following the algorithm, a MR will in a finite number of steps either grasp an object or give a proved conclusion that the object cannot be grasped in any configuration.

Keywords: robot, unknown environment, obstacles, reachability.

In MR control the following typical problem arises: a MR should move from a start configuration q0 and grasp an object Obj with its gripper. The object Obj may sometimes be grasped not in one but in several, and sometimes in an infinite number of, target configurations qj. The target configurations are united into a target set BT, which has an arbitrary shape.

Let us assume that BT does not grow during the whole movement of the MR, and that the coordinates of every point of BT are known and reliably defined.

A MR is represented in the configuration space (the space of generalized coordinates) as a point. The MR should function in a bounded region X of the configuration space. Let us assume that X is such that for any $q \in X$ the following inequalities hold:

$$a^1 < q < a^2, \qquad (1)$$

where $a^1 = (a_1^1, a_2^1, \ldots, a_n^1)$ is the vector of lower bounds on the values of the generalized coordinates, $a^2 = (a_1^2, a_2^2, \ldots, a_n^2)$ is the vector of upper bounds on the values of the generalized coordinates of the MR, and $q = (q_1, q_2, \ldots, q_n)$ is the vector of generalized coordinates of the MR. Thus X is a hyperparallelepiped. We will consider all points not satisfying (1) as forbidden.
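Condition (1) is a componentwise box check, which can be sketched in Python as follows. The function name `in_region` and the numeric bounds are hypothetical, and the inequalities are taken as strict, as they are printed in (1).

```python
def in_region(q, a1, a2):
    """True if a1[i] < q[i] < a2[i] holds for every generalized coordinate i."""
    if not (len(q) == len(a1) == len(a2)):
        raise ValueError("q, a1 and a2 must have the same length")
    return all(lo < qi < hi for qi, lo, hi in zip(q, a1, a2))

# Hypothetical bounds for a three-link manipulator (values are illustrative only).
a1 = (-3.0, -1.5, 0.0)
a2 = (3.0, 1.5, 2.0)

inside = in_region((0.5, 1.0, 1.0), a1, a2)
outside = in_region((0.5, 2.0, 1.0), a1, a2)  # second coordinate exceeds its upper bound
```

A configuration for which the check fails lies outside X and is treated as forbidden.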

Moreover, it is necessary to take into account that there may also be forbidden states inside X. Firstly, these are the states (configurations) determined by the design limitations of a MR, for example, those in which an inadmissible intersection of MR links takes place. Such forbidden configurations can be calculated in advance. Secondly, a configuration is considered forbidden if the MR intersects obstacles in it. In an unknown environment it is impossible to calculate all such configurations in advance. So we will consider a configuration as forbidden if a MR cannot be
