
Section 7. Technical sciences

https://doi.org/10.29013/ESR-20-11.12-45-55

Garashchenko Vladislav, Postgraduate student, Kyiv National University of Culture and Arts, Kyiv, Ukraine. E-mail: vladang785@gmail.com

Litovchenko Taras, Postgraduate student, National University of Life and Environmental Sciences of Ukraine. E-mail: borrrez@gmail.com

Badyorina Lubov, Doctor of Technical Sciences, Lecturer, Kyiv National University of Culture and Arts, Kyiv, Ukraine. E-mail: vada@ukr.net

INFORMATION PROCESSES MANAGEMENT

Abstract. An approach to the automation of information-analytical activity tasks based on processing the content of text documents is offered. The methodology being developed rests on a knowledge-oriented approach and will allow qualitative assessment of the content of texts. The peculiarity of the tasks of information and analytical activity has been determined. Applying methods for automating the extraction of knowledge contained in natural-language texts and its formal presentation in machine languages makes it possible to integrate and generalize knowledge in a particular subject area, in particular to check it for meaningful compatibility and contradictions, which is an important component of information and analytical activity. A practical example is given of finding information by a content query, taking into account the conceptual structure of the subject area.

Keywords: information technologies, natural language, multifunctional model, linguistic multifunctional model, semantic coincidence, automated educational systems.

Introduction

Recently, information-analytical systems have begun to develop actively. Their main purpose is the automated preparation of analytical documents from the totality of all available natural-language and multilingual information.

The need to create such systems is due to the fact that government agencies, as decision-makers, base most of their decisions on information. The security of decision-making and its consequences largely depends on how timely, complete, and objective the information forming the basis of the decision is.

The basis of decision-making is adequate, that is, reliable, information that is delivered to the right person on time and in an accessible form.

From these positions, information and analytical support (IAS) of public administration should be considered as one of the determining factors of efficiency and security of management.

Much of the information flow circulating in public administration is textual, including multilingual texts.

One of the leading places in the decision-making process belongs to information-analytical units, whose task is to analyze, summarize, and systematize both open and closed information with respect to its reliability, usefulness, and compliance with the tasks to be solved by the governing bodies.

The objectives of IAS are to assess the situation, forecast trends in its development, and identify patterns, trends, and deviations in the development of situations.

The resulting product consists of analytical references and reviews, formed according to the requirements for their content, scope, and level of generalization. The resulting analytical document is an information resource for the decision-maker.

In the process of preparing analytical documents, specialists of information-analytical departments solve tasks of assessing the quality of the material: identifying misinformation, checking logical and semantic compatibility, and finding inconsistencies with respect to the applied problem to be solved.

Solving these problems saves management time and creates conditions for more effective interaction.

Automation of informational and analytical activity tasks

The information contained in text sources may be filed in different languages, which makes it necessary to convert the multilingual input information into a single representation in the knowledge base. This information presentation format is the basis for solving the complex issues of information and analytical activity (I&AA). Some tasks admit specific automation methods, but their effectiveness is limited. Solving problems of information and analytical activity, for instance planning analytical certificates, reports, and reviews, or forecasting the acceptance or rejection of managerial decisions, occurs through the construction of axiomatic models (models of cause-and-effect relations between sentences) which contain relations of an implicative nature or another system of relations that reveals implicit knowledge. The requirements for a formalized knowledge submission include: a form of representation that ensures correct logical and semantic processing of knowledge; and the information necessary to solve specific information-analytical tasks, so that the text representation of knowledge elements is retained as fully as possible.

Taking into account the quality requirements for the formalized submission of knowledge, the conceptual structure (CS) of the content of the natural-language text (NLT) is chosen. It is a hierarchical structure, at the top level of which are the most general concepts and the relations between them; each lower level is represented by concepts and relations that specify the corresponding concepts and relations of the higher level. That is, the top level of the CS corresponds to the most general description of the content of the text, and its lower levels are the corresponding levels of concretization of this description. Each concept and relation in the CS is accompanied by characteristics that determine its properties, modality, and other aspects; by linguistic information that characterizes the linguistic means of its reflection in the input text; and by semantic information (e.g., an object, a subject, a relation type, a direction, etc.). The CS thus formed contains all the information necessary to solve the applied problems of I&AA. The possibility of its formation is determined by the presence of relevant knowledge in the thesaurus of the system.
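To make this description concrete, the nesting of a CS can be pictured with a short sketch. The class and field names below are illustrative assumptions, not the authors' implementation; the sketch only shows how concepts, relations, their characteristics, and the accompanying linguistic and semantic information might be organized by level.

# Illustrative sketch of a conceptual structure (CS); all names are
# assumptions, not the authors' implementation.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str                      # concept or relation name
    kind: str                      # "concept" or "relation"
    characteristics: dict = field(default_factory=dict)  # properties, modality
    linguistic_info: dict = field(default_factory=dict)  # surface form in the text
    semantic_info: dict = field(default_factory=dict)    # object/subject, relation type
    children: List["Node"] = field(default_factory=list) # lower, more specific levels

def show(node: Node, level: int = 0) -> None:
    """Print the CS top-down: upper levels give the most general description."""
    print("  " * level + f"{node.kind}: {node.name} {node.characteristics}")
    for child in node.children:
        show(child, level + 1)

# Upper level: the most general concepts and relations of the text content.
cs = Node("decision support", "concept", {"modality": "necessity"},
          children=[
              Node("provide", "relation",
                   semantic_info={"subject": "IAS", "object": "information"},
                   children=[Node("deliver on time", "relation",
                                  linguistic_info={"surface": "is delivered ... on time"})]),
          ])
show(cs)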

The peculiarity of the conceptual structure is its hybrid representation, which combines properties of semantic networks and predicate models in order to unify the presentation of knowledge contained in texts such as analytical reports, which carry a variety of arguments, including a content structure, and in order to reproduce implicit knowledge and adapt it to further processing. In addition, special facilities are introduced to store the natural-language textual presentation: the prefixes and postfixes of predicates and concepts, the logical-linguistic connections, and the anaphoric references.

The CS corresponds to the formulated requirements and contains all the information necessary for its further logical and semantic processing and for the synthesis of a description of the CS, or its fragments, in natural language. It is formed as a result of the linguistic processing of the NLT.

It should be noted that during processing, a notion that occurred at the beginning of the text and turned out to be polysemous may only be specified at the end of the text. In addition, the content acquires a certain structure during the formation of the text.

A feature of the process of synthesizing a description of the CS is the logical structure of the synthesized text, defined by the structure of the CS or by user requirements. In the latter case, the necessary fragments are "extracted" during the logical processing of the conceptual structure and integrated into a single CS with a formalized representation of the content of the text. The linguistic synthesis of the text is carried out under the control of the structure and content of the elements of the incoming CS.

Therefore, at the first stage of information processing, the information is unified into a single form of presentation in the shape of a conceptual structure (for natural-language texts, by extracting knowledge from textual sources and formalizing it). The natural-language texts are submitted in English, Russian, and Ukrainian. The knowledge base integrates everything necessary for a comprehensive analysis of a priori ("old") and current information. Information is analyzed for functional completeness, compatibility, and contradictions by methods of logical and semantic knowledge processing. These methods solve the applied problems, that is, generalizing information and forming analytical reviews and recommendation certificates.

The core of the instrumental knowledge-oriented system for the automation of natural-language information processing is the subsystem named "Dictionary". For the complex solution of the problems of automating knowledge extraction from natural-language texts, their formalization, and their processing in the interest of solving application problems, "Dictionary" contains three main sections: the first contains linguistic knowledge of the language tools of natural language; the second contains general knowledge of the real world and knowledge of specific subject areas; the third contains knowledge of how knowledge about the real world is formulated in a specific language. "Dictionary" is a model of the process of mapping ("coding rules") knowledge about the world into a specific natural language.

The subsystem "Formalizer" implements procedures for extracting and formalizing of content (a knowledge) reflected in natural-language textual sources. The subsystem "Unififier" implements procedures for converting the formalized knowledge representation to a single look and can be used on its own. One ofthe functions of this subsystem is, for example, the replacement ofsuch action relations as "locomote", "fly", "go", "go" with a semantic synonym for "move".

Due to the formalization of knowledge, its representation is formed in the shape of a conceptual structure. The generated CS contains all the information necessary to solve the application problems of I&AA automation. The possibility of its formation is determined by the relevant knowledge in the "Dictionary".

Therefore, from a formal presentation of the content one can also synthesize its description in natural language. The subsystem that implements this procedure is named "Synthesizer". The "Identification" (ID) subsystem analyzes formalized representations of knowledge, or fragments of them, for the identity of the content they display and transforms those representations into a single representation. With the help of the considered subsystems, it is possible to solve quite effectively the problem of identifying NLT fragments at the content level and eliminating duplicate text fragments with the same content. Moreover, "Formalizer" can process multilingual text sources (Ukrainian, Russian, and English), transforming their content into a single formalized view, and "Synthesizer" can formulate content descriptions in different languages. Thus, the information systems that comprise these two subsystems are also multilingual.

The combination "Formalizer-Synthesizer" gives a possibility to realize machine translation and abstracting of text documents based on an analysis of their content. The peculiarity of constructing such a translator is that the translation does not do in separate phrases (sentences), as it does in modern translators. First, the input text is fully processed by the "Formalizer", resulting from which, form a unified formalized presentation of the content of the text. Then this view is "processed" by "Synthesizer". During the formation of formalizing a presentation, the information that is specific to a concept and is distributed throughout the text concentrates around that concept. Therefore, logical content structuring is carried out. In the case where the "Formalizer" and the "Synthesizer" operate in the same language mode. It will be a "machine translation" of the contents of the document, also is logically structured.

In the abstracting mode, the CS of the text is used partially, depending on user needs. If the user needs a generalized abstract of the text, then only the upper-level layer is separated from the CS, which is perceived as a conceptual image of the input text. The use of the term "image" is justified by the fact that the selected part of the CS reproduces the generalized content of the text. From this description, a description in natural language is synthesized, which is a generalized abstract (summary) of the input text. In case the user wants an abstract with some level of detail about those aspects of the content of the text which interest him, then, apart from the upper layer, the corresponding fragments of the lower levels are separated from the CS. The selected fragments of the CS form the conceptual image of the input text, which is the basis for the synthesis of its abstract. The amount and content of detail may be specified by a keyword list or by natural-language requests. Thus, this approach makes it possible to form purposeful abstracts that reflect the user's needs in extending the description of certain aspects of the text.

This approach allows building multilingual auto-abstracting systems, which are able, for example, to form Ukrainian abstracts of English and Russian texts.

The "Formalizer-Identification-Synthesizer" combination allows automating the solution of the problem-finding the necessary information on the meaning of the request, formulated in natural language.

The "Formalizer-Identification" combination with the active use of the "Dictionary" and the "Unifier" makes it possible to automate the solving of such problems:

• the elimination of duplication of the same content across different text documents or their fragments (see the sketch after this list);

• the automatic indexing of multilingual text documents by their contents (the solution to this problem is based on the formation of a formalized presentation of the substantive essence of the rules of document indexing);

• automatic classification and distribution between thematic sections of the textual knowledge base of the multilingual documents by their content.
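For the first of these tasks, one plausible reading of content-level deduplication is to compare documents by their unified formalized representations rather than by surface strings. A toy sketch, in which the triple format and the canonicalization are illustrative assumptions:

# Toy sketch of content-level deduplication: two fragments are duplicates
# when their unified formalized representations coincide. The triple format
# and canonicalization are illustrative assumptions.
def formalized(fragment: list) -> frozenset:
    """Canonical form: an order-independent set of (X, relation, Y) triples."""
    return frozenset(fragment)

doc_a = [("part", "move", "bunker"), ("robot", "move", "warehouse")]
doc_b = [("robot", "move", "warehouse"), ("part", "move", "bunker")]  # same content, reordered

if formalized(doc_a) == formalized(doc_b):
    print("duplicate content: keep one fragment")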

An important task of I&AA automation is the integration of knowledge about a specific subject area into the CS when that knowledge is contained in multilingual sources. Until now this task has hardly been addressed even at the level of problem statement. The "Integrator" subsystem is introduced to automate this task. Suppose, for example, there is an English-language text devoted to methods of knowledge representation, from whose content the "Formalizer" forms a conceptual structure. A conceptual structure is constructed in the same way for a Russian-language text that deals with methods of knowledge processing but also describes the methods of knowledge representation discussed in the English-language text.

Identical fragments of the input texts generate identical fragments of the conceptual structure, because their representations are standardized in form and in the language of the intra-machine representation. This makes it possible to unite the conceptual structures of multilingual texts by unifying their common parts. If necessary, links to the sources that formed particular fragments of the conceptual structure can be fixed. In this way, knowledge about a given subject area contained in multilingual textual sources can be accumulated.

One of the most important tasks of text information processing is the determination of correctness, in particular the logical and contextual compatibility or contradiction of knowledge. The sources of contradictions are, for example, stylistic flaws in the input text, vague or negligent wording, the authors' incomplete awareness, distortion of textual information during its transmission through the network, as well as deliberate distortion of information (disinformation). The task of automating the detection of contradictions at the level of processing the input text directly is extremely difficult. To automate the task of determining the logical and semantic compatibility or contradiction of knowledge extracted from the NLT, the subsystem "Logic" is introduced. The functioning of this subsystem is based on a sufficiently wide range of formal-logical methods of logical and semantic knowledge analysis, already developed to date, for checking functional completeness, compatibility, or inconsistency.

The main task of analytical activity is the formation of analytical reviews and references. A knowledge-oriented approach to the development of information systems provides an opportunity to automate the process of forming analytical reviews and certificates according to the user's requirements for their volume and thematic orientation. This is especially important when the same sources contain diversified and, in the thematic plan, multifaceted information, while the user is interested in specific aspects, for example the development of certain events. This approach provides for a subsystem (called "Analytics") which, given a formalized submission of the requirements for an analytical review (reference), searches for and locates the necessary fragments of knowledge in the corresponding conceptual structure, isolates them, and unifies them into a single, meaningfully holistic structure. This conceptual structure is then the basis for forming the text of the desired review (reference) in natural language by the "Synthesizer". To solve this task, a significant portion of the "Analytics" functions can be implemented by the previously considered components ("Identification", "Unifier", "Integrator") with respect to unified application tasks.

Representation of knowledge about the subject area

The most widespread methods of knowledge processing have been developed in expert systems of various purposes and in instrumental systems for the automation of programming. Expert system shells can be considered as instrumental systems for programming applied expert systems, or for programming knowledge for them. From the point of view of I&AA automation, what is interesting is not expert systems as such, but the methods of knowledge representation and processing implemented in them. Analysis of the practical aspects of the situation in this field demonstrates that in the classical theory of knowledge representation two principal forms are considered: the semantic and the logical. The semantic representation of knowledge about the subject area (SA) includes semantic networks and their variants, frames, and the model of universal semantic code (USC). Within the semantic representation of knowledge, special attention is drawn to hierarchical models. The logical representation includes production and predicate models.

Semantic networks and frames form a class of relational models of knowledge representation. Relational models are based on binary relations. The language of formalized knowledge representation contains patterns common to the entire class of relational models. To model knowledge about the world (SA), several classes of elements are distinguished, expressed by linguistic units (words and phrases) as the basic concepts and relations describing a certain subject area. Thus, names, relations, and concepts are distinguished. Concepts are divided into "concept-classes", "concept-processes", and "concept-states". A "concept-class" is a collection of specific objects that share certain properties (a noise generator, a table, etc.). "Concept-processes" describe a group of homogeneous processes (load, damage, etc.). A "concept-state" defines the state of an object (normal mode, line on, etc.). "Names" serve for the identification of concepts (the AH-64A aircraft, the McDonnell Helicopter company, etc.). "Relations" serve to establish relationships on the set of concepts. In some sources, up to 200 simple (base) relations are highlighted. The most commonly used of these relations are shown in Table 1.

Table 1.

Relation type    | Name of the relation    | Marking of the relation
Time             | to be at the same time  | R11
                 | to be previously        | R12
Spatial          | to be surrounded        | R21
                 | to be                   | R22
                 | to be behind            | R23
                 | ...                     | R2r
Dynamic          | to move to              | R31
                 | ...                     | R3n
Classification   | to belong to a class    | R41
                 | to have (properties)    | R42
                 | ...                     | R4k
Identification   | to have a name          |
Pragmatic        | to serve for            | R51
                 | to have a conditional   | R52
                 | to be the obstacle      | R53

The table shows not language relations but relations of the physical world. A particular physical relation may have different linguistic descriptions. For example, in the following different phrases: "The part moves through the conveyor to the final bunker", "The robot moves to warehouse number 4", "The car is approaching the intersection", one and the same relation R31 from Table 1 is implemented.

To construct correctly constructed formulas (CCF), a set of syntactic rules is proposed.

The class of relational models includes a very wide range of varieties.

Semantic networks have gained wide distribution in artificial intelligence systems for the automation of management tasks. In the general case, a semantic network is denoted as C = <X1, X2, ..., Xs; R1, R2, ..., Rm>, where X1, X2, ..., Xs are fixed sets and R1, R2, ..., Rm is a system of relations defined on the elements of these sets. There are many different concepts for constructing semantic networks, whose authors try to convey a more complete formalized representation of knowledge about the SA. Examples of very common semantic networks are the language of conceptual dependencies, developed under the leadership of R. Schank, and pyramidal growth networks.
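The definition C = <X1, ..., Xs; R1, ..., Rm> can be made concrete with a tiny sketch; the sets and the relation contents below are invented for illustration:

# A semantic network as fixed sets plus relations over their elements,
# following C = <X1, ..., Xs; R1, ..., Rm>. Contents are illustrative.
X1 = {"robot", "part"}            # concepts: objects
X2 = {"warehouse", "bunker"}      # concepts: locations
R1 = {("robot", "warehouse"), ("part", "bunker")}   # "moves to" (R31 in Table 1)

def related(x: str, y: str, relation: set) -> bool:
    """Check whether the pair (x, y) belongs to a relation of the network."""
    return (x, y) in relation

assert related("robot", "warehouse", R1)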

Syntagmatic chains and RX-codes are special cases of semantic networks. A triple (X R Y), where X and Y are codes of concepts and R is a code of a relation, is called an elementary syntagm. Formulas that consist of elementary syntagms and the connections between them (logical operations) are called syntagmatic chains. An algebra is introduced over such chains that performs certain transformations on them. RX-codes fix the binary relation R between X and Y. RX-codes are used to build a statistical representation of the properties of objects in the subject area.

A separate class of relational models is formed by the frame representation. Frames are divided into structural and role frames. An analogy can be drawn between role frames and RX-codes: any RX-code can be considered a role frame. Frames contain a branching hierarchical structure in which the nodes of the higher layer correspond to more general concepts. The concept of each node is determined by a set of attributes called slots. A certain number of procedures can be associated with each slot. These procedures can monitor the assignment of information to the corresponding node and control what actions must be taken when this information changes. These capabilities of frames are useful in those subject areas where the form of presentation and the content of the information provided are important. The frame analysis procedure is used in a number of systems, including for the analysis of natural-language text. However, the frame is too large a unit of content, which does not allow showing all the features of NLT.
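The slot-and-attached-procedure mechanism can be pictured with a generic frame sketch; this is not the API of any of the cited systems:

# Minimal frame sketch: a node's concept is defined by slots, and each slot
# may carry a procedure ("daemon") fired when the slot value changes.
class Frame:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent          # more general concept one level up
        self.slots = {}
        self.daemons = {}

    def set_slot(self, slot, value):
        self.slots[slot] = value
        if slot in self.daemons:      # control what happens on change
            self.daemons[slot](value)

    def get_slot(self, slot):
        # Inherit a missing slot value from the more general frame.
        if slot in self.slots:
            return self.slots[slot]
        return self.parent.get_slot(slot) if self.parent else None

vehicle = Frame("vehicle")
vehicle.slots["can_move"] = True
car = Frame("car", parent=vehicle)
car.daemons["speed"] = lambda v: print(f"speed slot updated: {v}")
car.set_slot("speed", 60)
assert car.get_slot("can_move") is True   # inherited from "vehicle"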

The universal semantic code was developed by Martynov. The elementary structure of the USC is represented by a triple (S, A, O), where S is the subject of the sentence, A is the predicate, and O is the direct object. The rules for transforming a natural-language sentence into USC consist in singling out elementary constructions in the sentence. An algebra (assembly of syntagms, integration, decomposition) is introduced over the elementary structures.

The structure of concepts in a subject area is hierarchical. This has led to the development of hierarchical models of knowledge representation. Hierarchical models represent a "tree" of subordination. At the top level there is only one node, called the root. Each node except the root is associated with one node at the higher level, which is called the input node for that node. No element has more than one input node; each element can be associated with one or more elements at the lower level. The hierarchical view means that each record has its meaning only in the context of the tree. A subordinate item cannot exist without its predecessor in the hierarchy. Hierarchical models are widespread in building a thesaurus for a subject area, but not all relationships fit within hierarchical boundaries.

Semantic models have certain advantages in the description of knowledge in machine systems, namely visibility and simplicity of perception. The use of binary relations is one of the most natural means of presenting information to the user; relationships between concepts, on the one hand, are realized through the linguistic vocabulary and, on the other hand, are subject to a formal description suitable for computer processing.

Most predicate languages are based on the first-order predicate calculus. Many-place and one-place predicates, logical connectives, and quantifiers are used. Knowledge is presented in the form of formulas of predicate logic. Adjoining new formulas to those obtained earlier makes it possible to obtain new statements about objects, using the inference rules of the first-order predicate calculus. This procedure is interpreted as logical inference. The alphabet of the predicate calculus consists of the following set of characters:

- punctuation marks: { ( , ) };
- propositional connectives: ¬, ∧, ∨, →;
- quantifiers: ∀, ∃;
- symbols of variables: x_j, j = 1, 2, ...;
- n-ary functional symbols: f_i^n.

Different statements are built from the symbols of the alphabet. Thus, terms, elementary formulas (atoms), and correctly constructed formulas (CCF) are distinguished. Any symbol of a variable or a functional constant is a term. If p is a predicate symbol and t1, ..., tn are terms, then p(t1, ..., tn) is an atom. An atom is a correctly constructed formula (CCF). If A and B are CCF, then ¬A, A ∧ B, and A ∨ B are CCF too.
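The term/atom/CCF hierarchy can be rendered as a small abstract syntax, where the class names are assumptions for illustration:

# Sketch of well-formed formulas of first-order predicate logic as described:
# terms -> atoms -> formulas built with connectives. Names are illustrative.
from dataclasses import dataclass

@dataclass
class Term:                 # a variable or functional constant
    name: str

@dataclass
class Atom:                 # p(t1, ..., tn) for a predicate symbol p
    predicate: str
    args: tuple

@dataclass
class Not:                  # negation of a CCF is a CCF
    f: object

@dataclass
class And:                  # conjunction of two CCF is a CCF
    left: object
    right: object

# "The robot moves to the warehouse": move(robot, warehouse) is an atom,
# hence a CCF; so is its conjunction with another CCF.
atom = Atom("move", (Term("robot"), Term("warehouse")))
ccf = And(atom, Not(Atom("broken", (Term("robot"),))))
print(ccf)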

Methods of predicate representation of knowledge cover a fairly wide range. The vast majority of them, however, are based on the programming style developed by Kowalski, which uses predicate logic to control the analysis of declarative statements. In this style, each statement is written in the form: consequent :- antecedent-1, antecedent-2, ...

Antecedents are predicates whose truth values can be determined in some way, and the consequent is a predicate that takes the value "true" when every antecedent predicate also takes the value "true".

The program that implements this mechanism selects a goal and compares it with the consequents of all statements. When it finds such consequents, it tries to prove the goal by considering the antecedents of the found consequent as subgoals. If, as a result of such a recursive process, all subgoals are proved, then the goal itself is considered proved. A classic example of the implementation of this approach is the Prolog system and other Prolog-like systems. At the present stage, there are practical systems and theoretical developments that implement logical systems with more complex mechanisms of logical inference and with much greater opportunities for presenting different aspects of knowledge. For example, the Platan-D1 logic programming language is designed to formalize the tasks of action planning in a dynamic environment in the interests of synthesizing a model of the behavior of dynamic objects, or an algorithm for achieving goals by objects, which is the basis for compiling a program from autonomously debugged software modules. The core of this language is a procedure of deductive inference that operates with a conditionally true form of presentation of logical statements. The Platan-D1 language contains modal and modal-temporal logic and is constructed in such a way that it is open to expanding the means of other logics. In addition, the constructions of this language provide for the possibility of accompanying logical assertions with imperative information, or information that characterizes the peculiarities of the use of the software modules that interpret the content of primary predicates. This allows a simple implementation of the mechanism of procedural attachment to the logical inference scheme. This capability can be used for a variety of purposes, for example, to extend the language with multi-valued logic, to estimate the probability of goal achievement, etc. In the context of the problems discussed, Platan-D1 can be considered as a logical and instrumental core for constructing a language for the representation and logical-semantic processing of knowledge extracted from texts.
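The goal/subgoal mechanism described above (compare a goal with the consequents of statements, then prove the antecedents recursively) amounts to backward chaining. A minimal sketch, restricted to variable-free rules for brevity and with an invented rule base:

# Minimal backward chaining over "consequent :- antecedents" rules,
# as in the Kowalski-style scheme described above. Propositional only
# (no variables) to keep the sketch short; the rule base is invented.
RULES = {
    "goal_reachable": [["path_known", "vehicle_ready"]],
    "vehicle_ready": [["fueled"]],
}
FACTS = {"path_known", "fueled"}

def prove(goal: str) -> bool:
    if goal in FACTS:
        return True
    # Compare the goal with the consequents of all statements...
    for antecedents in RULES.get(goal, []):
        # ...and try to prove every antecedent as a subgoal.
        if all(prove(sub) for sub in antecedents):
            return True
    return False

assert prove("goal_reachable")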


The advantage of predicate-type systems is that logical systems have an effective inference procedure: for finite subject areas, given the functional completeness of the logical system, there are algorithms of logical inference that guarantee an answer when determining the truth of a target statement.

Among the disadvantages, it should be noted that the predicate calculus does not provide a sufficient set of tools for the formalized representation of knowledge contained in natural-language texts.

The presentation of knowledge in the form of products (rules) provides a formal way to record recommendations, guidelines, or strategies. This method is most effective in cases where subject knowledge is formed on the basis of empirical associations accumulated over years of work on solving problems in a particular subject field. Products are written as statements of the form "if-then". There are three ways to use these rules: the direct chain, the reverse chain, and a combination of the two.
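A sketch of the direct (forward) chain over "if-then" products, with a toy rule base assumed for illustration:

# Forward ("direct") chaining over if-then products: keep firing rules whose
# conditions hold until no new facts appear. The rules are illustrative.
rules = [
    ({"report_requested", "sources_collected"}, "draft_prepared"),
    ({"draft_prepared"}, "review_issued"),
]
facts = {"report_requested", "sources_collected"}

changed = True
while changed:
    changed = False
    for conditions, conclusion in rules:
        if conditions <= facts and conclusion not in facts:
            facts.add(conclusion)   # "if" part satisfied: assert the "then" part
            changed = True

assert "review_issued" in facts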

Development of a logical-semantic model of knowledge representation for the subject area of natural-language text. The main components of knowledge, in terms of their formalized presentation, are notions, the relations between them, their characteristics, and also the modality of these characteristics. Therefore, the processing of the input text must be directed at detecting in the text the main components of knowledge and at establishing the logical-semantic relations between them, in order to form the conceptual structure of the input text.

As noted, the information contained in text sources can be filed in different languages, which determines the need to transform the multilingual input information into a single presentation in the knowledge base. Such a representation is the basis for solving the complex tasks of I&AA. It is also the basis for synthesizing a description of the content reflected in the formalized representation. There are two requirements for a formalized presentation of knowledge. The first is that the presentation must have a certain form, which provides the possibility of correct logical and semantic processing of knowledge. The second is that it must contain the information necessary to solve a given information-analytical problem while preserving, as fully as possible at the factual level, the textual representation of the elements of knowledge.

An implicit predicate is a relation that has no corresponding lexical equivalent in the text. To preserve the means of expression of the natural-language textual representation, special means were introduced: prefixes and postfixes of predicates and notions. An elementary predicate formula can also contain the quantifiers of universality (∀) and existence (∃).

The elementary predicate formula is ^N P_k^q(^L X_t^i, ^M Y_g^j), where N, L, M are, respectively, the prefixes of the predicate and of the arguments, which define the type of semantic class; P is the name of the semantic class of the relation; and X, Y are the names of the semantic classes of notions. The arguments have a fixed position. The formula is interpreted in terms of classical predicates: "the notion X is in the relation P to the notion Y". Postfixes are the upper and lower indices of the predicate and the arguments. The upper index of the predicate, q (q ∈ Q), determines the lexical and grammatical way of linguistically combining the relation and the notions in the text. The set Q is a list of linguistic units (for example, prepositions, particles, etc.) and grammatical features (for example, the case of government between a verb and the corresponding noun) that reproduce the rules for combining relations and notions in the text. The lower index of the predicate, k, determines the specific lexical representative of the corresponding semantic class N. The upper indices of the arguments, i and j (i, j ∈ A), determine the grammatical characteristics of the notions (for example, number, animacy, etc.). The set A is a list of grammatical characteristics of the notions that are arguments of the predicate formula. The lower indices of the arguments, t and g (t ∈ L, g ∈ M), determine the specific lexical representatives of the respective semantic classes. In the process of formal-logical inference, postfixes are ignored. They are crucial at the stage of synthesizing the description of fragments of the CS by natural-language means. Lexical units that have the meaning of modality (want, be able, need, etc.) are allocated to a separate lexical-semantic class of relations.
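The prefix/postfix bookkeeping of the elementary predicate formula can be pictured with a small structure. The field names are assumptions; as noted above, postfixes are ignored during formal-logical inference and matter only at synthesis time:

# Sketch of an elementary predicate formula with prefixes (semantic class
# types) and postfixes (linguistic realization). Field names are assumptions.
from dataclasses import dataclass

@dataclass
class Argument:
    prefix: str        # L or M: semantic class type
    sem_class: str     # X or Y: semantic class of the notion
    grammar: str       # upper index i/j: grammatical characteristics
    lexeme: str        # lower index t/g: concrete lexical representative

@dataclass
class PredicateFormula:
    prefix: str        # N: semantic class type of the relation
    sem_class: str     # P: semantic class of the relation
    combining: str     # upper index q: preposition/case combining rule
    lexeme: str        # lower index k: concrete lexical representative
    x: Argument
    y: Argument

    def logical_form(self) -> tuple:
        """Postfixes are dropped during formal-logical inference."""
        return (self.sem_class, self.x.sem_class, self.y.sem_class)

f = PredicateFormula("action", "move", "to", "go",
                     Argument("object", "person", "sg", "worker"),
                     Argument("place", "building", "sg", "warehouse"))
print(f.logical_form())   # ('move', 'person', 'building')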

Unification of lexical notions and relations within a certain semantic class is carried out through the "genus-species" relation. The structure of a semantic class is hierarchical: at its upper level are the most general notions (relations), and each lower level consists of notions (relations) that specify the corresponding notions (relations) of the higher level. The choice of the genus relation for the unification of notions and relations in the subject area is of fundamental importance. Replacing specific notions (relations) with generic ones in free phrases does not lead to a violation of the semantic meaning of the expression (except where other hierarchical relationships are involved).
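Genus-species unification can be sketched as climbing the hierarchy until a representative known to the system is found; the toy hierarchy below is assumed:

# Toy genus-species ("is-a") hierarchy for unifying lexical notions within a
# semantic class; contents are illustrative.
GENUS = {"bicycle": "vehicle", "car": "vehicle", "vehicle": "object"}

def generalize(notion: str, known: set) -> str:
    """Climb the genus chain until a notion the system knows is reached."""
    while notion not in known and notion in GENUS:
        notion = GENUS[notion]
    return notion

# "to go by bicycle" with no entry for "bicycle" becomes "... by a vehicle".
print(generalize("bicycle", known={"vehicle"}))   # vehicle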

For each elementary predicate formula, a truth matrix is constructed that reproduces the possible values of the arguments for a particular predicate. In essence, this matrix is a fragment of knowledge about the relationship between notions in the subject area. The columns of the matrix reproduce the valid values of the argument X, and the rows the valid values of the argument Y. The elements of the matrix take the value "1" (the predicate takes the value "true", so the corresponding arguments X and Y are valid for the predicate), "0" (the predicate takes the value "false", so the corresponding arguments X and Y are invalid for the predicate), or "?" (the predicate takes the value "conditionally true", so the admissibility of the corresponding arguments X and Y must be determined based on the axiomatic model and the linguistic context).

It should be noted that the truth matrix determines only the semantic correctness of a certain elementary predicate formula. A fragment of the truth matrix for the relation "move" is given in Table 2.

Table 2. Fragment of the truth matrix for the relation "move"

N: relation-action; P: to move
                                            X (Who? What?)
                                            People   Animals   Vehicle
Y (Where? Destination,
spatial characteristic of the action)          1        1         1

Solving the problems of informational and analytical activity, for example the acceptance or rejection of managerial decisions, is supported by an axiomatic model that allows revealing implicit knowledge and predicting the consequences of a decision. Such an axiomatic model reproduces, in terms of content, the knowledge about a particular applied problem. The main issues of recognition are the resolution of polysemy and of words new to the system. The proposed representation of the CS makes it possible to recognize "new" words and to choose among the many possible meanings of words. Processing of new (unfamiliar to the system) words proceeds as follows: if the word (the lexical representative of a notion or relation) is absent, then, based on the truth matrix, a representative with a more general meaning can be chosen from the appropriate lexical-semantic class. For example, if there is no dictionary entry for the English phrase "to go by bicycle" in the translation dictionary, then, using the axiomatic model, the program interprets the translation as "to ride by a vehicle". If the program does not recognize the lexical representative of a relation, then all contextual connections (i.e., the ways it is used with other concepts) are selected from the text, the most plausible axiomatic model is chosen, and the relation is assigned the name with the most general meaning. For example, the phrase "walking to the work" will be interpreted as "go to the work". To find a suitable equivalent for a relation (notion), there must be a context for the new word, which is determined by two or more elementary predicate formulas. Given at least three elementary predicate formulas, the algorithm works reliably. Thus, while analyzing new words, the program may violate the stylistic integrity of the text but preserves its semantic integrity, which is important and meets the requirements of the analysis of scientific and technical texts.
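The 1/0/? truth matrix and the fallback to a more general representative for unknown words can be combined in one sketch; the matrix is a toy fragment modeled on Table 2:

# Toy truth-matrix check for the relation "move" (cf. Table 2) with the
# genus-species fallback for words unknown to the system. All data invented.
TRUTH = {("move", "people", "destination"): "1",
         ("move", "vehicle", "destination"): "1",
         ("move", "idea", "destination"): "0"}
GENUS = {"bicycle": "vehicle", "worker": "people"}

def admissible(relation: str, x: str, y: str) -> str:
    key = (relation, x, y)
    if key in TRUTH:
        return TRUTH[key]          # "1" valid, "0" invalid, "?" context-dependent
    if x in GENUS:                 # unknown notion: try its genus instead
        return admissible(relation, GENUS[x], y)
    return "?"                     # decide from the axiomatic model and context

print(admissible("move", "bicycle", "destination"))   # "1" via genus "vehicle"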

The implementation of the relevant functions of the ICS is based on the analysis of the content of texts. This allows, on the one hand, implementing a targeted search for information and, on the other, increases the relevance of the selection of the necessary information material.

Therefore, based on the above, certain conclusions can be drawn.

1. The general fundamental feature of IAA tasks is that the subject of analysis is the content of textual information, i.e. the knowledge contained therein. Taking this feature into account during the development of information technology for automating support for IAA tasks necessitates solving the problem of modeling the process of human comprehension of textual information. The key point of this process is the implementation of methods for automating the extraction of knowledge contained in natural-language texts and their formal presentation in machine formats.


Automation of IAA tasks based on processing the content of text documents provides, first of all, an intensive path of information technology development.

2. Today there are no systems that make it possible to integrate and generalize knowledge in a particular subject area contained in multilingual text sources, or to check it for content compatibility and contradictions, which is an important component of IAA.

3. The proposed approach to IAA automation for processing multilingual text sources does not require preliminary (non-automated) processing of the natural-language text in the system (separation of sentences and their fragments, separation of relations and their arguments).

4. The implementation of the relevant functions of the IRS is based on the analysis of the content of texts, which allows, on the one hand, implementing a purposeful search for information and, on the other, increases the relevance of the selection of the necessary information material.

5. The most important expected results are:

- creation of electronic information resources and their operative use in important spheres of state activity;

- significant expansion of the information space through the integration of foreign information funds (modern achievements of science, technology, etc.) into national information resources;

- improving the quality of assessment of the current situation in the political, economic, environmental, social and other areas on the basis of automation of processing large amounts of information contained in disparate sources;

- analysis for the presence of distorted and contradictory information, which will facilitate decision-making at the state level.

References:

1. Hopfield J. J. Neural networks and physical systems with emergent collective computational abilities // Proc. Natl. Acad. Sci. USA. 1982. Vol. 79. P. 2554-2558.

2. Zamarueva I. V., Badyorina L. M. A method for the quantitative assessment of answers in open-type knowledge testing systems // System Analysis and Information Technologies. Kyiv: KPI, 2010. P. 41-46.

3. Badorina L. Method of the relevance degree estimation of the text answer in computer training systems // Visnyk NAU. 2007. No. 1. P. 80-84.

4. Schank R. Conceptual Information Processing (Russian translation: Обработка концептуальной информации). Moscow: Nauka, 1980. 360 p.

5. Badyorina L. Method of grammatical structure formalization of natural language // Austrian Journal of Technical and Natural Sciences, "East West" Association for Advanced Studies and Higher Education GmbH. Vienna. 2015. No. 7-8 (4). P. 106-111.

6. Uhr L. The compilation of natural language text into teaching machine programs // American Federation of Information Processing Societies Conference Proceedings. 1964. P. 26-35.
