Method of data expression from the Ukrainian content based on the ontological approach

Research article in the specialty "Computer and Information Sciences"

CC BY
Keywords
analysis / content-analysis / ontology / content management system

Authors: Lytvyn V. V., Vysotska V. A., Hrendus M. H.


METHOD OF DATA EXTRACTION FROM UKRAINIAN-LANGUAGE CONTENT BASED ON THE ONTOLOGICAL APPROACH

Relevance. Interest in the application of intelligent systems (IS) is constantly growing in fields such as information technology (IT), mechanical engineering, medicine, biology, ecology, geography, jurisprudence, etc. The architecture of modern IS is based on knowledge bases (KB), which are formed according to the subject area (SA) in which the IS is applied. The main part of a KB is an ontology: a clearly structured model of the SA, a systematized set of terms that explain the relations between the objects of this SA. Ontologies are generally recognized and widely used in such different branches of science as knowledge engineering, knowledge representation, information retrieval, knowledge management, database design, information modeling and object-oriented analysis. In particular, in its study of the IT market, Gartner ranked the use of taxonomies/ontologies third among the top ten advanced technologies in this field. It is therefore relevant to study the structures of syntactic ontological KBs, to build and study optimal algorithms for the syntactic analysis of Ukrainian-language texts, and to develop a software and algorithmic tool for the syntactic processing of texts, for example for categorizing Internet content, automatic summarization of texts, extracting knowledge from them, translation into other languages, etc. Decision trees, the IDEF5 methodology and ontology construction methodologies were chosen as methods for solving the problem of creating a consolidated resource based on an ontological KB. The results of syntactic analysis are taken into account by associative-semantic context analysis to optimize the construction of associative context links between the words and phrases of a sentence within the hierarchical network of the ontological KB. Results. A consolidated information resource, an ontological KB for the syntactic analysis of Ukrainian-language text documents, was created with the help of Protégé 3.4.7. Conclusions. A method of data extraction based on an ontological KB and the rules of the syntax of the Ukrainian language (RSUL) was created for the further development of a consolidated information resource for the syntactic processing of text documents. As a result, an ontological KB containing the RSUL was created. The syntactic structure of the input sentence is the foundation and framework for the next, no less important step: semantic analysis. This ontological KB of the consolidated linguistic resource (LR) for the syntactic processing of Ukrainian-language text documents is a solid basis for the further development of an automated IS for the syntactic analysis of Ukrainian-language texts.

Text of the research paper "Method of data expression from the Ukrainian content based on the ontological approach"

UDC 004.9

METHOD OF DATA EXPRESSION FROM THE UKRAINIAN CONTENT BASED ON THE ONTOLOGICAL APPROACH

Lytvyn V. V. - F.D., Professor, Head of Information Systems and Networks Department of Lviv Polytechnic National University, Lviv, Ukraine.

Vysotska V. A. - PhD, Associate Professor, Associate Professor of Information Systems and Networks Department of Lviv Polytechnic National University, Lviv, Ukraine.

Hrendus M. H. - Assistant of Information Systems and Networks Department of Lviv Polytechnic National University, Lviv, Ukraine.

ABSTRACT

Context. Interest in applying intelligent systems (IS) is constantly growing in areas such as information technology (IT), engineering, medicine, biology, ecology, geography and jurisprudence. The architecture of a modern IS is built around knowledge bases (KB), which are formed according to the subject area (SA) in which the IS is used. The main part of a KB is an ontology: a clearly structured model of the SA, a systematic set of terms that explain the relationships between the objects of this SA. Ontologies are generally accepted and widely used in branches of science such as knowledge engineering, knowledge representation, information retrieval, knowledge management, database design, information modeling and object-oriented analysis. In particular, Gartner, in its research on the IT market, ranked the use of taxonomies/ontologies third among the top ten advanced technologies in this area. Consequently, it is relevant to study the structures of syntactic ontological KBs, to construct and study optimal algorithms for the syntactic analysis of Ukrainian-language texts, and to develop software and algorithmic tools for the syntactic processing of texts, for example for categorizing Internet content, automatic summarization of texts, knowledge extraction and translation.

Objective. The goal of the work is to develop a software system that formalizes the rules of the syntax of the Ukrainian language (RSUL) as an ontological knowledge base for use in processing natural-language texts in Ukrainian.

Method. Decision trees, the IDEF5 methodology and ontology construction methodologies were chosen as methods for creating a consolidated resource based on an ontological KB. The results of syntactic analysis are used by associative-semantic context analysis to optimize the construction of associative context relationships between the words and phrases of a sentence within the hierarchical network of the ontological KB.

Results. A consolidated information resource has been created with the help of Protégé 3.4.7: an ontological KB for the syntactic analysis of Ukrainian-language text documents.

Conclusions. A method of data extraction based on an ontological KB and the RSUL has been developed for the further development of a consolidated information resource for the syntactic processing of text documents. As a result, an ontological KB containing the RSUL was created. The syntactic structure of the input sentence is the foundation and framework for the next, no less important step: semantic analysis. This ontological KB of the consolidated LR for the syntactic processing of Ukrainian-language text documents serves as a solid basis for the further development of an automated IS for the syntactic analysis of Ukrainian-language texts.

KEYWORDS: analysis, content-analysis, ontology, content management system.

ABBREVIATIONS

KB is a knowledge base;

AI is artificial intelligence;

IS is an information system;

IT is information technology;

SA is a subject area;

PIR is the processing of information resources;

LR is a linguistic resource;

RSUL is the rules of the syntax of the Ukrainian language.

NOMENCLATURE

O is an ontology;

X is a finite set of concepts for the subject area for describing the Ukrainian language;

R is a finite set of relations between concepts;

F is a set of interpretation functions;

Morphology is a finite set of concepts of the morphology of the Ukrainian language;

Punctuation is a finite set of concepts of the punctuation of the Ukrainian language;

Structure is a finite set of concepts of the structure of the Ukrainian language;

Syntax is a finite set of concepts of the syntax of the Ukrainian language;

Semantic is a finite set of concepts of the semantics of the Ukrainian language;

WordsCombination is a finite set of concepts of the formation of phrases;

Sentence is a finite set of concepts of the creation of sentences in the Ukrainian language;

SignWords is a finite set of signs of the formation of phrases;

LexicalSign is a finite set of lexical signs of the formation of phrases;

SyntacticSign is a finite set of syntactic signs of the formation of phrases;

Noun is a finite set of noun signs of the formation of phrases;

Adjective is a finite set of adjective signs of the formation of phrases;

Numeral is a finite set of numeral signs of the formation of phrases;

Pronoun is a finite set of pronoun signs of the formation of phrases;

Verb is a finite set of verb signs of the formation of phrases;

Adverb is a finite set of adverbial signs of the formation of phrases;

Coordinated is a finite set of coordinated signs of the formation of phrases;

Inferior is a finite set of subordinated signs of the formation of phrases;

SimpleWord is a finite set of simple signs of the formation of phrases;

ComplicatedWord is a finite set of complex signs of the formation of phrases;

AdversativeComt is a finite set of signs of adversative (opposing) connection;

ConnectiveComt is a finite set of signs of connective connection;

DividingComt is a finite set of signs of dividing connection;

ContactComt is a finite set of signs of agreement;

ManagementComt is a finite set of signs of management;

AdjoiningComt is a finite set of signs of adjoining;

SignSent is a finite set of signs of the creation of sentences in the Ukrainian language;

SentenceMembers is a finite set of signs of the identification of sentence members;

NarrativeSent is a finite set of signs of the formation of narrative sentences;

PronouncedSent is a finite set of signs of the formation of interrogative sentences;

IncentiveSent is a finite set of signs of the formation of incentive (imperative) sentences;

EmotionallyNeutral is a finite set of signs of the formation of emotionally neutral sentences;

EmotionallyColored is a finite set of signs of the formation of emotionally colored sentences;

SimpleSent is a finite set of concepts of the formation of simple sentences;

ComplicatedSent is a finite set of concepts of the formation of complex sentences;

MainSentMemb is a finite set of signs of identification of the main members of the sentence;

SecondSentMemb is a finite set of signs of identification of the secondary members of the sentence;

AffirmativeSent is a finite set of signs of the formation of affirmative sentences;

NegativeSent is a finite set of signs of the formation of negative sentences;

SgSpSt is a finite set of signs of the formation of simple sentences.

INTRODUCTION

Questions arise from the very first steps of studying ontologies. There is still no single definition of the concept of ontology. The term comes from the Greek "ontos" (existence) and "logos" (doctrine, concept) and denotes the branch of philosophy that studies existence. In computer science, an ontology is an attempt at a comprehensive and detailed formalization of a certain area of knowledge by means of a conceptual scheme [1]. A conceptual scheme is understood as a set of concepts together with information about them (properties, relationships, constraints, axioms and assertions about the concepts that are necessary to describe the processes of solving problems in the selected subject area) [2]. Among specialists in computational linguistics, the most established (classical) definition is the one given by T. Gruber: "Ontology is a specification of conceptualization" [3-4]. Besides the difficulty of defining the concept of "ontology" exactly, there are a number of problems with describing the ontology model in a formal language [1]. Moreover, not all existing ontological LRs fall under the given definition. Today, applied IS are evolving toward greater intelligence. This significantly affects the direction of scientific and technological research related to the use of computers and gives society practically important results. However, at a certain stage of development, further improvement of IT with the currently available resources becomes impossible. At such times, a qualitative leap in development tools is required. One such leap in the field of AI, aimed at making the interaction between systems and users more intelligent, was the emergence of ontologies.

© Lytvyn V. V., Vysotska V. A., Hrendus M. H., 2018 DOI 10.15588/1607-3274-2018-3-16

The purpose of the work is to design and develop the system of formalization of RSUL in the form of an ontological KB with the aim of its use for processing the Ukrainian-language content of Web-resources and extraction of data from it.

The object of the research is the processes of data extraction from the Ukrainian-language content of Web resources based on the ontological approach, taking into account the syntax and semantics of texts.

The subject of the research is the methods and tools for processing information resources and extracting data from them based on the ontological approach.

1 PROBLEM STATEMENT

The task is to develop a software system S that formalizes the RSUL in the form of an ontological knowledge base for use in processing natural-language texts written in Ukrainian (for example, for automatic summarization, knowledge extraction from texts, translation of texts into other languages, etc.). The input of the system is the verbal rules of the syntax of the Ukrainian language given in textbooks and other books on Ukrainian grammar.

The output of the system is an ontological model of the rules of the syntax of the Ukrainian language, $O = \langle X, R, F \rangle$. The taxonomy of the ontology concepts X defines the syntax of the language (the root concept of the ontology). An optimal choice of the set of relations R between these concepts and of the set of rules F of the syntax of the Ukrainian language, formalized by means of description logic (DL), will make it possible to process natural-language texts in Ukrainian effectively, that is: $S: RSUL \to O$.
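The triple O = <X, R, F> can be pictured as a small data structure. The following Python sketch is only an illustration under stated assumptions: the class `Ontology`, the relation name `is_a`, the function `build_syntax_ontology` and the rule `phrase_has_two_words` are hypothetical names introduced here, not part of the authors' system.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """O = <X, R, F>: concepts, relations between concepts, rules."""
    concepts: set = field(default_factory=set)    # X
    relations: set = field(default_factory=set)   # R as (source, name, target)
    rules: dict = field(default_factory=dict)     # F as rule name -> predicate

    def add_relation(self, source, name, target):
        # Register both endpoints as concepts so that R stays defined over X.
        self.concepts.update((source, target))
        self.relations.add((source, name, target))

    def children(self, concept, name="is_a"):
        # Concepts that point to `concept` through the relation `name`.
        return sorted(s for s, n, t in self.relations
                      if n == name and t == concept)


def build_syntax_ontology():
    """Toy stand-in for S: RSUL -> O with a hand-written taxonomy fragment."""
    o = Ontology()
    o.add_relation("WordsCombination", "is_a", "Syntax")
    o.add_relation("Sentence", "is_a", "Syntax")
    # Hypothetical rule used only for illustration.
    o.rules["phrase_has_two_words"] = lambda words: len(words) >= 2
    return o


ontology = build_syntax_ontology()
print(ontology.children("Syntax"))
```

Keeping F as named predicates separates the taxonomy (X and R) from the rules that are checked against text, which mirrors the split in the formal model.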

2 LITERATURE REVIEW

The authors of [5-13] conditionally distinguish two directions in ontology design, which for some time developed separately. The first (formal) direction is based on logic (first-order predicate, description, modal logic, etc.) [14-15]. The second (linguistic) direction is based on the study of natural language (in particular, semantics) and on the construction of ontologies from large text arrays, the so-called corpora [16-27]. A formal ontology is a set of concepts and assertions about these concepts, on the basis of which classes, objects, relations, functions and theories are constructed [28-31]. Most ontology models contain the following components: concepts (classes); properties of concepts (attributes, roles); relationships between concepts (dependencies, functions); and additional constraints defined by axioms [32-39]. The role of a concept is a description of a task, function, action, strategy, reasoning process, etc. The main difference between an ontological system and an ordinary vocabulary is the internal unity, logical interconnection and consistency of the concepts used.

The second kind of ontology is a hierarchical lexical resource such as WordNet. Such resources describe the lexical relations between word meanings, which are represented as individual units of a hierarchical network called synsets. The relationships between lexical units reflect the relations between objects of the outside world; therefore, such resources are often regarded as a special kind of ontology: lexical, or linguistic, ontologies. The main characteristic of linguistic ontologies is that they are tied to the meanings of verbal expressions (words, noun groups, etc.). Linguistic ontologies cover most of the words of a language and at the same time have an ontological structure that manifests itself in the relations between concepts. Therefore, linguistic ontologies are considered both a special type of lexical database and a special type of ontology.

The main difference between linguistic and formal ontologies is the degree of formalization. It is assumed that the development of such resources builds a hierarchy of the lexical meanings of a natural language, and for a more rigorous description of knowledge about the world such resources are mapped to some formal ontology. Thus, one of the projects establishes the relationship between WordNet and EuroWordNet, on the one hand, and the formal ontology SUMO (Suggested Upper Merged Ontology), on the other. The project establishes a correspondence between WordNet synsets and the concepts of the ontology, in which each WordNet synset either corresponds directly to a concept of the ontology or is a hyponym of some concept or an instance (element) of an ontology concept. Participants in another project, OntoWordNet, consider that formally gluing together a resource such as WordNet and a formal ontology is not enough: a significant restructuring of the source lexical resource is required.

3 MATERIALS AND METHODS

The primary task in creating an ontological KB of syntactic analysis is to build diagrams of the syntax classes of the Ukrainian language, which are then transformed into the taxonomy of concepts X. These diagrams are shown in Fig. 1-6.

When extracting data on the basis of an ontological KB and developing the rules of the syntax of the Ukrainian language for a consolidated information resource for the syntactic processing of text documents, it is necessary to focus first on the concept $\mathit{Syntax} = \langle \mathit{WordsCombination}, \mathit{Sentence} \rangle$. We represent the finite set of concepts of the formation of phrases in the Ukrainian language as a tuple:

$$\mathit{WordsCombination} = \langle \mathit{SignWords}_1, \mathit{SignWords}_2, \mathit{SignWords}_3, \mathit{SignWords}_4 \rangle, \tag{1}$$

where the signs of the formation of phrases in the Ukrainian language are divided into four main groups:

$$\begin{aligned}
\mathit{SignWords}_1 &= \langle \mathit{LexicalSign}, \mathit{SyntacticSign} \rangle, \\
\mathit{SignWords}_2 &= \langle \mathit{Noun}, \mathit{Adjective}, \mathit{Numeral}, \mathit{Pronoun}, \mathit{Verb}, \mathit{Adverb} \rangle, \\
\mathit{SignWords}_3 &= \langle \mathit{Coordinated}, \mathit{Inferior} \rangle, \\
\mathit{SignWords}_4 &= \langle \mathit{SimpleWord}, \mathit{ComplicatedWord} \rangle.
\end{aligned} \tag{2}$$

Figure 1 - A class diagram for representing the hierarchy of the classes "Syntax_PhraseCombination"

The composite signs of $\mathit{SignWords}_3$ are described by the following sets of concepts:

$$\begin{aligned}
\mathit{Coordinated} &= \langle \mathit{AdversativeComt}, \mathit{ConnectiveComt}, \mathit{DividingComt} \rangle, \\
\mathit{Inferior} &= \langle \mathit{ContactComt}, \mathit{ManagementComt}, \mathit{AdjoiningComt} \rangle.
\end{aligned} \tag{3}$$

Accordingly, the finite set of concepts of the formation of sentences in the Ukrainian language (Fig. 2) is represented as a tuple

$$\mathit{Sentence} = \langle \mathit{SignSent}_1, \mathit{SignSent}_2, \mathit{SignSent}_3, \mathit{SentenceMembers} \rangle, \tag{4}$$

where the signs of the formation of sentences in the Ukrainian language are divided into several main groups:

$$\begin{aligned}
\mathit{SignSent}_1 &= \langle \mathit{NarrativeSent}, \mathit{PronouncedSent}, \mathit{IncentiveSent} \rangle, \\
\mathit{SignSent}_2 &= \langle \mathit{EmotionallyNeutral}, \mathit{EmotionallyColored} \rangle, \\
\mathit{SignSent}_3 &= \langle \mathit{SimpleSent}, \mathit{ComplicatedSent} \rangle, \\
\mathit{SentenceMembers} &= \langle \mathit{MainSentMemb}, \mathit{SecondSentMemb} \rangle, \\
\mathit{NarrativeSent} &= \langle \mathit{AffirmativeSent}, \mathit{NegativeSent} \rangle.
\end{aligned} \tag{5}$$

To identify a simple sentence, the sentence must be analyzed against eight signs (Fig. 3), described by tuple (6). Classes for identifying the members of a sentence and a complex sentence were constructed similarly (Fig. 4-6).

$$\mathit{SimpleSent} = \langle \mathit{SgSpSt}_1, \mathit{SgSpSt}_2, \mathit{SgSpSt}_3, \mathit{SgSpSt}_4, \mathit{SgSpSt}_5, \mathit{SgSpSt}_6, \mathit{SgSpSt}_7, \mathit{SgSpSt}_8 \rangle. \tag{6}$$
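Tuples (1)-(6) together form a tree rooted at Syntax. As a rough illustration, the Python sketch below stores each tuple as a dictionary entry and walks the tree; the helper names `leaves` and `path_to` are assumptions made here, while the concept names are taken from the nomenclature.

```python
# Each tuple from (1)-(6) becomes a key whose value lists its components.
TAXONOMY = {
    "Syntax": ["WordsCombination", "Sentence"],
    "WordsCombination": ["SignWords1", "SignWords2", "SignWords3", "SignWords4"],
    "SignWords1": ["LexicalSign", "SyntacticSign"],
    "SignWords2": ["Noun", "Adjective", "Numeral", "Pronoun", "Verb", "Adverb"],
    "SignWords3": ["Coordinated", "Inferior"],
    "SignWords4": ["SimpleWord", "ComplicatedWord"],
    "Coordinated": ["AdversativeComt", "ConnectiveComt", "DividingComt"],
    "Inferior": ["ContactComt", "ManagementComt", "AdjoiningComt"],
    "Sentence": ["SignSent1", "SignSent2", "SignSent3", "SentenceMembers"],
    "SignSent1": ["NarrativeSent", "PronouncedSent", "IncentiveSent"],
    "SignSent2": ["EmotionallyNeutral", "EmotionallyColored"],
    "SignSent3": ["SimpleSent", "ComplicatedSent"],
    "SentenceMembers": ["MainSentMemb", "SecondSentMemb"],
    "NarrativeSent": ["AffirmativeSent", "NegativeSent"],
    "SimpleSent": [f"SgSpSt{i}" for i in range(1, 9)],
}


def leaves(concept):
    """Return the concepts under `concept` that are not decomposed further."""
    parts = TAXONOMY.get(concept)
    if parts is None:
        return [concept]
    result = []
    for part in parts:
        result.extend(leaves(part))
    return result


def path_to(target, root="Syntax"):
    """Return the chain of concepts from `root` down to `target`, or None."""
    if root == target:
        return [root]
    for part in TAXONOMY.get(root, []):
        tail = path_to(target, part)
        if tail:
            return [root] + tail
    return None


print(path_to("ManagementComt"))
print(len(leaves("WordsCombination")))
```

A plain dictionary is enough here because every concept has a single parent; a richer ontology with multiple inheritance would need a graph instead.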

One of the promising directions for further refinement of IS for PIR is the development of the methodological, ontological and logical foundations for designing KBs intended for the analysis of text documents. The ontological aspects cover a range of issues, from the scope of application to the formal description of the components of the computer ontology of the SA. The main thrust of the research is to formalize the stages of constructing, structuring and presenting the material of the SA during analysis and integrating it with the information resource of the problem space, which allows processed text materials to be combined effectively. In turn, implementing these stages effectively and obtaining the final result (a library ontological KB of the SA) is impossible without a system-ontological analysis of the given set of information LRs. The information model of the SA, which underlies the functioning of this system, must contain a time component in order to record its state over time. It follows that the information model has the following characteristics: decomposition of an entity depending on time parameters; recording of the status of an object (changes in the values of a subset of the object's attributes are registered as changes in its status); and object archiving (removing an object from the current state of the information model). The main context diagram reflects the external connections of the IS at the highest level (Fig. 7a).

Figure 2 - A class diagram for representing the hierarchy of the classes "Sentence"

Figure 3 - A class diagram for "Simple sentence"

Figure 4 - A class diagram for "The members of the sentence"

Figure 5 - A class diagram for "Adverbial modifier"

Figure 6 - A class diagram for "Complex sentence"

An object is a real entity of the subject area whose state changes over time. The developed SA contains the following classes of objects: Text and Worked Out Sentence. The interaction of these two classes of objects, that is, the structure of the IS, is shown in the context diagram in Fig. 7b. At a more detailed level of the information model, each object class contains a number of specific external entities, each of which describes its attributes and specifies the relationships between the classes of objects (Fig. 8).

Figure 7 - Diagram of the IS "Parsing Analysis": a - IDEF0; b - DFD


Figure 8 - Detailing of the process "Analysis"

The entities that make up the IS and are real objects of the specific SA are: IS, text and rule. The properties of the entities of the IS are given below (all attributes of the objects are static values).

1. Information system - describes the IS that processes the Ukrainian-language text.

2. Text - holds information about the text that is being analyzed.

3. Rule - the rules of the Ukrainian language that the analysis of the text follows.

Classes are at the center of most ontologies. Protégé and other frame systems describe an ontology declaratively, explicitly defining the class hierarchy and the membership of individuals in the corresponding classes. An ontology in OWL has components similar to those of frame-based ontologies. OWL terminology is based on the notions of individuals (objects) and properties, which generally correspond to class instances and slots in Protégé, respectively. Objects are individual instances of the subject area. An important difference between Protégé and OWL is that OWL does not use the Unique Name Assumption (UNA). This means that two different names can, in fact, refer to the same object. For example, the names "Queen Elizabeth", "Queen" and "Elizabeth Windsor" denote the same object. In OWL it must be stated explicitly that objects are the same as or different from each other; otherwise the names may refer either to the same object or to different objects. Properties are binary links between objects. For example, the property "has a color" links the object "gold" with the object "yellow", and the property "used in" links the object "gold" with the object "electrical engineering" (Fig. 9a).

Figure 9 - Presentation: a - properties of the object; b - class structures.
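The absence of the Unique Name Assumption can be illustrated with a small Python sketch. It is an illustration only, not how an OWL reasoner is implemented: the class `Individuals` and its methods are hypothetical names, and the sketch merely tracks explicit "same as" and "different from" statements, answering "unknown" when neither has been asserted.

```python
class Individuals:
    """Tracks which names are asserted to denote the same or different objects."""

    def __init__(self):
        self.parent = {}       # union-find forest over names
        self.different = []    # pairs of names asserted to be distinct

    def _find(self, name):
        # Follow parent links up to the representative of the name's group.
        self.parent.setdefault(name, name)
        while self.parent[name] != name:
            name = self.parent[name]
        return name

    def same_as(self, a, b):
        # Merge the groups of the two names.
        self.parent[self._find(a)] = self._find(b)

    def different_from(self, a, b):
        self.different.append((a, b))

    def compare(self, a, b):
        """Return 'same', 'different' or 'unknown' (no unique name assumption)."""
        ra, rb = self._find(a), self._find(b)
        if ra == rb:
            return "same"
        for x, y in self.different:
            if {self._find(x), self._find(y)} == {ra, rb}:
                return "different"
        return "unknown"


kb = Individuals()
kb.same_as("Queen Elizabeth", "Elizabeth Windsor")
kb.same_as("Queen", "Queen Elizabeth")
kb.different_from("Queen Elizabeth", "gold")
print(kb.compare("Queen", "Elizabeth Windsor"))
print(kb.compare("Queen", "gold"))
print(kb.compare("gold", "yellow"))
```

The third answer is the interesting one: without an explicit statement, two names are neither merged nor separated.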

In Protégé, properties are represented by slots; in description logic, by roles; in UML and other object-oriented notations, by links. Properties can have inverses. For example, the inverse of the property "has a color" is "is the color of". Properties can be functional (limited to a single value), transitive or symmetric. Classes in OWL are treated as sets of objects and are described formally (mathematically) so that membership in a particular class can be stated exactly. Classes are organized into a class-subclass hierarchy, or taxonomy. In OWL, a subclass relation means necessary implication: every member of the subclass is also a member of the superclass. For example, the objects "cast iron" and "steel" (Fig. 9b) belong to the class "iron-carbon alloys", which together with "metal-ceramic alloys" and "non-ferrous alloys" is a subclass of "alloys". When a deeper hierarchy is built, "cast iron" and "steel" are treated as separate classes with their own subclasses and objects. In OWL, classes are built from descriptions that specify the conditions an object must satisfy to be an instance of the class.
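The alloy and gold examples can be written down as plain sets of pairs and triples. The Python sketch below is a simplified illustration and not an OWL reasoner; the property names `hasColor`, `usedIn` and `isColorOf` and all function names are hypothetical spellings chosen here.

```python
# Subclass axioms as (subclass, superclass), using the alloy example.
SUBCLASS_OF = {
    ("iron-carbon alloys", "alloys"),
    ("metal-ceramic alloys", "alloys"),
    ("non-ferrous alloys", "alloys"),
}
# Asserted class membership of individual objects.
INSTANCE_OF = {("cast iron", "iron-carbon alloys"), ("steel", "iron-carbon alloys")}
# Binary properties as (subject, property, object) triples.
TRIPLES = {("gold", "hasColor", "yellow"), ("gold", "usedIn", "electrical engineering")}
# Hypothetical name for the inverse of hasColor.
INVERSE_OF = {"hasColor": "isColorOf"}


def superclasses(cls):
    """All classes reachable from `cls` through subclass axioms (transitive)."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for sub, sup in SUBCLASS_OF:
            if sub == current and sup not in found:
                found.add(sup)
                stack.append(sup)
    return found


def classes_of(individual):
    """Asserted classes plus every superclass implied by the hierarchy."""
    result = set()
    for ind, cls in INSTANCE_OF:
        if ind == individual:
            result.add(cls)
            result |= superclasses(cls)
    return result


def with_inverses(triples):
    """Add (o, inverse(p), s) for every property with a declared inverse."""
    extra = {(o, INVERSE_OF[p], s) for s, p, o in triples if p in INVERSE_OF}
    return triples | extra


print(sorted(classes_of("steel")))
print(("yellow", "isColorOf", "gold") in with_inverses(TRIPLES))
```

`classes_of` is where the subclass relation acts as an implication: membership in "iron-carbon alloys" automatically yields membership in "alloys".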

4 EXPERIMENTS

The main stage of the task is to create an ontological KB based on the rules of the syntax of the Ukrainian language for the further development of a consolidated LR for the syntactic processing of Ukrainian-language text documents. For this purpose, Protégé 3.4.7 was used to create a hierarchy of classes and subclasses of syntactic concepts based on the rules of the syntax of the Ukrainian language (Fig. 10a). Information about the selected class is displayed on the right side of the window. The upper part of this window allows users to add comments, labels and other annotations. The lower part displays the logical characteristics of the selected class, which are specified with the buttons of the panel that opens when the "Create new expression" icon is clicked (Fig. 10b).

An example of a logical class characteristic is the expression "a phrase with a coordinated connection has some conjunction" (Fig. 11a-e). The next step in building the KB is to enter the instances of the classes in the Individuals tab (Fig. 11e), for example, instances of the class of lexically unbound phrases. Next, the relationships between particular classes and subclasses are created in the Properties tab of the main panel (Fig. 11f). The created KB has the following relationships: compoundOf (consists of); hasConjunctive (has a connector); hasMembersOfTheSentense (has sentence members); hasPunctuation (has a punctuation mark). In the SWRL Rules tab, parsing rules are created in the Semantic Web Rule Language (SWRL) with a convenient expression editor (Fig. 12). The rules are written using the classes (subclasses), relationships and instances, which are combined with the operands available in the expression editor panel.
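Class conditions of the kind legible in the Protégé screenshots, such as `hasConjunctive some binder` and `compoundOf min 2 simple_1`, can be read as checks on an individual's property values. The Python sketch below evaluates them in a closed-world way and treats different names as different individuals, which is a simplification relative to OWL reasoning; the individuals `phrase_1`, `sentence_1`, `sentence_2` and the function names are hypothetical.

```python
def fillers(individual, prop, triples):
    """Objects linked to `individual` through property `prop`."""
    return [o for s, p, o in triples if s == individual and p == prop]


def some(prop, cls, individual, triples, types):
    """Existential restriction `prop some cls`: at least one filler is in cls."""
    return any(cls in types.get(o, set())
               for o in fillers(individual, prop, triples))


def at_least(n, prop, cls, individual, triples, types):
    """Cardinality restriction `prop min n cls` over distinct filler names."""
    matching = {o for o in fillers(individual, prop, triples)
                if cls in types.get(o, set())}
    return len(matching) >= n


# Hypothetical individuals used only to exercise the checks.
TRIPLES = [
    ("phrase_1", "hasConjunctive", "conj_1"),
    ("sentence_1", "compoundOf", "clause_1"),
    ("sentence_1", "compoundOf", "clause_2"),
    ("sentence_2", "compoundOf", "clause_3"),
]
TYPES = {
    "conj_1": {"binder"},
    "clause_1": {"simple_1"},
    "clause_2": {"simple_1"},
    "clause_3": {"simple_1"},
}

print(some("hasConjunctive", "binder", "phrase_1", TRIPLES, TYPES))
print(at_least(2, "compoundOf", "simple_1", "sentence_1", TRIPLES, TYPES))
print(at_least(2, "compoundOf", "simple_1", "sentence_2", TRIPLES, TYPES))
```

Under these assumptions `sentence_1` satisfies the "min 2" condition for a complex structure and `sentence_2` does not.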

One of the benefits of Protégé 3.4.7 is the ability to create queries through the Open SPARQL Query panel. Two more windows then appear at the bottom of the main window: Query, for the query itself, and Results, for the output of the query (after the Execute Query button is pressed) (Fig. 13). In Fig. 14a the query returns the subclasses of the complex sentence without conjunctions, and in Fig. 13b the query returns the subclasses of both the Sentence and Phrase classes and also sorts their comments in alphabetical order.
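What such a query computes can be sketched in plain Python over a small table of classes. This is not the authors' query: the class names follow the hierarchy legible in the screenshots, while the comments, the table `CLASSES` and the function name are hypothetical. A SPARQL query of comparable shape is shown in the comment, with prefix declarations omitted.

```python
# A comparable SPARQL query (prefix declarations omitted):
#   SELECT ?cls ?comment
#   WHERE { ?cls rdfs:subClassOf :complex . ?cls rdfs:comment ?comment }
#   ORDER BY ?comment

# (class, parent class, comment) rows; the comments are hypothetical.
CLASSES = [
    ("simple_1", "structure_1", "sentence with one grammatical basis"),
    ("complex", "structure_1", "sentence with two or more grammatical bases"),
    ("asyndetic", "complex", "complex sentence without conjunctions"),
    ("conjunctival", "complex", "complex sentence with conjunctions"),
]


def subclasses_sorted_by_comment(parent):
    """Direct subclasses of `parent`, ordered alphabetically by comment."""
    rows = [(name, comment) for name, p, comment in CLASSES if p == parent]
    return sorted(rows, key=lambda row: row[1])


for name, comment in subclasses_sorted_by_comment("complex"):
    print(name, "-", comment)
```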

Figure 10 - The Protégé 3.4.7 environment: a - creating a hierarchy of classes and subclasses, with comments, in the OWLClasses tab; b - opening the panel for creating the logical characteristics of classes

Figure 11 - Class: a - "Connective"; b - "Complex"; c - "Simple_1"; d - "Lexical"; e - "Widespread"; f - creating relationships between classes

Class expressions legible in the screenshots include: hasConjunctive some binder; compoundOf min 2 simple_1; (hasMembersOfTheSentense only main) and (hasMembersOfTheSentense some (addition or circumstance or definition)).

Figure 12 - Rules of the KB for parsing

[Screenshots of the SPARQL query panel in Protégé 3.4.7. a - the query SELECT ?subject FROM <#Syntax> WHERE { ?subject rdfs:subClassOf <#asyndetic> }, whose results include mixed_with_the_sentence and with_homogeneous_members_of_the_sentence. b - a query over <http://www.owl-ontologies.com/Ontology1253189272.owl> that selects the rdfs:comment values of the subclasses of <#Sentence> and <#combination_of_words> and sorts them with ORDER BY ASC(?value); the results pair class names such as emotional_colouring, purpose_of_utterances, sign_2 to sign_5, structure_1 and members_of_the_sentence with their Ukrainian comments.]

Figure 13 - Examples of sample requests

5 RESULTS

The proposed procedure for extracting data from Ukrainian-language text on the basis of parsing makes it possible to supplement the conceptual graphs of text documents in line with the context of the SA defined by the ontology. At the first stage, recognizing the content of a text document means recognizing the concepts and statements of the document by determining the degree of similarity of these concepts to their likely counterparts in the ontology of the IRS, taking into account the results of the parsing analysis. The set of recognized concepts is then supplemented from the ontology with all concepts linked to elements of the set by the generalizing "IS-A" relation one level up, as well as by other semantic links whose weight exceeds a given threshold value. This supplement provides the recognized text with the conceptual context of the given SA. The relationships between concepts in the text under study are, in turn, recognized and used to resolve ambiguity when terms with a similar name occur in the ontology in different contexts and, accordingly, with different meanings. The text recognized in this way, supplemented by the semantically related conceptual structure from the ontology, forms a connected graph of the semantic image of the text. Comparing the similarity of texts then reduces to calculating the semantic distance between the documents (Fig. 14).
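The supplementing step described above (one level up along "IS-A" links, plus other semantic links whose weight exceeds a threshold) can be sketched as follows. The data layout, the link weights and the function name expand_concepts are assumptions made for this illustration; the article does not specify how the ontology is stored or how the weights are obtained.

```python
# Hypothetical fragment of an ontology.
IS_A = {"Sentence": "Syntax", "combination_of_words": "Syntax"}  # child -> parent
RELATED = {                                                      # weighted semantic links
    ("Sentence", "punctuation"): 0.8,
    ("Sentence", "morphology"): 0.3,
}

def expand_concepts(recognized, is_a, related, threshold):
    """Add the direct IS-A parent of each recognized concept and every concept
    joined to it by a semantic link whose weight exceeds `threshold`."""
    expanded = set(recognized)
    for concept in recognized:
        parent = is_a.get(concept)
        if parent is not None:
            expanded.add(parent)          # one level up only
        for (a, b), weight in related.items():
            if weight > threshold:
                if a == concept:
                    expanded.add(b)
                elif b == concept:
                    expanded.add(a)
    return expanded

print(sorted(expand_concepts({"Sentence"}, IS_A, RELATED, threshold=0.5)))
```

With a threshold of 0.5, the weak link to morphology is ignored, while the parent class and the strongly linked concept are added to the recognized set.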

Figure 14 - The scheme of semantic comparison of documents (the final step is finding the distance between the conceptual graphs CG1 and CG2)

The procedure for comparing texts and ranking them by similarity is included in the general algorithm of the system that searches for text documents by a model document, and consists of the following steps:

1) constructing the weighted graph G of the text document;

2) constructing the weighted graph G' of the model document supplemented from the ontology, by applying to each vertex of the graph G the procedure of finding its parent according to the links between concepts;

3) constructing the graph G = G ∪ G', taking into account the results of the parsing analysis;

4) reducing the redundant elements of the graph;

5) constructing the graphs of the other documents to be ranked by applying steps 1-4;

6) calculating the three centers of importance of the graphs and the semantic distance between the graphs G and G'.

The effectiveness of the developed method is shown on an example analysis of annotations by comparing it with the method of the vector space model and the Dice coefficient [2]. An experiment was carried out with annotations of publications about the work of T. G. Shevchenko (Fig. 15); it showed that the proposed approach based on an adaptive ontology increases the accuracy of document search by an average of 20%. For this purpose, a query to the Internet was generated from the keywords of the model annotation. As a result, 25 annotations of publications were obtained from the corresponding sites. A comparative analysis of the annotations obtained against the model was made with three methods: the method of conceptual graphs (Montes-y-Gómez), the Dice coefficient (a variant of the vector space model) and the method of adaptive ontologies (Table 1). The effectiveness of these methods for information search is estimated by the precision of the search:

\[
\text{precision} = \frac{\text{number of relevant documents found (as judged by experts)}}{\text{number of all documents found (by the program)}}
\]

Table 1 - Results of comparison of methods

Method                                          Precision
Method based on the Dice coefficient            10/15 ≈ 0.67 (67 %)
Method based on the vector space model          9/12 = 0.75 (75 %)
Method of adaptive ontologies (developed)       11/12 ≈ 0.92 (92 %)

© Lytvyn V. V., Vysotska V. A., Hrendus M. H., 2018. DOI 10.15588/1607-3274-2018-3-16
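The precision values reported in Table 1 can be checked with a few lines of code. The sketch below only restates the arithmetic of the precision formula; the function name and the dictionary holding the counts are assumptions made for this illustration, while the counts themselves are those given in Table 1.

```python
from fractions import Fraction

def precision(relevant_found, total_found):
    """Share of the documents found by the program that experts judged relevant."""
    if total_found <= 0:
        raise ValueError("total_found must be positive")
    return Fraction(relevant_found, total_found)

# (relevant found, all found) for each method, as listed in Table 1.
TABLE_1 = {
    "Dice coefficient": (10, 15),
    "Vector space model": (9, 12),
    "Adaptive ontologies": (11, 12),
}

for method, (relevant, total) in TABLE_1.items():
    value = precision(relevant, total)
    print(f"{method}: {value} = {float(value):.3f}")
```

Exact fractions are used so that the rounding to percentages is a separate, visible step.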

6 DISCUSSION


The need for ontologies arises from the inability of existing tools to process natural language texts automatically and adequately. Creating thesauri does not solve the problem, since different user groups and communities use special terminology that others use in a different context when processing and analyzing information. Different communities also often use different notations for the same concepts. High-quality text processing therefore requires a detailed description of the SA with a set of logical connections that show the relationships between terms. Ontologies make it possible to present natural-language text in a form suitable for automatic processing. In addition, ontologies serve as an intermediary between the user and the IS, which makes it possible to formalize the terms shared by all users of a project.

This approach takes into account both the context of the documents and the semantic context of the terms and phrases they contain. This makes it possible to automate the search for the documents most relevant to the prototype request and to reject those that are of minor importance or do not correspond to the SA.

According to the results of the experiment, in 40% of cases the Dice comparison method ranked as most similar to the model those annotations that shared the largest number of words with it but corresponded least to the prototype in content. The method of the vector space model did not give a satisfactory result either. By contrast, taking prior information about the SA into account, by weighting the vertices and links of the conceptual graphs of the reference annotation and of the annotations under study, made it possible to select the annotation most relevant to the model.

This experiment illustrates the effectiveness of the developed approach for automating the search for the documents most relevant to the prototype query; the approach can be used in constructing intelligent meta-search systems.
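Why a word-overlap measure favors annotations with many common words regardless of their meaning can be seen from the measure itself. The sketch below uses the standard Sørensen-Dice coefficient on sets of words; the article does not give the exact variant or the text preprocessing used in the experiment, so both are assumptions, and the sample word lists are invented for the illustration.

```python
def dice_coefficient(words_a, words_b):
    """Sørensen-Dice similarity of two word collections, computed on their sets."""
    a, b = set(words_a), set(words_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty texts
    return 2 * len(a & b) / (len(a) + len(b))

model = "ontology of syntax rules".split()
candidate = "rules of chess ontology".split()

# Three of four words coincide, although the topics differ.
print(dice_coefficient(model, candidate))
```

The score depends only on shared tokens, so a text on an unrelated topic can still rank high, which is the behavior the weighted conceptual graphs are meant to correct.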

Figure 15 - The result of applying syntactic analysis to a Ukrainian-language sentence

[Step-by-step parse diagram for the sentence "У своїй найбільш важливій роботі він показує барвистий світ українського села в його неповторній привабливості" ("In his most important work he shows the colorful world of the Ukrainian village in its unique appeal"). Weights shown in the figure: 3; 2; 3; 2.6; 8.5; 6; 1; 6.9; 7.6; 12.8; 40.1; 35.8; 18.]

CONCLUSIONS

The article deals with the scientific and practical task of extracting data from Ukrainian-language content based on the ontological approach, taking into account the features of syntax and semantics of this language.

The scientific novelty of the results is that, for the first time, an ontological KB has been created on the basis of the rules of syntax of the Ukrainian language (RSUL) for the further development of a consolidated LR for the syntactic processing of Ukrainian-language text documents. As a result of the system analysis of the software, a KB containing consolidated information on RSUL and taking its features into account was designed for the first time. The ontological aspect of designing an analytical KB is considered, which is one of the important practical applications of ontological engineering. The proposed approach solves, and improves the results of solving, the following urgent tasks in the processing of text documents: automated development of analytical and syntactic KBs on the basis of linguistic-semantic analysis of large volumes of text using original tools (the source texts come from a variety of sources, for example from Ukrainian-language textbooks for the SA that have been tested in educational institutions); structuring of terms and concepts in the information resources of a specific SA; and a significant reduction in the effort required to compile analytical and syntactic KBs. For the first time, a data extraction method based on an ontological KB and RSUL has been created for forming a consolidated information resource for the syntactic processing of text documents. As a result, an ontological KB with RSUL was created. It serves as a solid foundation for the further development of an automated system for parsing Ukrainian-language texts, which is a relevant and highly complex task.

The practical significance of the results lies in the development of a software system that formalizes RSUL by means of Protégé 3.4.7 as an ontological knowledge base for processing natural-language texts in Ukrainian (for example, for categorization, abstracting, knowledge extraction and translation). The created KB is sufficiently developed and supports the following functions: creating a hierarchy of classes and subclasses of SA concepts; entering representatives of classes and subclasses, which extends the possibilities for understanding and using the KB; creating a system of links between classes and subclasses; creating and executing queries of various kinds; constructing rules for processing data; and applying IT in the development of applications.

Prospects for further research include developing rules for analyzing the semantics of Ukrainian texts for more efficient extraction of knowledge from Ukrainian-language Web resources. Evaluating the similarity of text documents by content, based on adapting the ontology to the user's SA, makes it possible to increase the efficiency of automated search for relevant documents. However, the developed method is not an alternative to well-known document search methods but a complement to them. For example, a keyword search can be run first, and its results can then be passed to the contextual search.

ACKNOWLEDGEMENTS


The work was carried out within the framework of the joint research of the Department of Information Systems and Networks of Lviv Polytechnic National University on the topic "Research, development and implementation of intelligent distributed information technologies and systems based on database resources, data warehouses, data and knowledge spaces in order to accelerate the processes of formation of modern information society". Research was also carried out within the research topics of the same department on the topic "Development of intelligent distributed systems based on the ontological approach for the integration of information resources".

REFERENCES

1. Lytvyn Vasyl, Vysotska Victoria, Chyrun Lyubomyr, Dosyn Dmytro Methods based on ontologies for information resources processing. Saarbrücken, LAP, 2016, 324 p.

2. Lytvyn V. V., Vysotska V. A., Dosyn D. H. Metody ta zasoby opratsyuvannya informatsiynykh resursiv na osnovi ontolohiy. Lviv, LA «Piramida», 2016, 404 p.

3. Gruber T. A translation approach to portable ontologies specifications, Knowledge Acquisition, 1993, Vol. 5 (2), pp. 199-220.

4. Gruber T. Toward Principles for the Design of Ontologies Used for Knowledge Sharing, International Journal of Human-Computer Studies, 1995, Vol. 43(5-6), pp. 907-928.

5. Guarino N. Formal Ontology, Conceptual Analysis and Knowledge Representation, International Journal of Human-Computer Studies, 1995, Vol. 43(5-6), pp. 625-640.

6. Sowa J. Conceptual Graphs as a universal knowledge representation, Semantic Networks in Artificial Intelligence, 1992, Vol. 23 (2-5), pp. 75-95.

7. Bulskov H., Knappe R., Andreasen R. On Querying Ontologies and Databases, Flexible Query Answering Systems, 2004, pp. 191-202.

8. Calli A., Gottlob G., Pieris A. Advanced processing for ontological queries, Very Large Databases : 36th International Conference, Singapore, September 13-17, 2010 : proceedings. Singapore, VLDB Endowment, 2010, Vol. 3, No. 1, pp. 554-565.

9. Galopin A., Bouaud J., Pereira S., Seroussi B. An Ontology-Based Clinical Decision Support System for the Management of Patients with Multiple Chronic Disorders, Studies in health technology and informatics, IMIA and IOS Press, 2015, pp. 275-279.

10. Zhao Tian An Ontology-Based Decision Support System for Interventions based on Monitoring Medical Conditions on Patients in Hospital Wards, Master Thesis in Information and Communication Technology IKT590, Spring. Grimstad, University of Agder, 2014, 125 p.

11. Ugon A., Sedki K., Kotti A., Seroussi B., Philippe C., Ganascia JG., Garda P., Bouaud J., Pinna A. Decision System Integrating Preferences to Support Sleep Staging, Studies in health technology and informatics, 2016, Vol. 228, pp. 514-518.

12. Rospocher M., Serafini L. An Ontological Framework for Decision Support, Part of the Lecture Notes in Computer Science book series, Semantic Technology : Second Joint International Conference, JIST 2012, Nara, Japan, December 2-4, 2012 : proceedings. Nara, Springer, 2012, Vol. 7774, pp. 239-254.

13. Rospocher M., Serafini L. Ontology-centric decision support, Semantic Technologies Meet Recommender Systems & Big Data, 2012, Vol. 919, pp. 61-72.

14. Sutton R. S., Barto A. G. Reinforcement Learning: An Introduction. Cambridge, Massachusetts London, England : A Bradford Book, The MIT Press, 2012, 320 p.

15. Otterlo van M., Wiering M. Reinforcement learning and markov decision processes, Reinforcement Learning. Berlin, Springer, 2012, pp. 3-42.

16. Chen J., Dosyn D., Lytvyn V., Sachenko A. Smart Data Integration by Goal Driven Ontology Learning, Advances in Big Data, 2016, pp. 283-292.

17. Wong W., Liu W., Bennamoun M. Ontology learning from text: A look back and into the future, ACM Computing Surveys (CSUR), 2012, Vol. 44(4):20, pp. 1-36.

18. Lytvyn V., Vysotska V., Pukach P., Bobyk I., Pakholok B. A method for constructing recruitment rules based on the analysis of a specialist's competences, Eastern-European Journal of Enterprise Technologies, 2016, Vol. 6/2(84), pp. 4-14.

19. Montes-y-Gómez M., Gelbukh A., López-López A. Comparison of Conceptual Graphs, Artificial Intelligence, Springer, 2000, Vol. 1793, pp. 548-556.

20. Su J., Vysotska V., Sachenko A., Lytvyn V., Burov Y. Information resources processing using linguistic analysis of textual content, Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), 9th IEEE International Conference, 2017, pp. 573-578.

21. Lytvyn V., Vysotska V., Veres O., Rishnyak I., Rishnyak H. Classification Methods of Text Documents Using Ontology Based Approach, Advances in Intelligent Systems and Computing. Springer, 2017, Vol. 512, pp. 229-240.

22. Lytvyn V., Pukach P., Bobyk I., Vysotska V. The method of formation of the status of personality understanding based on the content analysis, Eastern-European Journal of Enterprise Technologies, 2016, Vol. 5/2(83), pp. 4-12.

23. Lytvyn V., Vysotska V., Veres O., Rishnyak I., Rishnyak H. Content Linguistic Analysis Methods for Textual Documents Classification, Computer Science and Information Technologies: Proc. of the XI-th Int. Conf. (CSIT'2016), 2016, pp. 190-192.

24. Bisikalo O. V., Vysotska V. A. Identifying keywords on the basis of content monitoring method in ukrainian texts, Radio Electronics, Computer Science, Control, Vol. 1(36), 2016, pp. 74-83.

25. Bisikalo O. V., Vysotska V. A. Sentence syntactic analysis application to keywords identification Ukrainian texts, Radio Electronics, Computer Science, Control, Vol. 3(38), 2016, pp. 54-65.

26. Lytvyn V., Bobyk I., Vysotska V. Application of algorithmic algebra system for grammatical analysis of symbolic computation expressions of propositional logic, Radio Electronics, Computer Science, Control, Vol. 4(39), 2016, pp. 54-67.

27. Alieksieieva K., Berko A., Vysotska V. Technology of commercial web-resource management based on fuzzy logic, Radio Electronics, Computer Science, Control, Vol. 3(34), 2015, pp. 71-79.

28. Vysotska V. Linguistic Analysis of Textual Commercial Content for Information Resources Processing, Proceedings of the XIIIth International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science (TCSET), 2016, pp. 709-713.

29. Lytvyn V., Vysotska V., Burov Y., Veres O., Rishnyak I. The Contextual Search Method Based on Domain Thesaurus, Advances in Intelligent Systems and Computing, Vol. 689. Springer International Publishing AG, 2017, pp. 310-319.

30. Mykich K., Burov Y. Algebraic model for knowledge representation in situational awareness systems, Proceedings of the 11th International Scientific and Technical Conference Computer Sciences and Information Technologies (CSIT), 2016, pp. 165-167.

31. Mykich K., Burov Y. Uncertainty in situational awareness systems, Proceedings of the 13th International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science (TCSET), 2016, pp. 729-732.

32. Mykich K., Burov Y. Algebraic Framework for Knowledge Processing in Systems with Situational Awareness, Advances in Intelligent Systems and Computing. Springer, pp. 217-228.

33. Mykich K., Burov Y. Research of uncertainties in situational awareness systems and methods of their processing, Eastern-European Journal of Enterprise Technologies, 2016, Vol. 1(79), pp. 19-26.

34. Lytvyn V., Vysotska V., Pukach P., Bobyk I., Uhryn D. Development of a method for the recognition of author's style in the ukrainian language texts based on linguometry, stylemetry and glottochronology, Eastern-European Journal of Enterprise Technologies, 2017, Vol. 4/2(88), pp. 10-18.

35. Lytvyn V., Vysotska V., Veres O., Rishnyak I., Rishnyak The Risk Management Modelling in Multi Project Environment, Computer Science and Information Technologies: Proc. of the XII-th Int. Conf. CSIT'2017, 2017, pp. 32-35.

36. Korobchinsky M., Vysotska V., Chyrun L., Chyrun L. Peculiarities of Content Forming and Analysis in Internet Newspaper Covering Music News, Computer Science and Information Technologies: Proc. of the XII-th Int. Conf. CSIT'2017, 2017, pp. 52-57.

37. Naum O., Chyrun L., Kanishcheva O., Vysotska V. Intellectual System Design for Content Formation, Computer Science and Information Technologies: Proc. of the XII-th Int. Conf. CSIT'2017, 2017, pp. 131-138.

38. Lytvyn Vasyl, Vysotska Victoria, Peleshchak Ivan, Rishnyak Ihor, Peleshchak Roman Time Dependence of the Output Signal Morphology for Nonlinear Oscillator Neuron Based on Van der Pol Model, International Journal of Intelligent Systems and Applications (IJISA), 2018, Vol. 10, No. 4, pp. 8-17. DOI: 10.5815/ijisa.2018.04.02.

39. Lytvyn Vasyl, Vysotska Victoria, Dosyn Dmytro, Holoschuk Roman, Rybchak Zoriana Application of Sentence Parsing for Determining Keywords In Ukrainian Texts, Computer Science and Information Technologies: Proc. of the XII-th Int. Conf. CSIT'2017, 2017, pp. 326-331.

Received 14.01.2018.

Accepted 26.02.2018.

UDC 004.9

METHOD OF DATA EXTRACTION FROM UKRAINIAN-LANGUAGE CONTENT BASED ON THE ONTOLOGICAL APPROACH

Lytvyn V. V. - Dr. Sc. (Engineering), Professor, Head of the Department of Information Systems and Networks, Lviv Polytechnic National University, Lviv, Ukraine.

Vysotska V. A. - PhD (Engineering), Associate Professor, Associate Professor of the Department of Information Systems and Networks, Lviv Polytechnic National University, Lviv, Ukraine.

Hrendus M. H. - Assistant of the Department of Information Systems and Networks, Lviv Polytechnic National University, Lviv, Ukraine.

ABSTRACT

Context. Interest in applying intelligent systems (IS) is constantly growing in areas such as information technology (IT), mechanical engineering, medicine, biology, ecology, geography and jurisprudence. The architecture of modern IS is built around knowledge bases (KB), which are formed according to the subject area (SA) in which the IS is applied. The main part of a KB is an ontology: a clearly structured model of the SA, a systematized set of terms that explain the relationships between the objects of that SA. Ontologies are generally recognized and widely applied in fields such as knowledge engineering, knowledge representation, information retrieval, knowledge management, database design, information modeling and object-oriented analysis. In particular, Gartner, in its study of the IT market, ranked the use of taxonomies/ontologies third among the top ten leading technologies in the field. It is therefore relevant to study the structures of syntactic ontological KBs, to construct and study optimal algorithms for the syntactic analysis of Ukrainian-language texts, and to develop software and algorithmic tools for the syntactic processing of texts, for example for categorizing Internet content, automatic abstracting of texts, extracting knowledge from them, and translation into other languages.

Decision trees, the IDEF5 methodology and ontology construction methodologies were chosen as the methods for solving the problem of creating a consolidated resource based on an ontological KB. The results of the syntactic analysis are taken into account by the associative-semantic context analysis in order to optimize the construction of associative context links between the words and phrases of a sentence within the hierarchical network of the ontological KB.

Results. A consolidated information resource has been created: an ontological KB for the syntactic analysis of Ukrainian-language text documents, built with Protégé 3.4.7.

Conclusions. A data extraction method based on an ontological KB and the rules of syntax of the Ukrainian language (RSUL) has been created for the further development of a consolidated information resource for the syntactic processing of text documents. As a result, an ontological KB with RSUL has been created. The syntactic structure of the input sentence is the foundation and framework for the next, no less important step, semantic analysis. This ontological KB of the consolidated LR for the syntactic processing of Ukrainian-language text documents serves as a powerful basis for the further development of an automated IS for the syntactic analysis of Ukrainian-language texts.

KEYWORDS: analysis, content analysis, ontology, content management system.


Л1ТЕРАТУРА / ЛИТЕРАТУРА

1. Methods based on ontologies for information resources processing / [Vasyl Lytvyn, Victoria Vysotska, Lyubomyr Chyrun, Dmytro Dosyn]. - Saarbrücken : LAP, 2016. -324 p.

2. Литвин В. В. Методи та засоби опрацювання шформацшних ресурсгв на основ1 онтологш / В. В. Литвин, В. А. Висоцька, Д. Г. Досин. - Льв1в : ЛА «Шрамща», 2016. - 404 с.

3. Gruber T. A translation approach to portable ontologies specifications / T.Gruber // Knowledge Acquisition. -1993. - Vol. 5 (2). - P. 199-220.

4. Gruber T. Toward Principles for the Design of Ontologies Used for Knowledge Sharing / T. Gruber // International Journal Human-Computer Studies. - 1995. - Vol. 43(5-6). -Р. 907-928.

5. Guarino N. Formal Ontology, Conceptual Analysis and Knowledge Representation / N. Guarino // International Journal of Human-Computer Studies. - 1995. - Vol. 43 (56). - Р. 625-640.

6. Sowa J. Conceptual Graphs as a universal knowledge representation / J. Sowa // Semantic Networks in Artificial Intelligence. - 1992. - Vol. 23(2-5). - P. 75-95.

7. Bulskov H. On Querying Ontologies and Databases / H. Bulskov, R. Knappe, R. Andreasen // Flexible Query Answering Systems. - 2004. - P. 191-202.

8. Calli A. Advanced processing for ontological queries / A. Calli, G. Gottlob, A. Pieris // Very Large Databases : 36th International Conference, Singapore, September 13-17, 2010 : proceedings. - Singapore : VLDB Endowment, 2010. - Vol. 3, No. 1. - Р. 554-565.

9. An Ontology-Based Clinical Decision Support System for the Management of Patients with Multiple Chronic Disorders / [A. Galopin, J. Bouaud, S. Pereira, B. Seroussi] // Studies in health technology and informatics, IMIA and IOS Press. - 2015. - Р. 275-279.

10. Zhao Tian. An Ontology-Based Decision Support System for Interventions based on Monitoring Medical Conditions on Patients in Hospital Wards / Tian Zhao // Master Thesis in Information and Communication Technology IKT590, Spring. - Grimstad : University of Agder, 2014. - 125 p.

11. Decision System Integrating Preferences to Support Sleep Staging / [A. Ugon, K. Sedki, A. Kotti et al.] // Studies in health technology and informatics. - 2016. - Vol. 228. -P. 514-518.

12. Rospocher M. An Ontological Framework for Decision Support / M. Rospocher, L. Serafini // Part of the Lecture

Notes in Computer Science book series, Semantic Technology : Second Joint International Conference, JIST 2012, Nara, Japan, December 2-4, 2012 : proceedings. - Nara : Springer, 2012. - Vol. 7774. - P. 239-254.

13. Rospocher M. Ontology-centric decision support / M. Rospocher, L. Serafini // Semantic Technologies Meet Recommender Systems & Big Data. - 2012. - Vol. 919. -P. 61-72.

14. Sutton R. Reinforcement Learning: An Introduction / R. Sutton, A. Barto. - Cambridge, Massachusetts London, England : A Bradford Book, The MIT Press, 2012. - 320 p.

15. Otterlo van M. Reinforcement learning and markov decision processes / M. van Otterlo, M. Wiering // Reinforcement Learning. - Berlin : Springer, 2012. - P. 3-42.

16. Smart Data Integration by Goal Driven Ontology Learning / [J. Chen, D. Dosyn, V. Lytvyn, A. Sachenko] // Advances in Big Data. - 2016. - P. 283-292.

17. Wong W. Ontology learning from text: A look back and into the future / W. Wong, W. Liu, M. Bennamoun // ACM Computing Surveys (CSUR). - 2012. - Vol. 44(4):20. -P. 1-36.

18. A method for constructing recruitment rules based on the analysis of a specialist's competences / [V. Lytvyn, V. Vysotska, P. Pukach et al.] // Eastern-European Journal of Enterprise Technologies. - 2016. - Vol. 6/2(84). - P. 4-14.

19. Montes-y-Gómez M. Comparison of Conceptual Graphs / M. Montes-y-Gómez, A. Gelbukh, A. López-López // Artificial Intelligence. - Vol. 1793. - Springer, 2000. - P. 548556.

20. Information resources processing using linguistic analysis of textual content / [J. Su, V. Vysotska, A. Sachenko, V. Lytvyn, Y. Burov] // Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), 9th IEEE International Conference. -2017. - P. 573-578.

21. Classification Methods of Text Documents Using Ontology Based Approach / [V. Lytvyn, V. Vysotska, O. Veres, I. Rishnyak, H. Rishnyak] // Advances in Intelligent Systems and Computing. - Springer, 2017. - Vol. 512. -P. 229-240.

22. The method of formation of the status of personality understanding based on the content analysis / [V. Lytvyn, P. Pu-kach, I. Bobyk, V. Vysotska] // Eastern-European Journal of Enterprise Technologies. - 2016. - Vol. 5/2(83). - P. 4-12.

23. Content Linguistic Analysis Methods for Textual Documents Classification / [V. Lytvyn, V. Vysotska, O. Veres et al.] // Computer Science and Information Technologies: Proc. of the XI-th Int. Conf. (CSIT'2016). -2016. - P. 190192.

24. Bisikalo O. V. Identifying keywords on the basis of content monitoring method in ukrainian texts / O. V. Bisikalo, V. A. Vysotska // Radio Electronics, Computer Science, Control. -Vol. 1(36). - 2016. - P. 74-83.

25. Bisikalo O.V. Sentence syntactic analysis application to keywords identification Ukrainian texts / O. V. Bisikalo, V. A. Vysotska // Radio Electronics, Computer Science, Control. - Vol. 3(38) - 2016. - P. 54-65.

iНе можете найти то, что вам нужно? Попробуйте сервис подбора литературы.

26. Lytvyn V. Application of algorithmic algebra system for grammatical analysis of symbolic computation expressions of propositional logic / V. Lytvyn, I. Bobyk, V. Vysotska //

Radio Electronics, Computer Science, Control. -Vol. 4(39). - 2016. - P. 54-67.

27. Alieksieieva K. Technology of commercial web-resource management based on fuzzy logic / K. Alieksieieva, A. Berko, V. Vysotska // Radio Electronics, Computer Science, Control. - Vol. 3(34). - 2015. - P. 71-79.

28. Vysotska V. Linguistic Analysis of Textual Commercial Content for Information Resources Processing / V. Vysotska // Proceedings of the XIIIth International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science (TCSET). - 2016. - P. 709-713.

29. The Contextual Search Method Based on Domain Thesaurus / [V. Lytvyn, V. Vysotska, Y. Burov et al.] // Advances in Intelligent Systems and Computing. - Vol. 689. - Springer International Publishing AG 2017. - P. 310-319.

30. Mykich K. Algebraic model for knowledge representation in situational awareness systems / K. Mykich, Y. Burov // Proceedings of the 11th International Scientific and Technical Conference Computer Sciences and Information Technologies (CSIT). - 2016. - P. 165-167.

31. Mykich K. Uncertainty in situational awareness systems / K. Mykich, Y. Burov // Proceedings of the 13th International Conference on Modern Problems of Radio Engineering, Telecommunications and Computer Science (TCSET). -2016. - P. 729-732.

32. Mykich K. Algebraic Framework for Knowledge Processing in Systems with Situational Awareness / K. Mykich, Y. Burov // Advances in Intelligent Systems and Computing. - Springer. - P. 217-228.

33. Mykich K. Research of uncertainties in situational awareness systems and methods of their processing / K. Mykich, Y. Burov // EasternEuropean Journal of Enterprise Technologies. - Vol. 1(79). - 2016. - P. 19-26.

34. Development of a method for the recognition of author's style in the ukrainian language texts based on linguometry, stylemetry and glottochronology / [V. Lytvyn, V. Vysotska, P. Pukach et al.] // Eastern-European Journal of Enterprise Technologies. - Vol. 4/2(88). - 2017. - P. 10-18.

35. The Risk Management Modelling in Multi Project Environment / [V. Lytvyn, V. Vysotska, O. Veres et al.] // Computer Science and Information Technologies: Proc. of the XII-th Int. Conf. CSIT'2017. - 2017. - P. 32-35.

36. Peculiarities of Content Forming and Analysis in Internet Newspaper Covering Music News / [M. Korobchinsky, V. Vysotska, L. Chyrun, L. Chyrun] // Computer Science and Information Technologies: Proc. of the XII-th Int. Conf. CSIT'2017. - 2017. - P. 52-57.

37. Intellectual System Design for Content Formation / [O. Naum, L. Chyrun, O. Kanishcheva, V. Vysotska] // Computer Science and Information Technologies: Proc. of the XII-th Int. Conf. CSIT'2017. - 2017. - P. 131-138.

38. Time Dependence of the Output Signal Morphology for Nonlinear Oscillator Neuron Based on Van der Pol Model / [V. Lytvyn, V. Vysotska, I. Peleshchak et al.] // International Journal of Intelligent Systems and Applica-tions(IJISA). - Vol.10, No.4. - 2018. - P. 8-17.

39. Application of Sentence Parsing for Determining Keywords In Ukrainian Texts / Vasyl Lytvyn, Victoria Vysotska, Dmy-tro Dosyn, Roman Holoschuk, Zoriana Rybchak // Computer Science and Information Technologies: Proc. of the XII-th Int. Conf. CSIT'2017. - 2017. - P. 326-331.
