

Vladikavkaz Mathematical Journal, 2015, Volume 17, Issue 3, pp. 23–35

UDC 519.682.1+519.683+519.7+519.1

OBJECT-ORIENTED DATA AS PREFIX REWRITING SYSTEMS

A. E. Gutman

To S. S. Kutateladze on the occasion of his 70th birthday

A deterministic longest-prefix rewriting system is a rewriting system such that there are no rewriting rules $X \to Y$, $X \to Z$ with $Y \ne Z$, and only longest prefixes of words are subject to rewriting. Given such a system, analogs are defined and examined of some concepts related to object-oriented data systems: inheritance of classes and objects, instances of classes, class and instance attributes, conceptual dependence and consistency, conceptual scheme, types and subtypes, etc. Special attention is paid to the effective verification of various properties of the rewriting systems under consideration. In particular, algorithms are presented for answering the following questions: Are all words finitely rewritable? Do there exist recurrent words? Is the system conceptually consistent? Given two words $X$ and $Y$, does $X$ conceptually depend on $Y$? Does the type of $X$ coincide with that of $Y$? Is the type of $X$ a subtype of that of $Y$?

Mathematics Subject Classification (2000): 68Q42, 68P05, 68N19, 68T30.

Key words: prefix rewriting, term rewriting, object-oriented data system, information system, consistency verification, ontology of a data model.

1. Introduction

The classical object-oriented approach to describing structured data employs the two primary relations, has (or "has a") and is (or "is a").

The has relation links objects (and classes) with their attributes. By saying "X has a Y" we mean that the object (or class) X possesses an attribute named Y, and we thus can speak of "the Y of X" or "X's Y" as a property of X conventionally denoted by X.Y. For instance, if a web page has a submit button whose style assumes a border of a particular width, we can speak of "the width of the border of the style of the submit button of the page" and thus arrive at the object page.submitButton.style.border.width.

The is relation can be used for (1) instantiating objects from classes; (2) inheriting classes from classes; and (3) assigning values to attributes. By saying "40 is an Integer" we associate the object 40 with the class Integer and mean that 40 is an instance of Integer. The phrase "Integer is Number" means that the class Integer inherits from the class Number. By claiming that "John.age is 40" we assign the value 40 to the age attribute of John.

As is seen from the above examples, the wide interpretation of the is relation makes it possible to eliminate the difference between objects and classes. A single data system can syndicate the class declarations ("metadata") and the object instantiations and initializations ("data"). We do not assert that data and metadata are worth more combined than separated; nevertheless, this approach allows us to unify data analysis and develop a common tool for verifying conceptual and semantical consistency.

© 2015 Gutman A. E.

The is and has relations are naturally connected. By interpreting "is" as "inherits," we assume that if "X is Y" then all the attributes of Y are inherited by X. In particular, if "X is Y" and "Y has a Z" then "X has a Z." Moreover, if "X is Y" is the only explicit information on X, we can conclude that "X.Z is Y.Z." (By the explicit information we mean the is rules which form the data system under consideration.) In doing so, we derive an implicit information on X and say that "X.Z is Y.Z implicitly." Therefore, when evaluating the object X.Z, we rewrite its prefix X with Y according to the explicit rule "X is Y." The same is applicable to objects of any length. For instance, if we know explicitly that "A.B is P.Q.R" then A.B.C.D rewrites implicitly to P.Q.R.C.D.

It is clear that the explicit rules supersede any implicit derivatives; therefore, if "X is Y," "Y has a Z," and the data system contains the explicit rule "X.Z is A," the latter wins over the implicit "X.Z is Y.Z." However, a conflict of another kind is possible when several explicit rules are simultaneously applicable. Consider the following fragment of a data system:

block.style.color is blue

header.style.color is red

button is block

button.style is header.style

Let us try to evaluate the button's style color, i.e., button.style.color. Since button is a block, we might conclude that button.style.color is block.style.color, which is blue. On the other hand, button.style is header.style; therefore, button.style.color is header.style.color, which is red. Intuitively, the latter evaluation should win, since "button.style is header.style" seems to take priority over "button is block." The reason is not the fact that the former rule occurs next to the latter (we treat a data system as an unordered set of is rules). The key point is that the rule "button.style is header.style" is more concrete, as it evaluates a longer object, button.style rather than button. Therefore, when evaluating an object, we should rewrite the longest prefix (i.e., use the most concrete rule applicable).

We will now dwell on data consistency. Obviously, when designing a set of definitions, conceptual cycles should be avoided. By saying "man is man" we define nothing, since evaluation of man results in a dead cycle. However, conceptual consistency in no way outlaws recursion. For instance, the rule "man.son is man" is quite legal. On the other hand, the rule "man is man.son" seems incorrect: we still do not know what man is unless man's son is defined, while the latter is senseless prior to defining man. Furthermore, the rules "man is Adam" and "Adam.rib is man.rib" form an inconsistent pair, since Adam.rib is man.rib, while the latter implicitly rewrites to Adam.rib. Such examples justify the need for a formal definition of conceptual consistency and the search for the corresponding effective verification. (This is similar to analyzing the ontology of a data system as a set of concept definitions.)

It is clear that, prior to defining a set of concepts (classes or objects), we need at least one concept which does not require definition. In general, there can be several primary concepts; however, a single "generic object" is sufficient. We denote the latter by $\omega$. Given a data system and a word $X$ of the form $\mathit{entity}.\mathit{attr}_1.\mathit{attr}_2.\,\cdots.\mathit{attr}_n$, we rewrite $X$ by applying the most concrete is rule, thus obtaining a new word, and continue rewriting the longest prefixes of the subsequent words until $\omega$ is reached. In this case we conclude that the initial word $X$ is an object (or a concept). Otherwise, if the rewriting process either (1) ends with a nonrewritable word other than $\omega$ or (2) never terminates, we claim that $X$ is senseless. The possibility of (2) makes the analysis nontrivial and justifies the search for an effective verification of whether a given word makes sense. (This is close to analyzing the ontology of a concept within a data system.)

Given an object $X$ and a word $\delta$ of the form $\mathit{attr}_1.\mathit{attr}_2.\,\cdots.\mathit{attr}_n$, say that $\delta$ is a detail of $X$ if $X.\delta$ makes sense. The set $\|X\|$ of all details of $X$ can be regarded as the type of $X$. Whenever an algorithmic procedure assumes a formal argument $A$, the body of the procedure contains $A$ along with some words $A.\delta_j$. For the procedure to operate correctly with $X$ substituted for $A$, it is necessary (and probably sufficient) that all the words $X.\delta_j$ make sense. This results in the requirement that $X$ be of an appropriate type. Therefore, we need an algorithm for comparing object types: given two objects $X$ and $Y$, we should be able to effectively compare the types of $X$ and $Y$, i.e., determine which of the relations $\|X\| = \|Y\|$, $\|X\| \subseteq \|Y\|$, $\|X\| \supseteq \|Y\|$ hold. The problem is not trivial if for no other reason than the fact that the type of an object can be infinite. (For instance, given the rule "man.son is man," the type of man contains all the words son, son.son, son.son.son, ...)

In what follows we give formal definitions for the notions under consideration, state some results, and present algorithms for all the problems mentioned above. (The paper does not contain proofs of the theorems and justifications of the algorithms. All the details, including various examples, will be published elsewhere.)

To make notation less cumbersome, we treat the names of entities and attributes as single symbols (letters) of some alphabet $\mathcal{A}$ and agree to write the property paths $a_1.a_2.\,\cdots.a_n$ as $a_1 a_2 \dots a_n$, thus making them words over $\mathcal{A}$. The explicit rules "$X$ is $Y$" will be written as $X \to Y$.

2. Definitions and Main Results

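Throughout the paper, $\mathcal{A}$ is a finite alphabet and $\mathcal{A}^*$ (resp. $\mathcal{A}^+$) is the set of all (all nonempty) words over $\mathcal{A}$. The elements of $\mathcal{A}$ are called letters. We conventionally identify the letters with the corresponding single-letter words. Say that $X$ is a prefix (resp. a proper prefix) of $Y \in \mathcal{A}^+$ and write $X \sqsubseteq Y$ or $Y \sqsupseteq X$ (resp. $X \sqsubset Y$ or $Y \sqsupset X$) if $X \in \mathcal{A}^+$ and $Y = XS$ for some $S \in \mathcal{A}^*$ (resp. $S \in \mathcal{A}^+$). The length of a word $X$ is denoted by $|X|$. Given an integer $n \ge 1$ and a word $X \in \mathcal{A}^+$ such that $|X| \ge n$, define $X{\restriction}_n \in \mathcal{A}^+$ so that $X{\restriction}_n \sqsubseteq X$ and $|X{\restriction}_n| = n$. For brevity, in the sequel we say "word" instead of "nonempty word over $\mathcal{A}$."

Given any binary relation $\rightsquigarrow$, we conventionally denote by $\overset{+}{\rightsquigarrow}$ the transitive closure of $\rightsquigarrow$ and by $\overset{*}{\rightsquigarrow}$ the reflexive transitive closure of $\rightsquigarrow$.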

Consider a finite binary relation $\to$ on $\mathcal{A}^+$ (i.e., a finite subset of $\mathcal{A}^+ \times \mathcal{A}^+$) and a letter $\omega \in \mathcal{A}$. Say that the pair $(\to, \omega)$ is a deterministic longest-prefix rewriting system, or a system for short, if $\to$ is nonempty and the following hold:

(a) $X \to Y$ and $X \to Z$ imply $Y = Z$;

(b) there are no $S, Y \in \mathcal{A}^*$ such that $\omega S \to Y$.

Put $\mathcal{E} := \{X : X \to Y \text{ for some } Y\}$ and call the elements of $\mathcal{E}$ explicit words. Say that $E$ is an explicit prefix of $X$ if $E \in \mathcal{E}$ and $E \sqsubseteq X$. Say that a word $X$ is rewritable if $X$ has an explicit prefix.

As is easily seen, condition (a) means that, for each $E \in \mathcal{E}$, there is a unique word $E'$ such that $E \to E'$, while condition (b) amounts to the fact that all words of the form $\omega S$, with $S \in \mathcal{A}^*$, are not rewritable.

Given a rewritable word $X$, consider the longest explicit prefix $E$ of $X$, determine the suffix $S \in \mathcal{A}^*$ so that $X = ES$, and put $X' := E'S$. We call $X'$ the rewrite of $X$. Introduce the binary relation $\Rightarrow$ on $\mathcal{A}^+$ by setting $X \Rightarrow Y$ if and only if $X$ is a rewritable word and $Y = X'$.

By way of recursion, put $W_0 := \mathcal{A}^+$, $X^{(0)} := X$ for $X \in W_0$ and, for each $n \ge 1$, put $W_n := \{X \in W_{n-1} : X^{(n-1)} \text{ is rewritable}\}$ and $X^{(n)} := (X^{(n-1)})'$ for $X \in W_n$. The word $X^{(n)}$ is called the $n$th rewrite of $X$. It is clear that, for each $X \in W_n$, we have $X = X^{(0)} \Rightarrow X^{(1)} \Rightarrow \cdots \Rightarrow X^{(n)}$, $X \overset{+}{\Rightarrow} X^{(n)}$ for $n > 0$, and $X \overset{*}{\Rightarrow} X^{(n)}$ for $n \ge 0$.

The elements of $\bigcap_{n} W_n$ are called infinitely rewritable words. The other words are finitely rewritable. Given a word $X$, call the maximal (finite or infinite) sequence of the form $(X^{(0)}, X^{(1)}, X^{(2)}, \dots)$ the rewriting sequence of $X$. Therefore, a word $X$ is finitely (infinitely) rewritable if and only if the rewriting sequence of $X$ is finite (infinite).

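Say that $X \in \mathcal{A}^+$ is an object if $X \overset{*}{\Rightarrow} \omega$. Let $\mathcal{O}$ be the set of all objects. Note that the rewriting sequence of every object $X \in \mathcal{O} \setminus \{\omega\}$ has the form $X = X^{(0)} \Rightarrow \cdots \Rightarrow X^{(n)} \to \omega$, where $n \ge 0$.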
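As a small worked illustration (the alphabet and the rules are chosen here for the example and are not taken from the text), write $\Rightarrow$ for one-step longest-prefix rewriting and $\omega$ for the generic object, and consider the alphabet $\{m, s, \omega\}$ with the two explicit rules $m \to \omega$ and $ms \to m$, which encode "man is $\omega$" and "man.son is man." The word $mss$ has the explicit prefixes $m$ and $ms$; the longest one is $ms$, with the remaining suffix $s$, so the rewrite of $mss$ is $m \cdot s = ms$. Continuing in the same way gives

$$mss \;\Rightarrow\; ms \;\Rightarrow\; m \;\Rightarrow\; \omega,$$

so $mss$ is an object. By contrast, the word $\omega s$ is not rewritable by condition (b) and differs from $\omega$; its rewriting sequence consists of $\omega s$ alone, and $\omega s$ is not an object. The word $sm$ is not an object either: its only prefixes are $s$ and $sm$, and neither of them is explicit.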

From the above notation and definitions it is clear that we treat a system $(\to, \omega)$ as a rewriting system and assume that only longest prefixes of words are subject to rewriting. The system is then regarded as a recognition device, with $\mathcal{O}$ the accepted language (see [1]). We have called such a system "deterministic," since every rewritable word has a unique rewrite.

We may regard the notion of an object as isolating "concepts" from "senseless words." An object is a word $X$ possessing a "meaning," the rewrite $X' = X^{(1)}$, which also possesses a meaning, $(X')' = X^{(2)}$, and so on up to the final rewrite, the "generic object" $\omega$, whose meaning is assumed predefined. The relation $\to$ is thus treated as conceptual definition, and a rule $X \to Y$ is regarded as a definition of $X$ via $Y$: "$X$ is a $Y$." Next, a rule $Xa \to Z$ is an attribute definition, "the $a$ of $X$ is a $Z$," while $Xab \to Z$ means "the $b$ of the $a$ of $X$ is a $Z$," etc. In this respect, condition (a) imposed on the relation $\to$ amounts to conceptual unambiguity (no concept can have several meanings).

We may also treat the relation $\to$ as object-oriented inheritance or instantiation and regard a rule $X \to Y$ as an explicit indication of the fact that "class $X$ directly inherits class $Y$" or "object $X$ is an instance of class $Y$." Next, a rule $Xa \to Z$ may be regarded as an attribute declaration or property evaluation: "class $X$ has attribute $a$ of class $Z$" or "the property $Xa$ has value $Z$" or "the property $Xa$ is an instance of class $Z$." In this respect, having imposed condition (a) on the relation $\to$, we thereby disallowed multiple inheritance (therefore, no object can belong to several incomparable classes).

Introduce the binary relation $\Rightarrow_w$ on $\mathcal{A}^+$ by setting $X \Rightarrow_w Y$ if and only if $X = ES$ and $Y = E'S$ for some $E \in \mathcal{E}$, $S \in \mathcal{A}^*$. Therefore, $\Rightarrow_w$ is the rewriting corresponding to the system $(\to, \omega)$ regarded as an ordinary prefix rewriting system rather than a longest-prefix rewriting system. (It is clear that $X \to Y$ implies $X \Rightarrow Y$, and $X \Rightarrow Y$ implies $X \Rightarrow_w Y$. We may thus read the formulas $X \overset{+}{\Rightarrow} Y$ and $X \overset{+}{\Rightarrow}_w Y$ as "$X$ rewrites to $Y$" and "$X$ weakly rewrites to $Y$." The formula $X \to Y$ can be read as "$X$ explicitly rewrites to $Y$.")

Introduce the binary relation $\succ$ on $\mathcal{A}^+$ by setting $X \succ Y$ if and only if $X \sqsupset Y$ or $X \Rightarrow_w Y$. As is easily seen, the transitive closure $\overset{+}{\succ}$ is the least transitive relation on $\mathcal{A}^+$ possessing the following three properties for all $X, Y, S \in \mathcal{A}^+$:

if $X \to Y$ then $X \overset{+}{\succ} Y$; if $X \to Y$ then $XS \overset{+}{\succ} YS$; $XS \overset{+}{\succ} X$.

In case $X \overset{+}{\succ} Y$ we say that $X$ depends on $Y$. A word $X$ is well-defined if $X$ does not depend on $X$. Say that a system $(\to, \omega)$ under consideration is conceptually consistent if all words are well-defined, i.e., no word depends on itself. For brevity, introduce the following named condition:

The system is conceptually consistent. (Con)

The above terminology is justified by our informal treatment of a rewriting rule $X \to Y$ as a definition of $X$ via $Y$ ("$X$ is a $Y$") and regarding a rule $XS \to Z$ as a detail definition ("the $S$ of $X$ is a $Z$"). Therefore, informally, the relation $X \overset{+}{\succ} Y$ can be understood as follows: the definition of $X$ explicitly or implicitly employs $Y$; in particular, when subsequently describing a conceptual scheme, the concept $Y$ should be introduced before $X$, otherwise $X$ becomes ill-defined.

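If $\rightsquigarrow$ is a binary relation on $\mathcal{A}^+$, put $|{\rightsquigarrow}| := \{X \in \mathcal{A}^+ : E \overset{*}{\rightsquigarrow} X \text{ for some } E \in \mathcal{E}\}$ and denote by $[\rightsquigarrow]$ the directed graph whose nodes are the words in $|{\rightsquigarrow}|$ and arcs are the pairs $(X, Y)$ such that $X, Y \in |{\rightsquigarrow}|$ and $X \rightsquigarrow Y$.

Given a system, we call $[\succ]$ the conceptual scheme and $[\Rightarrow_w]$ the weak rewriting scheme. Since $X \Rightarrow_w Y$ implies $X \succ Y$, the weak rewriting scheme is a subgraph of the conceptual scheme.

Proposition 1. Given $X, Y \in \mathcal{A}^+$, we have $X \overset{+}{\succ} Y$ if and only if $X \sqsupset Y$ or $X \overset{+}{\Rightarrow}_w YS$ for some $S \in \mathcal{A}^*$.

Say that a word $X$ is weakly recurrent if $X \overset{+}{\Rightarrow}_w XS$ for some $S \in \mathcal{A}^*$.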

Corollary 2. A word is well-defined if and only if it is not weakly recurrent.
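For illustration, the inconsistent pair of rules from the Introduction, "man is Adam" and "Adam.rib is man.rib," can be checked against Corollary 2. The single-letter encoding below is an assumption of this example: $m$ stands for man, $d$ for Adam, $r$ for rib, and $\Rightarrow_w$ denotes weak rewriting, in which any explicit prefix (not necessarily the longest) may be rewritten. The explicit rules are $m \to d$ and $dr \to mr$, and

$$dr \;\Rightarrow_w\; mr \;\Rightarrow_w\; dr,$$

where the first step applies the rule $dr \to mr$ to the whole word and the second applies $m \to d$ to the prefix $m$ of $mr$. Hence $dr$ weakly rewrites to $drS$ with $S$ empty in two steps, so $dr$ is weakly recurrent and therefore not well-defined; the same holds for $mr$.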

Theorem 3. The following properties of a system are equivalent:

(1) all words are well-defined, i.e., (Con) holds;

(2) each explicit word is well-defined;

(3) there are no weakly recurrent explicit words;

(4) there are no weakly recurrent words;

(5) the conceptual scheme is acyclic;

(6) the conceptual scheme is acyclic and finite;

(7) the weak rewriting scheme is acyclic and finite.

Say that a word $X$ is recurrent if $X \overset{+}{\Rightarrow} XS$ for some $S \in \mathcal{A}^*$. Introduce the following named condition:

There are no recurrent words. (Rec)

Proposition 4. (Con) implies (Rec).

Put $\mathcal{A}_{\mathcal{E}} := \min\{A \subseteq \mathcal{A} : \mathcal{E} \subseteq A^+\}$, i.e., $\mathcal{A}_{\mathcal{E}}$ is the explicit alphabet, the set of all letters occurring in explicit words. In addition, put $\ell := \max\{|E| : E \in \mathcal{E}\}$.

Theorem 5. If each word $X \in \mathcal{A}_{\mathcal{E}}^+$ with $|X| \le \ell$ is not recurrent then all words are not recurrent, i.e., (Rec) holds.

Remark 6. Let $B(\mathcal{S})$ be a set of words defined via a system $\mathcal{S}$ by some condition. Say that $B(\mathcal{S})$ is a recurrence basis if, given an arbitrary system $\mathcal{S}$, nonrecurrence of all words in $B(\mathcal{S})$ implies nonrecurrence of all words. Theorem 5 states that the set $\{X \in \mathcal{A}_{\mathcal{E}}^+ : |X| \le \ell\}$ is a recurrence basis. Despite its finiteness, the set can be rather large. However, we are not aware of conditions which determine considerably smaller recurrence bases. (There are examples showing that neither the set $\mathcal{E}$ of explicit words, nor the set of all prefixes of the explicit words can serve as a recurrence basis.)

Introduce the following named condition:

All words are finitely rewritable. (Fin)

Theorem 7. If each explicit word is finitely rewritable then all words are finitely rewritable, i.e., (Fin) holds.

Theorem 8. (Rec) implies (Fin).

Therefore, according to Proposition 4 and Theorem 8, we have the implications (Con) $\Longrightarrow$ (Rec) $\Longrightarrow$ (Fin). As examples show, the converse implications are not true in general. Introduce the following two named conditions:

All explicit words are objects, i.e., $\mathcal{E} \subseteq \mathcal{O}$. (Obj)

If $X \in \mathcal{A}^+$, $a \in \mathcal{A}$, and $Xa \in \mathcal{E}$ then $X \in \mathcal{O}$. (PreObj)

Proposition 9. (1) Let $X, Y \in \mathcal{A}^+$, $X \overset{*}{\Rightarrow} Y$. Then $X \in \mathcal{O}$ if and only if $Y \in \mathcal{O}$.

(2) Assume (Obj). Then, given $X \in \mathcal{A}^+$, we have $X \in \mathcal{O} \setminus \{\omega\}$ if and only if $X \overset{*}{\Rightarrow} E$ for some $E \in \mathcal{E}$.

(3) Assume (PreObj). If $X, S \in \mathcal{A}^+$ and $XS \in \mathcal{O}$ then $X \in \mathcal{O}$.

Let $X \in \mathcal{O}$ and $a \in \mathcal{A}$. Say that $a$ is an attribute of $X$ if $Xa \in \mathcal{O}$. Denote by $\|X\|_1$ the set of all attributes of $X$. It is clear that $\|\omega\|_1 = \varnothing$.

Say that $a$ is an explicit attribute (resp. implicit attribute) of $X$ if $Xa \in \mathcal{O} \cap \mathcal{E}$ (resp. $Xa \in \mathcal{O} \setminus \mathcal{E}$).

Say that $a$ is an overriding attribute (resp. added attribute) of $X$ if $Xa \in \mathcal{O} \cap \mathcal{E}$ and, in addition, $X'a \in \mathcal{O}$ (resp. $X'a \notin \mathcal{O}$). If $a$ is an overriding attribute of $X$, we say that $Xa$ overrides $X'a$.

Therefore, every attribute is either explicit or implicit, and every explicit attribute is either overriding or added.

Proposition 10. For all $X \in \mathcal{O} \setminus \{\omega\}$ and $a \in \mathcal{A}$ the following hold:

(1) $a$ is an implicit attribute of $X$ if and only if $Xa \notin \mathcal{E}$ and $X'a \in \mathcal{O}$;

(2) $a$ is an added attribute of $X$ if and only if $Xa \in \mathcal{O}$ and $X'a \notin \mathcal{O}$.

Proposition 11. Assume (Obj). If $X, Y \in \mathcal{A}^+$, $a \in \mathcal{A}$, $X \Rightarrow Y$, and $Ya \in \mathcal{O}$ then $Xa \in \mathcal{O}$.

Proposition 12. Assume (Obj). Consider the rewriting sequence $X = X^{(0)} \Rightarrow \cdots \Rightarrow X^{(n)} \Rightarrow \omega$ of an object $X$. If $a \in \|X\|_1$ then there is a number $0 \le i \le n$ such that $X^{(0)}a, \dots, X^{(i)}a \in \mathcal{O}$, $X^{(i+1)}a, \dots, X^{(n)}a \notin \mathcal{O}$, and $a$ is an added attribute of $X^{(i)}$.

Corollary 13. Assume (Obj). A letter $a$ is an attribute of an object $X$ if and only if there is a number $n \ge 0$ such that $X^{(n)}a \in \mathcal{E}$.

Given an object $X$, say that $\delta$ is a detail of $X$ if $\delta \in \mathcal{A}^+$ and $X\delta \in \mathcal{O}$. Denote by $\|X\|$ the set of all details of $X$ and call $\|X\|$ the type of $X$. (It is clear that $\|\omega\| = \varnothing$.) Note that the set $\mathcal{O}$ of all objects can be infinite and, moreover, some object types $\|X\|$ can be infinite. On the other hand, we will see that the set $\{\|X\| : X \in \mathcal{O}\}$ of all object types is always finite (see Theorem 27).

Proposition 14. For all objects $X$ and $Y$ we have

(1) if $\|X\| = \|Y\|$ then $\|X\delta\| = \|Y\delta\|$ for all $\delta \in \|X\|$;

(2) if $\|X\| \subseteq \|Y\|$ then $\|X\delta\| \subseteq \|Y\delta\|$ for all $\delta \in \|X\|$.

On assuming (PreObj), we also have

(3) $\|X\| = \|Y\|$ if and only if $\|Y\|_1 = \|X\|_1$ and $\|Xa\| = \|Ya\|$ for all $a \in \|X\|_1$;

(4) $\|X\| \subseteq \|Y\|$ if and only if $\|X\|_1 \subseteq \|Y\|_1$ and $\|Xa\| \subseteq \|Ya\|$ for all $a \in \|X\|_1$.

Proposition 15. If $X, Y \in \mathcal{O}$, $X \Rightarrow Y$, and $XS \notin \mathcal{E}$ for all $S \in \mathcal{A}^+$ then $\|X\| = \|Y\|$.

If we informally interpret the relation $X \overset{+}{\Rightarrow} Y$ as "$X$ inherits $Y$" (or "$X$ is a particular case of $Y$" or "$X$ is a $Y$") and treat the objects $Xa$ as "properties of $X$," then in case $X \overset{+}{\Rightarrow} Y$ the object $X$ should in a sense inherit the properties of $Y$ and optionally make them more concrete and enlarge their totality. Formally, this requirement amounts to the following:

if $X, Y \in \mathcal{O}$ and $X \overset{+}{\Rightarrow} Y$ then $\|X\| \supseteq \|Y\|$. (*)

Introduce the following named condition:

If $X, Y \in \mathcal{A}^+$, $a \in \mathcal{A}$, $Xa \in \mathcal{E}$, $Ya \in \mathcal{O}$, and $X \Rightarrow Y$ then $Xa \in \mathcal{O}$ and $\|Xa\| \supseteq \|Ya\|$. (CoInh)

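Theorem 16. Assume (PreObj). The following are equivalent:

(1) condition (CoInh) is satisfied;

(2) if $X, Y \in \mathcal{O} \setminus \{\omega\}$ and $X \Rightarrow Y$ then $\|X\| \supseteq \|Y\|$;

(3) if $X, Y \in \mathcal{O}$ and $X \overset{*}{\Rightarrow} Y$ then $\|X\| \supseteq \|Y\|$.

Therefore, with (PreObj) satisfied, (*) is equivalent to (CoInh).

Corollary 17. Assume (PreObj) and (CoInh). If $X, Y, S \in \mathcal{A}^+$, $X \overset{*}{\Rightarrow} Y$, and $YS \in \mathcal{O}$ then $XS \in \mathcal{O}$ and $\|XS\| \supseteq \|YS\|$. In particular, if $Y \in \mathcal{O}$ and $X \overset{*}{\Rightarrow} Y$ then $\|Y\|_1 \subseteq \|X\|_1$ and $\|Xa\| \supseteq \|Ya\|$ for all $a \in \|Y\|_1$.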

Object systems with attribute value typing usually satisfy the following (or analogous) requirements: Suppose that a class $Y$ has a declared attribute $a$ with value type $t$. If $x$ is an instance of $Y$ then $x$ has attribute $a$ whose value type is equal to or more concrete than $t$. Similarly, if $X$ is a class inherited from $Y$ then $X$ has the inherited attribute $a$ whose value type is equal to or more concrete than $t$. If we interpret the relation $X \Rightarrow Y$ as "the object $X$ is an instance of the class $Y$" or "the class $X$ directly inherits the class $Y$" and treat the relation $Xa \to V$ as "$V$ is the explicit value of the property $Xa$" or "$V$ is the declared value class of the attribute $a$ within the class $X$," then the above requirements can be formalized by the following named condition:

If $X, Y \in \mathcal{A}^+$, $a \in \mathcal{A}$, $Ya \in \mathcal{O}$, $Xa \to V$, and $X \Rightarrow Y$ then $V \in \mathcal{O}$ and $\|V\| \supseteq \|Ya\|$. (CoVal)

Theorem 18. Conditions (PreObj) and (CoVal) imply (CoInh).

The conceptual dependence was introduced above as a relation on the set of all words. As soon as non-object words are regarded as "senseless," it is reasonable to describe dependence between objects involving objects only; namely, if a concept X depends on a concept Y, then there should be a chain of concepts (rather than arbitrary words) connecting X with Y. This principle is justified by the following theorem (see also Corollary 20).

Theorem 19. Assume (Obj), (PreObj), and (CoInh). Given $X, Y \in \mathcal{O}$, $X$ depends on $Y$ if and only if there exist $X_1, \dots, X_n \in \mathcal{O}$, $n \ge 1$, such that $X \succ X_1 \succ \cdots \succ X_n = Y$.

Corollary 20. Assume (Obj), (PreObj), and (CoInh). The restriction of $\overset{+}{\succ}$ onto $\mathcal{O}$ is the least transitive relation on $\mathcal{O}$ possessing the following three properties for all $X, Y \in \mathcal{O}$ and $\delta \in \mathcal{A}^+$:

if $X \to Y$ then $X \overset{+}{\succ} Y$;

if $X \to Y$ and $\delta \in \|X\| \cap \|Y\|$ then $X\delta \overset{+}{\succ} Y\delta$;

if $\delta \in \|X\|$ then $X\delta \overset{+}{\succ} X$.

The last assertion shows that, with (Obj), (PreObj), and (CoInh) satisfied, the conceptual dependence relation between objects can be described in full conformity with the initial definition of this relation on the set of all words. The only distinction consists in the fact that the latter description does not go beyond the set of objects.

Remark 21. In Theorem 19 and Corollary 20, none of the conditions (Obj), (PreObj), or (CoInh) can be omitted.

3. Algorithmization

The rest of the paper is devoted to the effective verification of various properties of rewriting systems under consideration, and the following theorem is the main step in this direction.

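Theorem 22. Given a system, put $\ell := \max\{|E| : E \in \mathcal{E}\}$. A word $X$ is infinitely rewritable if and only if one of the following two (mutually exclusive) conditions holds:

(a) there are integers $n \ge 0$ and $r > 0$ such that $X^{(n)} = X^{(n+r)}$;

(b) there are integers $n \ge 0$ and $r > 0$ such that

$$\ell \le |X^{(n)}| \le |X^{(n+1)}|, \dots, |X^{(n+r)}|, \qquad X^{(n)} \ne X^{(n+r)}, \qquad X^{(n)}{\restriction}_\ell = X^{(n+r)}{\restriction}_\ell.$$

In case (a) we have

$$X \overset{*}{\Rightarrow} \underbrace{X^{(n)} \Rightarrow \cdots \Rightarrow X^{(n+r-1)}}_{\text{period}} \Rightarrow X^{(n)} \Rightarrow \cdots$$

In case (b) put $Y := X^{(n)}{\restriction}_\ell$ and let $S \in \mathcal{A}^*$ be such that $X^{(n)} = YS$. Then there is a word $R \in \mathcal{A}^+$ such that $Y^{(r)} = YR$ and the rewriting sequence $(X^{(0)}, X^{(1)}, \dots)$ contains a subsequence constituted by the words $X^{(n+mr)} = YR^mS$, $m \ge 0$, each of which starts a regular "growth period" of length $r$:

$$\begin{aligned}
X \overset{*}{\Rightarrow}{} & X^{(n)} = YS \Rightarrow Y^{(1)}S \Rightarrow \cdots \Rightarrow Y^{(r-1)}S \\
\Rightarrow{} & X^{(n+r)} = YRS \Rightarrow Y^{(1)}RS \Rightarrow \cdots \Rightarrow Y^{(r-1)}RS \\
\Rightarrow{} & X^{(n+2r)} = YR^2S \Rightarrow Y^{(1)}R^2S \Rightarrow \cdots \Rightarrow Y^{(r-1)}R^2S \Rightarrow \cdots \\
\Rightarrow{} & X^{(n+mr)} = YR^mS \Rightarrow Y^{(1)}R^mS \Rightarrow \cdots \Rightarrow Y^{(r-1)}R^mS \Rightarrow \cdots
\end{aligned}$$

In particular, $\{X^{(n)}, X^{(n+1)}, \dots\} = \{Y^{(j)}R^mS : 0 \le j < r,\ m \ge 0\}$.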
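Two toy systems (chosen here for illustration and not taken from the text) show the two cases of Theorem 22. For case (a), take the rules $a \to b$ and $b \to a$: the word $X = a$ satisfies $X^{(0)} = X^{(2)} = a$, so the rewriting sequence is periodic with $n = 0$ and $r = 2$. For case (b), take the single rule $a \to ab$, so that the longest explicit word has length $1$. Starting from $X = a$, each step rewrites the prefix $a$ and keeps the accumulated suffix:

$$a \;\Rightarrow\; ab \;\Rightarrow\; abb \;\Rightarrow\; abbb \;\Rightarrow\; \cdots, \qquad X^{(m)} = a\,b^m .$$

In the notation of the theorem this is a growth period of length $r = 1$ starting at $n = 0$, with $Y = a$, $S$ empty, and $R = b$, so that $X^{(n+mr)} = YR^mS = ab^m$ for all $m \ge 0$. The word never repeats, which is why a test for exact repetition alone cannot detect infinite rewritability.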

Let $\mathcal{P}$ be an arbitrary set of "constructive entities" (i.e., a set whose elements can be used as inputs for algorithms) and let $C(Y, p)$ be a condition imposed on words $Y \in \mathcal{A}^+$ with additional parameters in $\mathcal{P}$. Formally we may assume that $C$ is a subset of $\mathcal{A}^+ \times \mathcal{P}$ and, for all $Y \in \mathcal{A}^+$, $p \in \mathcal{P}$, the expression $C(Y, p)$ means the containment $(Y, p) \in C$.

Given $C(Y, p)$ as above, introduce the condition $C'(Y, R, S, p)$ for $Y, R \in \mathcal{A}^+$, $S \in \mathcal{A}^*$, $p \in \mathcal{P}$ as follows:

$C'(Y, R, S, p)$ if and only if there is an $m \ge 1$ such that $C(YR^mS, p)$.

Say that $C(Y, p)$ is cyclically decidable if the following two conditions hold:

(a) there is an algorithm verifying $C(Y, p)$ for $Y \in \mathcal{A}^+$, $p \in \mathcal{P}$;

(b) there is an algorithm verifying $C'(Y, R, S, p)$ for $Y, R \in \mathcal{A}^+$, $S \in \mathcal{A}^*$, $p \in \mathcal{P}$.

Note that (a) does not in general imply (b), which fact can be derived from the existence of a recursive set $C \subseteq \mathbb{N}^2$ such that the set $\{n \in \mathbb{N} : (\exists\, m \in \mathbb{N})\ (m, n) \in C\}$ is not recursive (see, for instance, [2, Chapter C.1, § 6]).

Let $C(Y, p)$ be a cyclically decidable condition. Theorem 22 justifies the following simple algorithm which, given a system, a word $X \in \mathcal{A}^+$, and a parameter $p$, verifies the existence of a word $Y \in \mathcal{A}^+$ such that $X \overset{*}{\Rightarrow} Y$ and $C(Y, p)$.

Algorithm 23. [Is there a word $Y$ such that $X \overset{*}{\Rightarrow} Y$ and $C(Y, p)$?]

• If $C(X, p)$, return Yes. If $X$ is not rewritable, return No.

• Otherwise, subsequently calculate the rewrites $X^{(1)}, \dots, X^{(i)}, \dots$ and, at each step $i \ge 1$, subsequently analyze the fragments $(X^{(n)}, X^{(n+1)}, \dots, X^{(n+r)})$ for $0 \le n < n + r = i$, as follows:

◦ If $C(X^{(n+r)}, p)$, return Yes. If $X^{(n)} = X^{(n+r)}$, return No.

◦ If $(X^{(n)}, \dots, X^{(n+r)})$ satisfies condition (b) of Theorem 22, put $Y := X^{(n)}{\restriction}_\ell$; let $S \in \mathcal{A}^*$ be such that $X^{(n)} = YS$; let $R \in \mathcal{A}^+$ be such that $Y^{(r)} = YR$ (such an $R$ exists by Theorem 22); if $C'(Y, R, S, p)$, return Yes; otherwise return No.

◦ If $X^{(n+r)}$ is not rewritable, return No. Otherwise proceed to the next step, $i + 1$.

By Theorem 22, the above procedure terminates for every input.

As is easily seen, given a system $\mathcal{S}$, the condition $C(Y, \mathcal{S})$ = "$Y$ is not rewritable within $\mathcal{S}$" is cyclically decidable. Therefore, specialized with this condition, Algorithm 23 verifies finite rewritability of a given word within a given system. A simplified version is presented below.

Algorithm 24. [Is a word $X$ finitely rewritable?] Start subsequent calculation of the rewrites $X^{(0)}, X^{(1)}, \dots$. If a nonrewritable word $X^{(i)}$ occurs, return Yes. Otherwise, according to Theorem 22, a fragment $(X^{(n)}, X^{(n+1)}, \dots, X^{(n+r)})$ will occur which satisfies (a) or (b) of Theorem 22; in this case return No.
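A minimal Python sketch of this procedure is given below. It rests on assumptions made for the example: words are plain strings, the explicit rules are a nonempty dictionary `rules`, and the names `rewrite` and `is_finitely_rewritable` are hypothetical. The growth test used for case (b) is the following: with `ell` the maximal length of an explicit word, an earlier rewrite `old` of length at least `ell` shares its first `ell` letters with the newest rewrite, and no rewrite computed after `old` is shorter than `old`. Under these conditions only the first `ell` letters ever influence the steps in between, so the same steps repeat forever.

```python
def rewrite(word, rules):
    """Longest-prefix rewrite of `word`, or None if no prefix is explicit."""
    longest = max(len(lhs) for lhs in rules)
    for k in range(min(len(word), longest), 0, -1):
        if word[:k] in rules:
            return rules[word[:k]] + word[k:]
    return None


def is_finitely_rewritable(word, rules):
    """Sketch of Algorithm 24: True iff the rewriting sequence of `word` is finite."""
    ell = max(len(lhs) for lhs in rules)  # length of the longest explicit word
    seq = [word]
    while True:
        nxt = rewrite(seq[-1], rules)
        if nxt is None:
            return True  # a nonrewritable word occurred
        for n, old in enumerate(seq):
            if old == nxt:
                return False  # case (a): exact repetition
            grows = (
                len(old) >= ell
                and old[:ell] == nxt[:ell]
                and len(nxt) >= len(old)
                and all(len(w) >= len(old) for w in seq[n + 1:])
            )
            if grows:
                return False  # case (b): a growth period was found
        seq.append(nxt)


print(is_finitely_rewritable("mss", {"m": "ω", "ms": "m"}))  # prints True
print(is_finitely_rewritable("a", {"a": "ab"}))              # prints False
```

The quadratic scan over earlier rewrites keeps the sketch close to the wording of the algorithm; it is not tuned for efficiency.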

Since the set E of explicit words is finite, Theorem 7 implies that (Fin) is effectively verifiable: it suffices to apply Algorithm 24 to all explicit words.

The specialization of Algorithm 23 with the condition $C(Y)$ = "$Y = \omega$" checks if a given word is an object. (This can also be verified by a slight modification of Algorithm 24.) It is now clear that (Obj) and (PreObj) are effectively verifiable.

Note that if (Fin) holds, the containment $X \in \mathcal{O}$ can be trivially verified: just check if the (finite) rewriting sequence of $X$ ends with $\omega$. In addition, if (Obj) holds, we can stop calculating the rewriting sequence of $X$ if $X^{(n)} \in \mathcal{E}$ at some step $n \ge 0$; furthermore, if both (Obj) and (PreObj) hold, the calculation can be terminated if some $X^{(n)}$ becomes a prefix of some explicit word (see Proposition 9).

As is easily seen, the condition $C(Y, X)$ = "$Y \sqsupseteq X$" is cyclically decidable. Therefore a properly specialized version of Algorithm 23 checks if a given word $X$ is recurrent. By Theorem 5, we conclude that (Rec) is effectively verifiable: to check if all words are not recurrent, it suffices to apply the algorithm to all words over $\mathcal{A}_{\mathcal{E}}$ of length at most $\ell$. (However, the resulting verification takes exponential time; see Remark 6.)

Another approach to verifying (Rec) can be based on Theorem 8 which states that (Rec) implies (Fin). Check (Fin) first. If it fails then (Rec) also fails. If (Fin) holds true, condition (Rec) can be verified by processing all words $X$ over $\mathcal{A}_{\mathcal{E}}$ of length at most $\ell$ and returning Yes whenever an $X$ occurs such that $X \sqsubseteq X^{(n)}$ for some $n \ge 1$.

By Theorem 3, condition (Con) can be effectively verified by constructing the conceptual scheme and checking (during the construction) if the scheme is acyclic. The algorithm described below checks if a given system is conceptually consistent. If it is so, the algorithm returns the conceptual scheme of the system; otherwise it returns an example of a cycle in the conceptual scheme. The algorithm uses a variable directed graph $\Gamma = (\Gamma_N, \Gamma_A)$ (with $\Gamma_N$ the nodes and $\Gamma_A$ the arcs) and a variable $\Delta$ whose values are finite subsets of $\mathcal{A}^+ \times \mathcal{A}^+$.

Algorithm 25. [Is a system conceptually consistent?]

(1) Put $\Gamma_N := \mathcal{E}$, $\Gamma_A := \varnothing$.

(2) Put $\Delta := \{(X, Y) : X \text{ is a sink of } \Gamma \text{ and } X \succ Y\}$.

(3) If $\Delta = \varnothing$, claim that the system is conceptually consistent and return $\Gamma$ as the conceptual scheme.

(4) Otherwise do the following for each pair $(X, Y) \in \Delta$: if $X = Y$ or $\Gamma$ contains a path from $Y$ to $X$, claim that the system is conceptually inconsistent and return a cycle: $X \succ X$ or $Y \succ \cdots \succ X \succ Y$; otherwise put $\Gamma_N := \Gamma_N \cup \{Y\}$, $\Gamma_A := \Gamma_A \cup \{(X, Y)\}$.

(5) Go to (2).
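The five steps above can be sketched in Python as follows. This is an illustration under assumptions of this example, not a reference implementation: words are strings, `rules` is a dictionary of explicit rules, one-step dependence is taken to mean "is a proper nonempty prefix or a weak rewrite," and the function names are hypothetical. For brevity the sketch reports inconsistency by returning `None` instead of returning the cycle itself.

```python
def successors(word, rules):
    """All words Y on which `word` depends in one step:
    proper nonempty prefixes of `word` and its weak rewrites."""
    out = {word[:k] for k in range(1, len(word))}
    for k in range(1, len(word) + 1):
        if word[:k] in rules:  # any explicit prefix may be rewritten
            out.add(rules[word[:k]] + word[k:])
    return out


def has_path(arcs, src, dst):
    """Depth-first search for a path from `src` to `dst` along `arcs`."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        if node == dst:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(arcs.get(node, ()))
    return False


def conceptual_scheme(rules):
    """Return (nodes, arcs) if no cycle is met, and None otherwise."""
    nodes, arcs = set(rules), {}              # step (1)
    while True:
        delta = [(x, y)                       # step (2): pairs leaving sinks
                 for x in sorted(nodes) if not arcs.get(x)
                 for y in sorted(successors(x, rules))]
        if not delta:                         # step (3)
            return nodes, arcs
        for x, y in delta:                    # step (4)
            if x == y or has_path(arcs, y, x):
                return None                   # a cycle through x was found
            nodes.add(y)
            arcs.setdefault(x, set()).add(y)
        # step (5): repeat from step (2)


print(conceptual_scheme({"ms": "m"}) is not None)   # prints True
print(conceptual_scheme({"m": "ms"}) is None)       # prints True
```

The two calls correspond to the rules "man.son is man" (consistent) and "man is man.son" (inconsistent) discussed in the Introduction.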

When applied to a conceptually inconsistent system, Algorithm 25 returns an example of a cycle $X_0 \succ X_1 \succ \cdots \succ X_m = X_0$ in the conceptual scheme, but it is not guaranteed that all words $X_j$ in the cycle are objects (even in case $X_0$ is an object). On the other hand, Theorem 19 implies that, with (Obj), (PreObj), and (CoInh) satisfied, every path $X_0 \succ X_1 \succ \cdots \succ X_m$ between objects $X_0, X_m$ can be transformed into a path of objects $Y_0 \succ Y_1 \succ \cdots \succ Y_k$, with $Y_0 = X_0$ and $Y_k = X_m$. We note that such a transformation can be performed effectively.

By slightly modifying Algorithm 25, we can obtain a procedure for constructing the weak rewriting scheme rather than the conceptual scheme. The algorithm presented below checks whether an arbitrary system has weakly recurrent words, and either returns an example of such a word or constructs the weak rewriting scheme of the system.

Algorithm 26. [Is there a weakly recurrent word?]

(1) Put $\Gamma_N := \mathcal{E}$, $\Gamma_A := \emptyset$.

(2) Put $\Delta := \{\langle X, Y \rangle : X \text{ is a sink of } \Gamma \text{ and } X \to_w Y\}$.

(3) If $\Delta = \emptyset$, claim that there are no weakly recurrent words and return $\Gamma$ as the weak rewriting scheme.

(4) Otherwise do the following for each pair $\langle X, Y \rangle \in \Delta$:

if $X \sqsubseteq Y$ or $\Gamma$ contains a path from a prefix $Y_0 \sqsubseteq Y$ to $X$, return a weakly recurrent word: $X \to_w Y \sqsupseteq X$ or $Y_0 \to_w \dots \to_w X \to_w Y \sqsupseteq Y_0$;

otherwise put $\Gamma_N := \Gamma_N \cup \{Y\}$, $\Gamma_A := \Gamma_A \cup \{\langle X, Y \rangle\}$.

(5) Go to (2).

According to Theorem 3, the weak rewriting scheme of a conceptually inconsistent system includes a path $X_0 \to_w^{+} X_n \sqsupseteq X_0$ with $X_0 \in \mathcal{E}$. Given an inconsistent system, Algorithm 26 returns an example of a path $Y_0 \to_w^{+} Y_n \sqsupseteq Y_0$, but the leading word $Y_0$ need not be explicit. In this connection it is worth noting that every path of the form $Y_0 \to_w^{+} Y_n \sqsupseteq Y_0$ can be effectively transformed into a path $X_0 \to_w^{+} X_n \sqsupseteq X_0$ (of the same length) with $X_0 \in \mathcal{E}$.

In the author's opinion, the most convincing indication of conceptual inconsistency is an example of a cycle $X_0 \to \dots \to X_n = X_0$ that consists of objects and starts with an explicit word $X_0$. We note that, if an inconsistent system satisfies (Obj), (PreObj), and (CoInh), then a cycle of this kind can be found effectively.

We now turn to the effective analysis of object types.

Say that $C \in \mathcal{O}$ is a character if either $C = \omega$ or $C \sqsubseteq E$ for some $E \in \mathcal{E}$. Let $\mathcal{C}$ be the set of all characters. It is clear that $\mathcal{C}$ is finite.

Given an object $X$, consider the rewriting sequence $X(0), \dots, X(n) = \omega$ and denote by $\mathrm{ch}(X)$ the first character in the sequence $(X(0), \dots, X(n))$. Call $\mathrm{ch}(X)$ the character of $X$.

It is clear that $\mathrm{ch}(C) = C$ for all $C \in \mathcal{C}$. Therefore, $\mathcal{C} = \{\mathrm{ch}(X) : X \in \mathcal{O}\}$.

Theorem 27. For every object $X$ we have $\|X\| = \|\mathrm{ch}(X)\|$. Consequently, $\{\|X\| : X \in \mathcal{O}\} = \{\|C\| : C \in \mathcal{C}\}$, and the set $\{\|X\| : X \in \mathcal{O}\}$ is finite.

By the type comparison problem we mean the following: given two objects $X$ and $Y$, effectively compare the types of $X$ and $Y$, i.e., determine which of the relations $\|X\| = \|Y\|$, $\|X\| \subset \|Y\|$, $\|X\| \supset \|Y\|$ hold.

It is clear that the character of an object can be effectively determined. Therefore, due to Theorem 27, the type comparison problem reduces to effective comparison of the character types.

Given $X \in \mathcal{C}$ and $\alpha \in \|X\|_1$, introduce the notation $X_\alpha := \mathrm{ch}(X\alpha)$ and, given an arbitrary function $t : \mathcal{C} \to \{1, \dots, N\}$, define the following equivalence relation on $\mathcal{C}$:

$X \sim Y$ if and only if $\|X\|_1 = \|Y\|_1$ and $t(X_\alpha) = t(Y_\alpha)$ for all $\alpha \in \|X\|_1$.

The following algorithm uses two variable functions $t, f : \mathcal{C} \to \{1, 2, \dots\}$ (each of which can be encoded as an array of naturals indexed by the characters).

Algorithm 28. [Compute a character typing.]

(1) Define $t$ by putting $t(X) := 1$ for all $X \in \mathcal{C}$.

(2) If $X \sim Y$ for all $X, Y \in \mathcal{C}$ such that $t(X) = t(Y)$, return $t$.

(3) Assign $f$ a copy of $t$.

(4) For each $k \in \{t(X) : (\exists\, Y \in \mathcal{C})\ (X \not\sim Y \ \&\ t(X) = t(Y))\}$ do the following:

arbitrarily enumerate the set $\{X \in \mathcal{C} : t(X) = k\}$, thus making it a sequence $(X_1, \dots, X_n)$, $n \ge 2$, and, for each $i = 2, \dots, n$, do the following:

if $X_i \sim X_j$ for some $1 \le j < i$, reassign $f(X_i) := f(X_j)$;

otherwise reassign $f(X_i) := \max\{f(X) : X \in \mathcal{C}\} + 1$.

(5) Assign t := f and go to (2).
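The refinement loop of Algorithm 28 can be sketched in Python as follows. The sketch assumes a hypothetical encoding that is not part of the paper: `succ` is a dictionary that maps every character to a dictionary sending each first letter of its type to the corresponding successor character, and every successor is itself a key of `succ`. The names `equivalent` and `character_typing` are also hypothetical.

```python
def equivalent(x, y, succ, t):
    """The relation X ~ Y: equal sets of first letters and equally
    numbered successors with respect to the current function t."""
    return (succ[x].keys() == succ[y].keys()
            and all(t[succ[x][a]] == t[succ[y][a]] for a in succ[x]))


def character_typing(succ):
    """Refine the constant numbering until each class is ~-homogeneous.

    Assumption: succ[X][a] encodes the successor character of X by letter a.
    """
    chars = sorted(succ)
    t = {x: 1 for x in chars}                          # step (1)
    while True:
        classes = {}
        for x in chars:
            classes.setdefault(t[x], []).append(x)
        # classes containing a non-equivalent pair; comparing with the
        # first member suffices because ~ is an equivalence relation
        split = [k for k, ms in classes.items()
                 if any(not equivalent(ms[0], y, succ, t) for y in ms[1:])]
        if not split:                                  # step (2)
            return t
        f = dict(t)                                    # step (3)
        for k in split:                                # step (4)
            members = classes[k]
            for i in range(1, len(members)):
                xi = members[i]
                for xj in members[:i]:
                    if equivalent(xi, xj, succ, t):
                        f[xi] = f[xj]
                        break
                else:
                    f[xi] = max(f.values()) + 1
        t = f                                          # step (5)
```

Note that equivalence is always tested against the old function `t`, while new numbers are written to the copy `f`, exactly as steps (3) to (5) prescribe.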

Theorem 29. Algorithm 28 halts for any system and, if the system meets (PreObj), the resultant function $t : \mathcal{C} \to \{1, \dots, N\}$ is such that $t(X) = t(Y)$ if and only if $\|X\| = \|Y\|$.

Remark 30. Algorithm 28 is analogous to Vizing's algorithm for partitioning the vertex set of a graph into classes of similar vertices (see [3]). The author is grateful to S. V. Avgustinovich for discovering the analogy.

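The following algorithm uses two variable sets $\Sigma, \Sigma_0 \subset \mathcal{C}^2$.

Algorithm 31. [Compute the character subtyping.]

(1) Put $\Sigma := \mathcal{C}^2$.

(2) Put $\Sigma_0 := \{\langle X, Y \rangle \in \Sigma : \|X\|_1 \subset \|Y\|_1 \ \&\ (\forall \alpha \in \|X\|_1)\ \langle X_\alpha, Y_\alpha \rangle \in \Sigma\}$.

(3) If $\Sigma_0 = \Sigma$, return $\Sigma$. Otherwise reassign $\Sigma := \Sigma_0$ and go to (2).

Theorem 32. Given any system, Algorithm 31 always halts and, if the system meets (PreObj), the resultant set $\Sigma$ equals $\{\langle X, Y \rangle \in \mathcal{C}^2 : \|X\| \subset \|Y\|\}$.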
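Algorithm 31 is a shrinking fixed-point iteration, which can be sketched in Python as follows. As an assumption made only for this illustration, the characters are encoded by a hypothetical dictionary `succ` that maps every character to a dictionary sending each first letter of its type to the corresponding successor character; the function name `character_subtyping` is hypothetical as well.

```python
def character_subtyping(succ):
    """Start from all pairs of characters and discard pairs until stable.

    Assumption: succ[X][a] encodes the successor character of X by letter a,
    and every successor is itself a key of succ.
    """
    chars = list(succ)
    sigma = {(x, y) for x in chars for y in chars}           # step (1)
    while True:
        sigma0 = {                                           # step (2)
            (x, y) for (x, y) in sigma
            if succ[x].keys() <= succ[y].keys()
            and all((succ[x][a], succ[y][a]) in sigma for a in succ[x])
        }
        if sigma0 == sigma:                                  # step (3)
            return sigma
        sigma = sigma0
```

The loop halts because the set of pairs is finite and strictly decreases at every iteration that does not return.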

Therefore, given a system subject to (PreObj), we can effectively verify the relation $\|X\| \subset \|Y\|$ for $X, Y \in \mathcal{O}$. (As is easily seen, this implies that (CoInh) and (CoVal) are effectively verifiable.) However, the mere claim "$\|X\| \not\subset \|Y\|$" is not always sufficient, and one may require a particular reason why the inclusion fails. In a large system, finding a particular detail $\delta \in \|X\| \setminus \|Y\|$ may be nontrivial, and a corresponding algorithm would thus be a useful troubleshooting tool. Recall that, due to Theorem 27, it suffices to automate the solution for characters only.

The following algorithm uses a variable set $\Delta \subset \mathcal{C}^2$ and a variable function $\delta : \Delta \to A^+$.

Algorithm 33. [Compute a diagnosis for subtyping failure.]

(1) Put $\Delta := \emptyset$ and $n := 1$.

(2) For all pairwise distinct $X, Y \in \mathcal{C}$ do the following:

if there is an $\alpha \in \|X\|_1 \setminus \|Y\|_1$, add $\langle X, Y \rangle$ to $\Delta$ and assign $\delta(X, Y) := \alpha$.

(3) For all pairwise distinct $X, Y \in \mathcal{C}$ such that $\langle X, Y \rangle \notin \Delta$ do the following:

if there is an $\alpha \in \|X\|_1$ such that $\langle X_\alpha, Y_\alpha \rangle \in \Delta$ and $|\delta(X_\alpha, Y_\alpha)| = n$, add $\langle X, Y \rangle$ to $\Delta$ and assign $\delta(X, Y) := \alpha\,\delta(X_\alpha, Y_\alpha)$.

(4) If there were no assignments at step (3), return $\delta$. Otherwise put $n := n + 1$ and go to (3).

Theorem 34. Given any system, Algorithm 33 always halts and, if the system meets (PreObj), the resultant function $\delta : \Delta \to A^+$ is such that

$\Delta = \{\langle X, Y \rangle \in \mathcal{C}^2 : \|X\| \not\subset \|Y\|\}$,

$\delta(X, Y) \in \|X\| \setminus \|Y\|$ for all $\langle X, Y \rangle \in \Delta$,

$|\delta(X, Y)| = \min\{|\delta'| : \delta' \in \|X\| \setminus \|Y\|\}$ for all $\langle X, Y \rangle \in \Delta$.
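The layered search of Algorithm 33 can be sketched in Python as follows. The sketch assumes a hypothetical encoding: `succ` maps every character to a dictionary sending each first letter of its type to the corresponding successor character, and letters are single-character strings, so that `len` gives the length of a word. The name `subtyping_diagnosis` is hypothetical too.

```python
def subtyping_diagnosis(succ):
    """Return a dict sending each pair (X, Y) with a failing inclusion
    to a shortest word found by the layered search of steps (1)-(4).

    Assumptions: succ[X][a] encodes the successor character of X by
    letter a, and letters are one-character strings.
    """
    chars = sorted(succ)
    pairs = [(x, y) for x in chars for y in chars if x != y]
    delta = {}                                         # step (1)
    for x, y in pairs:                                 # step (2)
        extra = sorted(succ[x].keys() - succ[y].keys())
        if extra:
            delta[(x, y)] = extra[0]
    n = 1
    while True:
        new = {}
        for x, y in pairs:                             # step (3)
            if (x, y) in delta:
                continue
            # here every first letter of X is a first letter of Y,
            # so succ[y][a] is defined
            for a in sorted(succ[x]):
                w = delta.get((succ[x][a], succ[y][a]))
                if w is not None and len(w) == n:
                    new[(x, y)] = a + w
                    break
        if not new:                                    # step (4)
            return delta
        delta.update(new)
        n += 1
```

Collecting the assignments of one pass in `new` before merging them does not change the result, since a word assigned during pass `n` has length `n + 1` and therefore cannot satisfy the length test of the same pass.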

Acknowledgments. The author is grateful to Sergei Vladimirovich Avgustinovich for fruitful discussions.

References

1. Salomaa A. Formal Languages. N. Y.: Academic Press, 1973. 336 p.

2. Barwise J. (ed.) Handbook of Mathematical Logic. Amsterdam: North-Holland, 1977. 1165 p.

3. Vizing V. G. Distributive Coloring of Graph Vertices // Diskretn. Anal. Issled. Oper. 1995. Vol. 2, № 4. P. 3-12.

Received October 17, 2013.

Gutman Alexander Efimovich

Sobolev Institute of Mathematics,

Head of the Laboratory of Functional Analysis

Acad. Koptyug av. 4, 630090, Novosibirsk, Russia;

Novosibirsk State University, Professor

Pirogova 2, 630090, Novosibirsk, Russia

E-mail: gutman@math.nsc.ru

OBJECT-ORIENTED DATA AS REWRITING SYSTEMS

Gutman A. E.

We consider rewriting systems that contain no pairs of rules of the form $X \to Y$, $X \to Z$ with $Y \ne Z$ and in which only the longest prefixes are subject to rewriting. Within such systems, analogs are defined and examined of concepts characteristic of object-oriented data systems: inheritance of classes and objects, instances of classes, instance and class attributes, conceptual dependence and consistency, conceptual schemes, types, subtypes, etc. Special attention is paid to the effective verification of various properties of the rewriting systems under consideration. In particular, algorithms are presented for answering the following questions: Are all words finitely rewritable? Do there exist recurrent words? Is the system conceptually consistent? Does a given word $X$ conceptually depend on a word $Y$? Do the types of $X$ and $Y$ coincide? Is the type of $X$ a subtype of the type of $Y$?

Key words: prefix rewriting system, semi-Thue system, object-oriented data system, information system, consistency verification, ontology of a data model.
