On termination of transactions over Semantic document models

Mantsivoda Andrei; Ponomaryov Denis

ALGEBRAIC AND LOGICAL METHODS IN COMPUTER SCIENCE AND ARTIFICIAL INTELLIGENCE

Series «Mathematics» 2020. Vol. 31. P. 111-131

Online access to the journal: http://mathizv.isu.ru

The Bulletin of Irkutsk State University

UDC 510.62:004.82 MSC 68T27, 68N19

DOI https://doi.org/10.26516/1997-7670.2020.31.111

On Termination of Transactions over Semantic Document Models *

A. V. Mantsivoda1,2, D. K. Ponomaryov2,3,4

1 Irkutsk State University, Irkutsk, Russian Federation

2 Sobolev Institute of Mathematics SB RAS, Novosibirsk, Russian Federation

3 Ershov Institute of Informatics Systems SB RAS, Novosibirsk, Russian Federation

4 Novosibirsk State University, Novosibirsk, Russian Federation

Abstract. We consider the framework of Document Modeling, which lays the formal basis for representing the document lifecycle in Business Process Management systems. We formulate document models in the scope of the logic-based Semantic Modeling language and study the question whether transactions given by a document model terminate on any input. We show that in general this problem is undecidable and formulate sufficient conditions, which guarantee decidability and tractability of computing effects of transactions.

Keywords: Semantic Modeling, document model, transaction, chase.

1. Introduction

In [10] a Document Modeling approach has been proposed as a fundamental basis for document processing in Business Process Management Systems (BPMS). Importantly, within this approach basic entities and primitives have been identified, which are common to BPMS such as Enterprise Resource Planning Systems, Customer Relationship Management Systems, etc. The approach rests on the natural idea that document life-cycle lies at the core of these systems. Typically, there is a static part, which describes the forms and statuses of documents (i.e., a schema), and a dynamic part, which describes changes in documents (i.e., transactions over them). In contrast to conventional architectures of BPMS, the approach of the Document Modeling shows that both parts can be given in a fully declarative fashion, thus making programming unnecessary. It suffices to describe the static part of a document model by giving a specification to document forms and fields, and to describe the dynamic part by defining transactions, their conditions, and effects. Then, given an initial state of a document model (a collection of documents), the natural problem is to compute a state (an updated collection of documents), which results from the execution of a sequence of transactions. It is argued within the Document Modeling approach that this problem can be solved with the tools of formal logic such as automated inference or model checking.

* The research was supported by the Russian Science Foundation (Grant No. 17-1101176)

In [11], the ideas of the Document Modeling have been implemented in a logical framework in terms of the language of the Semantic Programming (also known as Semantic Modeling) [1]. It has been shown that the approach of the Document Modeling implemented this way goes beyond the common capabilities of today's Business Process Management Systems. In particular, it allows for checking document models for consistency and solving important problems like projection (e.g., what documents will be created after an accountant performs certain actions) and planning (e.g., what actions must be made in order to get an item on stock). The method follows the same line with some of the well-known approaches like Situation Calculus [13] and similar formalisms, but it addresses the topic of Business Process Management, which is a novel area of application for logic-based formalisms.

Obviously, an important question is how hard the above mentioned problems are from the computational point of view. In this respect, the key problem is computing effects of transactions over a document model. Transactions can be fired due to an input of an oracle (a user or an algorithm, which provides some input to a document model), which in turn, can cause other transactions to fire, and so on. Thus potentially, this can result in an infinite chain of updates of a document model, under which a finite resulting state is never obtained. We consider this problem in the paper and formulate a number of complexity results, which demonstrate the expressiveness of document models.

The contributions of this work are as follows. We refine the formalization of the Document Modeling given in [11] and provide a more succinct formalization in an extension of the language of the Semantic Modeling with (non-standard) looping terms. We formulate the problem of transaction termination over document models and show that in general it is undecidable. Then we describe a sufficient condition, which guarantees decidability. For this we introduce a formal definition of a locally simple document theory (the notion previously discussed in [9]) and we show that over any such theory transaction termination is decidable. Then we estimate the complexity of computing effects of transactions and identify a case when they are polynomially bounded.

2. Preliminaries

Document Modeling follows the idea of declarative representation of documents and transactions over them. A document model consists of a description of fields, which can appear in documents (their cardinality and default values), a definition of document forms (given as collections of fields), and a definition of so called daemons, which specify conditions and effects of transactions and field triggers. Transactions can be fired on an input of a user or an external procedure (e.g., a Machine Learning algorithm like in [14]), or they can be fired by other transactions. Field triggers can be viewed as a special kind of transactions, but they can fire only in the event of changing a value of some document field.

The formalism of the Document Modeling includes at least three ingredients that can influence the complexity of computation. The first one is the set of operators over field values. In real-world applications of the Document Modeling, the language is restricted to basic arithmetic operations (like summation, subtraction, etc.), which can be computed efficiently. For this reason, we do not consider the whole variety of operators over field values in the paper. We describe only basic operations and examples of their implementation in order to show that they make no contribution to the complexity of computing effects of transactions. The second ingredient is the query language used in the Document Modeling to describe collections of documents, which have certain properties. Transactions can refer to document collections given by queries and hence, the complexity of the query language influences the complexity of computing effects of transactions. We leave this effect out of the scope of this paper and focus on the complexity of transactions caused solely by their relationships to each other. For this, we adopt a simple query language implemented by predefined document filters, which can be used in the definition of transactions and are computationally simple. In the remaining part of this section, we introduce basics of the Semantic Modeling and conventions used in this paper. We refer an interested reader to [1]-[5] for details on the Semantic Modeling.

2.1. Basics of the Semantic Modeling

The language of the Semantic Modeling is a first-order language with sorts 'urelement' and 'list', in which only bounded quantification of the following form is allowed:

— a restriction onto the list elements $\forall x \in t$ and $\exists x \in t$;

— a restriction onto the initial segments of lists $\forall x \sqsubseteq t$ and $\exists x \sqsubseteq t$,

where $t$ is a list term. A list term is defined inductively via constant lists, variables of sort 'list', and list functions given below. A constant list (which can be nested) is built over constants of sort 'urelement' and a constant $\langle\rangle$ of sort 'list', which represents the empty list. The list functions are:

head - the last element of a non-empty list and $\langle\rangle$, otherwise;

tail - the list without the last element, for a non-empty list, and $\langle\rangle$, otherwise;

cons - the list obtained by adding a new last element to a list;

conc - concatenation of two lists.

Terms of sort 'urelement' are standard first-order terms. The predicates are allowed to appear in $\Delta_0$-formulas without any restrictions, i.e., they can be used in bounded quantifiers and atomic formulas.

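Formulas in the language above are interpreted over hereditarily finite list superstructures $HW(\mathcal{M})$, where $\mathcal{M}$ is a structure. Urelements are interpreted as distinct elements of the domain of $\mathcal{M}$ and lists are interpreted as lists over urelements and the distinguished 'empty list' $\langle\rangle$. In particular, the following equations hold in every $HW(\mathcal{M})$ (the free variables below are assumed to be universally quantified):

$$
\begin{aligned}
&\neg\exists x\; x \in \langle\rangle \\
&\mathrm{cons}(x, y) = \mathrm{cons}(x', y') \to x = x' \wedge y = y' \\
&\mathrm{tail}(\mathrm{cons}(x, y)) = x, \quad \mathrm{head}(\mathrm{cons}(x, y)) = y \\
&\mathrm{tail}(\langle\rangle) = \langle\rangle, \quad \mathrm{head}(\langle\rangle) = \langle\rangle \\
&\mathrm{conc}(\langle\rangle, x) = \mathrm{conc}(x, \langle\rangle) = x \\
&\mathrm{cons}(\mathrm{conc}(x, y), z) = \mathrm{conc}(x, \mathrm{cons}(y, z)) \\
&\mathrm{conc}(\mathrm{conc}(x, y), z) = \mathrm{conc}(x, \mathrm{conc}(y, z))
\end{aligned}
$$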

It was shown in [12] that for any appropriate structure $\mathcal{M}$, there exists a representation of its superstructure of finite lists $HW(\mathcal{M})$, in which the value of any variable-free list term $t$ can be computed in time polynomial in the size of $t$ (given as a string). Throughout the text, we omit subtleties related to the representation of hereditarily finite structures and we assume that for any variable-free list term $t$ one can compute a constant list $t'$ in time polynomial in the size of $t$ such that $HW(\mathcal{M}) \models t = t'$, for any structure $HW(\mathcal{M})$ under consideration. For list terms $t_1, \ldots, t_n$, $n \geq 1$, we will use $\langle t_1, \ldots, t_n \rangle$ as a shortcut for the term $\mathrm{cons}(\ldots\mathrm{cons}(\mathrm{cons}(\langle\rangle, t_1), t_2) \ldots, t_n)$. For a list $s$, the notation $|s|$ stands for the number of elements in $s$.
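The list functions and the equations above can be tried out in a small Python sketch. It is only an illustration under our own assumptions: hereditarily finite lists are modelled as nested tuples, urelements as strings, and the helper name `mk` (for the shortcut $\langle t_1, \ldots, t_n \rangle$) is hypothetical.

```python
# Illustrative model only: lists are nested Python tuples, urelements are strings.
EMPTY = ()

def head(lst):
    """Last element of a non-empty list, the empty list otherwise."""
    return lst[-1] if lst else EMPTY

def tail(lst):
    """The list without its last element, the empty list otherwise."""
    return lst[:-1] if lst else EMPTY

def cons(lst, x):
    """The list obtained by adding x as a new last element."""
    return lst + (x,)

def conc(xs, ys):
    """Concatenation of two lists."""
    return xs + ys

def mk(*items):
    """Hypothetical helper for <t1, ..., tn>, built by repeated cons."""
    result = EMPTY
    for item in items:
        result = cons(result, item)
    return result
```

With this encoding, `head(mk("a", "b"))` is `"b"`, and identities such as `cons(conc(x, y), z) == conc(x, cons(y, z))` can be checked directly on sample tuples.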

In [6-8], the basic language of the Semantic Modeling was extended with non-standard list terms, which represent conditional operators (they correspond to the common 'if-then-else' or 'switch' constructs of programming languages), bounded list search, and bounded recursion (similar to the restricted 'while' operator). We refer to the obtained language as $\mathcal{L}$. The non-standard terms in $\mathcal{L}$ are called Cond-, bSearch- and Rec-terms, respectively, and are defined as follows. By default any standard term in the language of the Semantic Modeling is an $\mathcal{L}$-term and any formula of the language of the Semantic Modeling is an $\mathcal{L}$-formula.

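If $t(\bar v)$ and $\theta(\bar v, x)$ are an $\mathcal{L}$-term of sort list and an $\mathcal{L}$-formula, respectively, then the expression $\mathrm{bSearch}(\theta, t)(\bar v)$ is a bSearch-term. It is equal to the last element $a$ of $t(\bar v)$ such that $\theta(\bar v, a)$ holds and it is equal to $t(\bar v)$, otherwise (i.e., if there is no such $a$).

If $\theta_1, \ldots, \theta_n$ are $\mathcal{L}$-formulas and $q_1, \ldots, q_{n+1}$ are $\mathcal{L}$-terms, where $n \geq 0$, then the term $\mathrm{Cond}[\theta_1, q_1][\theta_2, q_2] \ldots [\theta_n, q_n][q_{n+1}](\bar v)$ is a Cond-term with the following interpretation:

$$
\mathrm{Cond}(\bar v) =
\begin{cases}
q_1(\bar v) & \text{if } \theta_1(\bar v) \\
q_2(\bar v) & \text{if } \theta_2(\bar v) \wedge \neg\theta_1(\bar v) \\
\ldots \\
q_n(\bar v) & \text{if } \theta_n(\bar v) \wedge \neg\theta_1(\bar v) \wedge \neg\theta_2(\bar v) \wedge \ldots \wedge \neg\theta_{n-1}(\bar v) \\
q_{n+1}(\bar v) & \text{if } \neg\theta_1(\bar v) \wedge \ldots \wedge \neg\theta_n(\bar v)
\end{cases}
$$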

Finally, if $f(\bar v)$, $h(\bar v, y, z)$ and $t(\bar v)$ are $\mathcal{L}$-terms of sort list then the expression $\mathrm{Rec}[f, h, t](\bar v)$ is a Rec-term and its value is given by $g(\bar v, t(\bar v))$ with the following definition:

— $g(\bar v, \langle\rangle) = f(\bar v)$;

— $g(\bar v, \mathrm{cons}(a, b)) = h(\bar v, g(\bar v, a), b)$, for any lists $a, b$ such that $\mathrm{cons}(a, b) \sqsubseteq t(\bar v)$.

In this paper, we refine the formalization of the Document Modeling from [11] in the language of the Semantic Modeling extended with the above mentioned non-standard list terms. In particular, we obtain a more succinct formalization in comparison with [11]. In Section 3, we will introduce document theories, which formalize the key ingredients of the Document Modeling approach, and in the next section we describe conventions used in our formalization.
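The three kinds of non-standard terms (bSearch, Cond, and Rec) can be illustrated by the following Python sketch. It assumes lists are modelled as tuples whose last element is the head, formulas are passed as Python predicates, and the parameters $\bar v$ are captured by closures; the function names `b_search`, `cond`, and `rec` are hypothetical and the sketch evaluates Cond branches eagerly.

```python
def b_search(theta, t):
    """Last element a of t with theta(a); the list t itself if there is none."""
    for a in reversed(t):
        if theta(a):
            return a
    return t

def cond(branches, default):
    """branches: sequence of (condition_value, term_value) pairs.
    Returns the value paired with the first true condition, else default."""
    for holds, value in branches:
        if holds:
            return value
    return default

def rec(f, h, t):
    """Bounded recursion over the initial segments of t:
    g(<>) = f and g(cons(a, b)) = h(g(a), b)."""
    g = f
    for b in t:          # elements are visited from the first to the last
        g = h(g, b)
    return g

# Example: reversing a list, each step puts the new element in front.
reversed_list = rec((), lambda g, b: (b,) + g, ("a", "b", "c"))
```

Since `rec` performs exactly one step per element of `t`, the number of iterations is bounded by the length of the list, which is the point of bounded recursion.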

2.2. Conventions in Formalization of Document Theories

We use the following notions and informal conventions:

— There are pairwise disjoint finite sets FieldNames, FormNames, FilterNames, and TransNames of constants of sort urelement, which provide document field, form, filter, and transaction names, respectively, which can be used in the axioms of a document theory.

— Natural numbers are modelled in a straightforward way as lists consisting of $n$ empty lists, for $n \geq 0$, and 0 is represented by the empty list $\langle\rangle$ (we also show how to model real numbers in a decimal representation with a given precision).

— An instruction is given as a list of the form $\langle formName, CreateDoc \rangle$ (in which case it is called CreateDoc-instruction) or $\langle value, fieldName, docID, SetField \rangle$ (a SetField-instruction), or $\langle params, docID, transName \rangle$ (a transaction), where $formName \in$ FormNames, $fieldName \in$ FieldNames, $transName \in$ TransNames, docID represents a natural number, and value, params are some lists, which specify a field value and transaction parameters, respectively.

— A queue is a list of instructions to be executed. A queue is updated by daemons, which implement actions on the events such as changing a field value in a given document or executing a transaction. Creating a new document triggers no events.

— A situation is a list of instructions, which represents the history of executed instructions. The last executed instruction appears first in a situation.

— A field is given as a list, with the head being an element of FieldNames and the tail being a list, which represents a value for a field. Every field has a default value it gets when a new document is created.

— A document is a list of fields (the order of fields in the list is arbitrary).

— A (document) model is a list consisting of tuples $\langle sit, form, doc, ID \rangle$, where ID corresponds to a natural number, doc is a document, $form \in$ FormNames, and sit is a situation. A model stores a version of each document in each situation which has ever taken place. The head of this list is a tuple, in which the situation is the current one, i.e., it consists of instructions (a history) that have given the model.

Situations represent contexts, in which documents are created or modified, and this information can be used in querying a document model. We note that this feature is irrelevant for the results in this paper, but we prefer to keep situations to comply with the original formalization of document models from [11]. A document theory consists of axioms, which specify document fields, forms, filters (i.e., the static structure of documents and query templates), and axioms for the dynamic part. The latter is given by so called daemons (similar to the notion used in process programming), which specify the instructions that must be executed whenever a certain event happens (i.e., whenever a value of a specific field in a document is changed or a certain transaction is fired). Although formally we distinguish between CreateDoc-, SetField-instructions and transactions, we make no terminological difference between them when talking about the transaction termination problem. The results on computing effects of transactions refer to the instructions of the form above as well.

3. Document Theories

We define a document theory $T$ as a theory in signature $\Sigma$, where $\Sigma$ consists of the list functions introduced in Section 2.1 and the predicate and function symbols introduced in the axioms below. In particular, $\Sigma$ contains pairwise disjoint finite subsets of constants FieldNames, FormNames, FilterNames, and TransNames, which specify field, form, filter, and transaction names, which can be used in the axioms of $T$. The set FormNames is supposed to be non-empty. Besides, $\Sigma$ contains distinguished constants CreateDoc, SetField, ExecTrans, which are used to represent instructions, and fault (analogous to an exception in programming languages). We formulate the axioms of $T$ in the language of the Semantic Modeling with non-standard terms. Initially, this language contains only two sorts: urelement and list. For convenience, we will assume that there is also a subsort Real of the sort list, which corresponds to (nonnegative) real numbers with a given precision (denoted further as prec). In the following subsection, we define the sort Real, together with the corresponding predicates and functions, and we show how basic arithmetic operations can be implemented via list terms. In general, there are many such implementations possible, so the next subsection is best viewed as a number of introductory examples to the language of the Semantic Modeling. The only important observation is that the proposed implementation is tractable, as stated by Lemma 2 in Section 3.2. Throughout the text we assume that all the free variables in formulas are universally quantified.

3.1. Numeric Terms and Predicates

Let us define $Nat(x) = \forall t \in x\; t = \langle\rangle$. For a natural number $n \in \omega$, denote by $\bar n$ the list consisting of $n$ empty lists. Given $prec \in \omega$, $prec \geq 1$, we define a subsort Real of the sort list as follows:

$$Real(x) = \mathrm{len}(x) = \overline{prec} \wedge \forall t \in x\; (Nat(t) \wedge \mathrm{len}(t) \sqsubseteq \bar 9)$$

where $\mathrm{len}(x)$ is an abbreviation for the term $\mathrm{Rec}[\langle\rangle, \mathrm{cons}(g(a), \langle\rangle), x](x)$, i.e., $\mathrm{len}(x)$ gives the number of elements in a list $x$. In other words, we assume that a list of sort Real corresponds to the decimal representation of a real number using prec-many digits, for a fixed number $prec \in \omega$.
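The predicates Nat and Real can be checked mechanically. The Python sketch below is an illustration under our own encoding assumptions (lists as tuples, the empty list as `()`); the function names are hypothetical and not part of the formalism.

```python
def nat(n):
    """The list of n empty lists, i.e., the representation of the number n."""
    return ((),) * n

def is_nat(x):
    """Nat(x): every element of x is the empty list."""
    return isinstance(x, tuple) and all(t == () for t in x)

def length(x):
    """len(x): the number of elements of x, as a natural-number list."""
    return nat(len(x))

def is_initial_segment(s, t):
    """s is an initial segment of t."""
    return t[:len(s)] == s

def is_real(x, prec):
    """Real(x): x has exactly prec elements, each a digit between 0 and 9."""
    return length(x) == nat(prec) and all(
        is_nat(t) and is_initial_segment(length(t), nat(9)) for t in x
    )
```

For instance, with `prec = 4` the tuple `(nat(0), nat(5), nat(1), nat(0))` passes `is_real`, while a list containing `nat(10)` does not, because 10 is not a decimal digit.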

For lists $x, i$, let $x.i$ be a shortcut for the term

$$\mathrm{Cond}[\neg Nat(i) \vee \neg(i \sqsubseteq \mathrm{len}(x)),\ fault]\,[\,\mathrm{Rec}[\langle\rangle, b, i]\,]$$

i.e., it gives the constant list fault if $i$ does not correspond to a natural number or $i$ is greater than the number of elements in $x$. Otherwise it gives the $i$-th element of $x$.

For lists $x, y$, let $x < y$ be the conjunction of $Real(x) \wedge Real(y)$ with

$$\exists i \sqsubseteq \overline{prec}\; \big(\, x.i \sqsubset y.i \wedge x.i \neq y.i \wedge \forall j \sqsubseteq \overline{prec}\; (i \sqsubset j \to x.j = y.j) \,\big)$$

i.e., we assume that the first digit of a real number given by a list $x$ is $\mathrm{head}(x)$. The corresponding predicate $x \leq y$ is defined similarly. For a list $t$, let $\min(t)$ be a notation for the term

$$\mathrm{Cond}[t = \langle\rangle \vee \exists s \in t\,(\neg Real(s)),\ fault]\,[\,\mathrm{Rec}[\mathrm{head}(t),\ \mathrm{Cond}[b < g(a), b][g(a)],\ t]\,]$$

The term $\max(t)$ is defined similarly.

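Finally, for lists $x, y$, let $x + y$ be a shortcut for the term

$$\mathrm{Cond}[\neg(Real(x) \wedge Real(y)) \vee \mathrm{tail}(sum) = \bar 1,\ fault]\,[\mathrm{head}(sum)]$$

where

$$
\begin{aligned}
sum &= \mathrm{Rec}[\langle \langle\rangle, \langle\rangle \rangle,\ \mathrm{cons}(\mathrm{tail}(s), \mathrm{cons}(\mathrm{head}(g(a)), \mathrm{head}(s))),\ \overline{prec}], \\
s &= sumnat(\mathrm{conc}(x.\mathrm{cons}(a, b), \mathrm{tail}(g(a))),\ y.\mathrm{cons}(a, b)), \\
sumnat(x, y) &= \mathrm{Cond}[\overline{10} \leq \mathrm{conc}(x, y),\ \mathrm{cons}(\bar 1, mod10(\mathrm{conc}(x, y)))]\,[\mathrm{cons}(\bar 0, \mathrm{conc}(x, y))]
\end{aligned}
$$

and

$$mod10(x) = \mathrm{head}(\,\mathrm{Rec}[\langle \langle\rangle, \langle\rangle \rangle,\ \mathrm{Cond}[\mathrm{tail}(g(a)) = \overline{10},\ \mathrm{cons}(\mathrm{tail}(g(a)), \mathrm{cons}(\mathrm{head}(g(a)), b))]\,[\mathrm{cons}(\mathrm{cons}(\mathrm{tail}(g(a)), b), \mathrm{head}(g(a)))],\ x]\,).$$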

We note that negative reals and other arithmetic operations, e.g., subtraction, multiplication, etc., can be defined in a similar fashion.

Let $prec = k + m$, where $k, m$ are some constants, which give the length of the integer/fractional part of real numbers, respectively. For a (nonnegative) real number $n$, let $dec(n)$ be the decimal representation of $n$ such that the number of digits in the integer and fractional part of $dec(n)$ is exactly $k$ and $m$, respectively. This is achieved by using auxiliary zeros, e.g., for $n = 3/2$ and $k = m = 2$, we have $dec(n) = 01.50$. If $dec(n)$ exists, let $List(n)$ be the list representation of $dec(n)$, i.e., the list such that $\mathrm{len}(List(n)) = \overline{prec}$ and for all $i \in \{1, \ldots, prec\}$ and $j \in \omega$, it holds $List(n).\bar i = \bar j$ iff $j$ is the $(prec + 1 - i)$-th digit in $dec(n)$.

The following lemma sums up the properties of the given formalization:

Lemma 1 (Implementation of Arithmetic with Precision). Let $HW(\mathcal{M})$ be a list superstructure and $prec \in \omega$ a precision. For any (non-negative) real numbers $a_i$ such that $dec(a_i)$ exists, for $i = 1, \ldots, n$ and $n \geq 3$:

— $dec(a_1) \propto dec(a_2)$ iff $HW(\mathcal{M}) \models List(a_1) \propto List(a_2)$, for $\propto\, \in \{<, =\}$

— $dec(a_1) + dec(a_2) = dec(a_3)$ iff $HW(\mathcal{M}) \models List(a_1) + List(a_2) = List(a_3)$

— $dec(dec(a_1) + dec(a_2))$ does not exist iff $HW(\mathcal{M}) \models List(a_1) + List(a_2) = fault$

For $n \geq 1$, the value of $\min(\langle List(a_1), \ldots, List(a_n) \rangle)$ or $\max(\langle List(a_1), \ldots, List(a_n) \rangle)$ in $HW(\mathcal{M})$ is $List(a)$ iff $a$ is minimal/maximal among $dec(a_1), \ldots, dec(a_n)$, respectively.
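To make the encoding $List(n)$ concrete, here is a hedged Python sketch. It assumes that "$dec(n)$ exists" means that $n$ is non-negative and exactly representable with $k$ integer and $m$ fractional digits; the function name `list_repr` and the use of `None` for the non-existing case are our own choices.

```python
from fractions import Fraction

def nat(n):
    """The list of n empty lists, modelled as a tuple of empty tuples."""
    return ((),) * n

def list_repr(value, k, m):
    """List(value) for prec = k + m, or None if dec(value) does not exist
    (assumed to mean: not exactly representable with k and m digits)."""
    scaled = Fraction(value) * 10 ** m
    if scaled.denominator != 1 or scaled < 0 or scaled >= 10 ** (k + m):
        return None
    digits = str(scaled.numerator).zfill(k + m)
    # The i-th element holds the (prec + 1 - i)-th digit, so the order is reversed
    # and the first digit ends up as the head (last element) of the list.
    return tuple(nat(int(d)) for d in reversed(digits))

# dec(3/2) = 01.50 for k = m = 2
example = list_repr(Fraction(3, 2), 2, 2)
```

Here `example` equals `(nat(0), nat(5), nat(1), nat(0))`: the leading digit 0 of 01.50 is the last element, in line with the convention that the first digit is given by head.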

3.2. Document Terms

Let us introduce notations for terms, which are used to access documents and field values in a document model.

The following term gives the last used ID for a document in a model:

$$GetLastDocID(model) = \max(\mathrm{cons}(\mathrm{Rec}[\langle\rangle, \mathrm{cons}(g(a), \mathrm{head}(b)), model],\ \bar 0))$$

i.e., it implements a search for the greatest value occurring as the head of a tuple from model and outputs 0 if there are no documents in the model.

The next term gives the last version of a document (from a model) by its ID. It implements search for the last tuple with a given ID (contained in a model) and outputs the found document. If no tuple with the given ID is present in the model, the term gives fault.

$$GetDocByID(docID, model) = \mathrm{Cond}[doctuple = model,\ fault]\,[doctuple]$$

where $doctuple = \mathrm{bSearch}[\mathrm{head}(x) = docID,\ model]$. The next term provides a field value from the last version of a document with a given ID:

$$GetFieldValue(docID, fieldName, model) = \mathrm{Cond}[document = fault,\ fault]\,[\,\mathrm{tail}(\mathrm{bSearch}[\mathrm{head}(x) = fieldName,\ document])\,]$$

where $document = \mathrm{head}(\mathrm{tail}(GetDocByID(docID, model)))$. Finally, we define the term FindFieldPosition, which 'splits' a document into a partitioned one (denoted as pdocument below), which has the form $\langle list_1, list_2 \rangle$ such that $\mathrm{conc}(list_1, list_2) = document$ and $\mathrm{head}(list_1)$ is a field with the required name (if there exists one in a document). This auxiliary term is employed in the axioms of a document theory to implement change of a field value in an existing document:

$$FindFieldPosition(document, fieldName) = \mathrm{Cond}[\mathrm{tail}(pdocument) = \langle\rangle,\ fault]\,[pdocument]$$

where

$$pdocument = \mathrm{Rec}[\,\langle\rangle,\ \mathrm{Cond}[\mathrm{head}(\mathrm{tail}(g(a))) = fieldName,\ \langle \mathrm{tail}(g(a)), \mathrm{cons}(\mathrm{head}(g(a)), b) \rangle]\,[\langle \mathrm{cons}(\mathrm{tail}(g(a)), b), \langle\rangle \rangle],\ document\,]$$
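The intended behaviour of the access terms above can be sketched in Python. This is a simplified illustration, not the terms themselves: a model is a tuple of entries `(sit, form, doc, ID)`, a field is the value list followed by the field name, IDs are plain integers, field names are assumed unique within a document, and a missing field yields `FAULT` (the list term behaves differently in that corner case). All function names are hypothetical.

```python
FAULT = "fault"

def get_doc_by_id(doc_id, model):
    """The last entry (sit, form, doc, ID) with the given ID, else FAULT."""
    for entry in reversed(model):
        if entry[-1] == doc_id:
            return entry
    return FAULT

def get_field_value(doc_id, field_name, model):
    """Value of the field in the last version of the document (simplified)."""
    entry = get_doc_by_id(doc_id, model)
    if entry == FAULT:
        return FAULT
    document = entry[-2]
    for field in reversed(document):
        if field[-1] == field_name:   # the head of a field is its name
            return field[:-1]         # the tail of a field is its value
    return FAULT

def find_field_position(document, field_name):
    """Split document into (list1, list2) with the named field last in list1."""
    for i, field in enumerate(document):
        if field[-1] == field_name:
            return (document[:i + 1], document[i + 1:])
    return FAULT

# Two versions of one document with ID 1 (hypothetical form and field names).
doc_v1 = (("0", "price"), ("open", "status"))
doc_v2 = (("0", "price"), ("paid", "status"))
model = (((), "Invoice", doc_v1, 1), ((), "Invoice", doc_v2, 1))
```

In this example `get_field_value(1, "status", model)` yields `("paid",)`, because the last stored version of the document wins.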

Now we define by induction the notion of document term, which generalizes the definitions above.

Definition 1 (Document Term). Any standard list term (i.e., which does not contain Cond-, bSearch-, or Rec-terms) is a document term. If $s, t, u, i$ are document terms then $s.i$, $s + t$, $\min(s)$, $\max(s)$, $GetLastDocID(s)$, $GetDocByID(s, t)$, and $GetFieldValue(s, t, u)$ are document terms. The definition of document term is complete.

An important property is that these terms are computationally tractable as stated in the following lemma.

Lemma 2 (Tractability of Document Terms). For any $prec \in \omega$, document terms $s(\bar u)$, $t(\bar v)$, and vectors of constant lists $\bar a$, $\bar b$:

— a constant list $c$ such that $HW(\mathcal{M}) \models s(\bar a) = c$, for any list superstructure $HW(\mathcal{M})$ (which contains all the urelements from $s, t, \bar a, \bar b$), can be computed in time polynomial in the size of $s(\bar a)$ and prec;

— it can be decided in time polynomial in the size of $s(\bar a)$, $t(\bar b)$, and prec whether $s(\bar a) \propto t(\bar b)$, for $\propto\, \in \{<, =\}$, holds in any structure as above.

Proof Sketch. The first point of the lemma is proved by induction on the form of the term $s$. For a standard list term, the claim readily follows from Lemma 2 in [12]. For an arbitrary document term $s$ the claim is shown by analyzing the syntactic form of the terms $.$, $+$, $\min()$, $\max()$, $GetLastDocID()$, $GetDocByID()$, and $GetFieldValue()$. It follows from their definition that each of these terms can be computed in polynomial time in the size of their parameters and prec. The second point of the lemma is shown by an analysis of the definition for $<$: it gives a polynomial time algorithm to verify whether there is a segment $i \sqsubseteq \overline{prec}$, for which the condition from the definition of $<$ is true. □

3.3. Axioms of a Document Theory

A document theory has the form $T = T_f \cup T_s \cup T_d$, where the theory $T_f$ gives predefined filters, which can be used to select collections of documents, $T_s$ gives definitions to document fields and forms (i.e., it describes the data schema, hence, the subscript s), and $T_d$ describes possible transactions and triggers, their execution rules, and instruction processing rules, which generate documents or update existing ones. Thus, $T_d$ describes the dynamics of documents (hence, the subscript d).

First, let us introduce auxiliary terms, which will be used in axioms of T. The first one gives a form name of a document

$$Form(document) = \mathrm{head}(\mathrm{tail}(\mathrm{tail}(document)))$$

while the second one gives a list, in which the order of elements is reversed:

$$rev(list) = \mathrm{Rec}[\langle\rangle,\ \mathrm{conc}(\langle b \rangle, g(a)),\ list]$$

We begin with a definition of theory $T_f$. For each $name \in$ FilterNames, it contains a definition of a filter term of the form below. Every filter gives a list of IDs of (the last version of) those documents from a model, which satisfy conditions specified by the filter:

$$GetDocsByFilter_{name}(fName, model, params) = \mathrm{head}(\mathrm{Rec}[\langle\rangle,\ selection,\ rev(model)])$$

where selection is a term of the form

$$\mathrm{Cond}[\mathrm{head}(b) \in g(a),\ g(a)]\,[filter(params, b),\ \mathrm{cons}(g(a), \mathrm{head}(b))]\,[g(a)]$$

and $filter(params, doc)$ is a formula, which represents conditions on the documents to be selected:

$$filter(params, doc) = Form(document) = fName \wedge \varphi$$

where $\varphi$ is a Boolean combination of formulas of the form $s \propto t$, where $\propto\, \in \{<, =\}$ and $s, t$ are document terms over variables params, doc such that in every term $GetLastDocID(m)$, $GetDocByID(x, m)$, or $GetFieldValue(x, y, m)$ from $s$ or $t$, we have $m = model$.

Next, we define the theory $T_s$. First of all, it contains axioms that describe fields and cardinalities for their values:

$$Field(x) = \bigvee_{f \in FieldNames} \big(\, \mathrm{head}(x) = f \wedge Card(\mathrm{tail}(x)) \,\big)$$

where $Card(y)$ is a cardinality predicate, which restricts the number of elements in a list $y$. We consider the following cardinalities: the list is empty; it contains zero or one element (we use notation '?' for this predicate); it contains exactly one element (notation '!'); it contains one or more elements. For example, '?' is defined as

$$?(x) = \forall t \in x\; \mathrm{cons}(\langle\rangle, t) = x$$

The other predicates are defined similarly.
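A minimal Python sketch of the field check with cardinalities follows. It assumes a field is a tuple consisting of the value list followed by the field name; the labels `"empty"` and `"+"` and the per-field `schema` dictionary are hypothetical names introduced only for this illustration (the text names only '?' and '!').

```python
# Cardinality predicates over a value list; "empty" and "+" are illustrative labels.
CARDINALITIES = {
    "empty": lambda value: len(value) == 0,
    "?": lambda value: len(value) <= 1,
    "!": lambda value: len(value) == 1,
    "+": lambda value: len(value) >= 1,
}

def is_field(field, schema):
    """Field(x): the head of x is a known field name and the tail of x
    satisfies the cardinality that the (hypothetical) schema assigns to it."""
    if not field:
        return False
    value, name = field[:-1], field[-1]
    return name in schema and CARDINALITIES[schema[name]](value)
```

For example, with `schema = {"price": "!"}` the field `("42", "price")` is accepted, whereas `("price",)` is rejected because its value list is empty.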

Further, Ts introduces document forms by describing which fields (with their default values) are present in a blank document of a given form:

$$Blank(name) = document \;\equiv\; \Big( \bigwedge_{f \in FormNames} name \neq f \wedge document = fault \Big) \vee \bigvee_{f \in FormNames} (name = f \wedge \varphi_f) \qquad (3.1)$$

where $\varphi_f = (document = \langle\rangle)$ or $\varphi_f$ has the following form, for a nonempty subset $N_f \subseteq$ FieldNames (we assume that the elements of $N_f$ are enumerated, $N_f = \{1, \ldots, n\}$):

$$\exists x_1 \in document \ldots \exists x_n \in document\; \bigwedge_{i \in N_f} \big( \mathrm{head}(x_i) = i \wedge \mathrm{tail}(x_i) = defvalue_i \wedge Field(x_i) \big) \wedge \forall x \in document\; \Big( \bigvee_{i \in N_f} x = x_i \Big)$$

where $defvalue_i$ is a list, which respects the cardinality restriction given in the definition of the $Field(x)$ predicate for $\mathrm{head}(x) = i$. The definition of the theory $T_s$ is complete.

Now we are ready to define the theory $T_d$. It contains definitions of daemons and a definition of a recursive Update function, which given a queue, updates a model to a new state based on the definition of daemons. First, we define the Update function. For the sake of readability, we split its definition into three formulas combined with disjunction and comment on them separately.

First of all, if the queue is not empty and the first instruction in the queue is not a valid one (i.e., it is neither a CreateDoc- or SetField-instruction, nor a transaction with a name $t \in$ TransNames) the whole queue is skipped and the model given by the Update function is the initial model. If the queue is empty, then it is assumed that all the instructions in the queue have been processed and thus, Update returns the value of model:

$$
\begin{aligned}
Update(initialmodel, model, queue) = model' \;\equiv\; &\big( \mathrm{head}(\mathrm{head}(queue)) \notin \langle CreateDoc, SetField, tname_1, \ldots, tname_k \rangle \wedge queue \neq \langle\rangle \wedge model' = initialmodel \big) \\
&\vee\ \big( queue = \langle\rangle \wedge model' = model \big) \\
&\vee
\end{aligned}
\qquad (3.2)
$$

where $\{tname_1, \ldots, tname_k\} =$ TransNames, for $k \geq 0$.

Otherwise the queue contains an instruction to create a document of a specific form, change a field value in a document having a certain ID, or launch a specific transaction. In the first case, a blank document of a given form is created (which is implemented by using existential quantification) and added to the model, the instruction is removed from the queue, and the Update function is evaluated recursively on the resulting input. If a blank document of a form with name formName can not be created (due to $formName \notin$ FormNames) then the queue is skipped and Update returns the initial model:

$$
\begin{aligned}
&\big( \mathrm{head}(\mathrm{head}(queue)) = CreateDoc \wedge \exists document\; document = Blank(formName) \wedge {} \\
&\quad \big( (document = fault \wedge model' = initialmodel) \vee (document \neq fault \wedge model' = Update(initialmodel, \mathrm{cons}(model, newdoc), \mathrm{tail}(queue))) \big) \big) \\
&\vee
\end{aligned}
\qquad (3.3)
$$

where $formName = \mathrm{head}(\mathrm{tail}(\mathrm{head}(queue)))$, newdoc is a term of the form

$$\langle newsituation,\ formName,\ document,\ \mathrm{cons}(GetLastDocID(model), \langle\rangle) \rangle$$

$newsituation = \mathrm{cons}(Situation(model), \langle formName, CreateDoc \rangle)$, and $Situation(model) = \mathrm{head}(\mathrm{tail}(\mathrm{tail}(\mathrm{tail}(\mathrm{head}(model)))))$.

The case of SetField instruction in the queue is formulated similarly, but the formalization is technically more complex, since modifying an already existing document requires more steps than creating a fresh one:

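$$
\begin{aligned}
&\big( \mathrm{head}(\mathrm{head}(queue)) = SetField \wedge {} \\
&\quad \big( \big( ( pdocument = fault \vee \neg Field(\mathrm{cons}(newFldValue, fldName)) ) \wedge model' = initialmodel \big) \\
&\quad \vee \big( pdocument \neq fault \wedge model' = Update(initialmodel, \mathrm{cons}(model, \langle newsituation, form, updatedDoc, docID \rangle), extendedQueue) \big) \big) \big) \\
&\vee
\end{aligned}
\qquad (3.4)
$$

where $form = Form(GetDocByID(docID, model))$, pdocument denotes $FindFieldPosition(\mathrm{head}(\mathrm{tail}(GetDocByID(docID, model))), fldName)$ and updatedDoc is a shortcut for

$$\mathrm{conc}(\mathrm{tail}(\mathrm{tail}(pdocument)),\ \mathrm{cons}(\mathrm{head}(pdocument),\ \mathrm{cons}(newFldValue, fldName)))$$

in which

$$
\begin{aligned}
docID &= \mathrm{head}(\mathrm{tail}(\mathrm{head}(queue))) \\
fldName &= \mathrm{head}(\mathrm{tail}(\mathrm{tail}(\mathrm{head}(queue)))) \\
newFldValue &= \mathrm{head}(\mathrm{tail}(\mathrm{tail}(\mathrm{tail}(\mathrm{head}(queue))))) \\
newsituation &= \mathrm{cons}(Situation(model),\ \langle newFldValue, fldName, docID, SetField \rangle) \\
Situation(model) &= \mathrm{head}(\mathrm{tail}(\mathrm{tail}(\mathrm{tail}(\mathrm{head}(model)))))
\end{aligned}
$$

(recall the instruction modeling conventions). Finally, extendedQueue is a shortcut for

$$SetFieldTrigger(docID, fieldName, newFieldValue, \mathrm{tail}(queue), model)$$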

Thus, updatedDoc is a document with an updated field value and extendedQueue is a sequence of instructions provided by a trigger on a field value change. By the definition above, the whole queue is skipped whenever there is no field with the specified name in a given document. Note that in this case tail(pdocument) = fault holds by the definition of FindFieldPosition term.

Finally, if $\mathrm{head}(\mathrm{head}(queue))$ is a transaction name, a call to the daemon is made, which defines the corresponding transaction:

$$\big( \mathrm{head}(\mathrm{head}(queue)) = tName \wedge model' = Update(initialmodel, model, ExecTrans(tName, docID, params, \mathrm{tail}(queue), model)) \big)$$

where $docID = \mathrm{head}(\mathrm{tail}(\mathrm{head}(queue)))$ is a document, for which the transaction is to be executed, and $params = \mathrm{head}(\mathrm{tail}(\mathrm{tail}(\mathrm{head}(queue))))$ specifies parameters for the transaction.

Now we are in the position to define functions, which implement daemons. Their purpose is to extend the queue with a sequence of instructions depending on whether a field value in an existing document is changed or a transaction is fired. Both functions have similar definitions:

$$SetFieldTrigger(docID, fName, fValue, queue, model) = \Phi$$

$$ExecTrans(tName, docID, params, queue, model) = \Psi$$

where

$$\Phi = \mathrm{Cond}[\theta_1, q_1] \ldots [\theta_n, q_n]\,[queue]$$

and for all $i \in \{1, \ldots, n\}$, $n \geq 0$, $\theta_i$ is a condition of the form

$$Form(GetDocByID(docID, model)) = formName \wedge fName = fieldName \wedge \varphi$$

(in this case $\theta_i$ is called (formName, fieldName)-condition) such that $formName \in$ FormNames, $fieldName \in$ FieldNames and $q_i = \mathrm{conc}(queue, instr_i)$, where $\varphi$ is a Boolean combination of formulas of the form $val_1 \propto val_2$, where $\propto\, \in \{<, =\}$, and $val_1, val_2$ are document terms over variables docID, fValue, model and $instr_i$ (called queue extension) is a list

$$\mathrm{conc}(s_1, \mathrm{conc}(s_2, \ldots \mathrm{conc}(s_{k-1}, s_k) \ldots))$$

such that

— each $s_i$ for $i = 1, \ldots, k$, $k \geq 1$ (called instruction term) is a list term of the form $\langle \langle val, fieldName', docID, SetField \rangle \rangle$ or $\mathrm{Rec}[\langle\rangle, h, DocFilter]$ with the definition: $g(\langle\rangle) = \langle\rangle$, $g(\mathrm{cons}(a, id)) = h(id)$, where $h(id) = \mathrm{conc}(\langle \langle params, id, transName \rangle \rangle, g(a))$, for all $a, id$ such that $\mathrm{cons}(a, id) \sqsubseteq DocFilter$

— params is a list of the form $\langle t_1, \ldots, t_m \rangle$, for $m \geq 0$, where every $t_i$ is a document term over variables docID, fValue, model

— $fieldName' \in$ FieldNames and val is a document term over variables docID, fValue, model (then $s_i$ is called (formName, fieldName')-instruction)

— $DocFilter = GetDocsByFilter_{name}(frmName, model, p)$

— $p$ is a document term over variables docID, fValue, model

— $name \in$ FilterNames, $frmName \in$ FormNames, and $transName \in$ TransNames (then $s_i$ is called (frmName, transName)-instruction)

Thus, changing a field value in a document may cause addition of instructions to the queue, which change other fields in the same document or execute transactions over sets of documents defined by filters.
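The shape of a daemon can be illustrated with a minimal Python sketch. It assumes that a conditional term of the form Cond[...][queue] evaluates to the value attached to the first condition that holds and to queue otherwise. The helper `cond_term`, the form name `Invoice`, the field names `amount` and `status`, and the representation of a model as a dictionary are hypothetical and serve for illustration only; they are not part of the document theory.

```python
from typing import Any, Callable

Queue = list   # list of instruction tuples, e.g. (val, fieldName, docID, "SetField")
Model = dict   # docID -> dict with a "form" entry and field values (assumption)


def cond_term(cases: list[tuple[Callable[[], bool], Callable[[], Queue]]],
              default: Queue) -> Queue:
    """Return the value of the first case whose condition holds, else the default."""
    for condition, value in cases:
        if condition():
            return value()
    return default


def set_field_trigger(doc_id: Any, f_name: str, f_value: Any,
                      queue: Queue, model: Model) -> Queue:
    """Toy daemon with a single (Invoice, amount)-condition (hypothetical names)."""
    form = model[doc_id]["form"]
    cases = [
        # Condition: the form is Invoice, the changed field is amount, 1000 < fValue.
        (lambda: form == "Invoice" and f_name == "amount" and 1000 < f_value,
         # Queue extension: conc(queue, instr) with one SetField instruction.
         lambda: queue + [("review", "status", doc_id, "SetField")]),
    ]
    return cond_term(cases, queue)
```

In this sketch, changing `amount` to a value above 1000 in an `Invoice` document appends one instruction that sets `status` in the same document; in every other case the queue is returned unchanged.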

The formula $\Psi$ is defined similarly, with the following minor modifications (we use the notation above):

— every condition $\theta_i$ has the form

$$Form(GetDocByID(docID, model)) = formName \wedge tName = transName \wedge \varphi$$

(in this case $\theta_i$ is called a $(formName, transName)$-condition), where $transName \in TransNames$

— every $s_i$ is a list term of the form $\langle\langle val, fieldName', docID, SetField \rangle\rangle$ or $Rec[\langle\rangle, h, DocFilter]$, where $h$ is given as $conc(\langle\langle params', id, transName \rangle\rangle, g(a))$ or $conc(\langle\langle frmName, CreateDoc \rangle\rangle, g(a))$, or $s_i$ is of the form $\langle\langle frmName, CreateDoc \rangle\rangle$ (in the latter two cases $s_i$ is called a $(frmName, CreateDoc)$-instruction)

— $val$, $val_1$, $val_2$, $p$ are document terms over the variables $docID$, $params$, $model$, and $params'$ is a list of the form $\langle t_1, \ldots, t_m \rangle$, for $m \geq 0$, where every $t_i$ is a document term over the variables $params$, $model$.

Thus, executing a transaction over a document may cause addition of instructions to the queue, which change fields in the document, create new documents (of the same or different document form), or execute transactions over sets of documents defined by filters.

The definition of the document theory T is complete.

Let the size of T be the total size of its axioms (given as strings).

4. Termination of Transactions

The recursive definition of the $Update$ function yields a natural notion of a chase operator, which, for a given document theory $T$ and constant lists $model$, $queue$, where $queue \neq \langle\rangle$, outputs lists $model'$ and $queue'$ obtained after processing the first instruction from $queue$ (i.e., $head(queue)$). In other words, for any list superstructure $HW(M)$, it holds that

$$HW(M) \models Update(model, model, queue) = Update(model, model', queue')$$

where $model'$ is obtained from $model$ by the definition of the $Update$ function in $T$ without applying recursion, and either $queue'$ is obtained in the same way from $queue$, or it holds that $queue' = \langle\rangle$. We denote this fact as $\langle model, queue \rangle \mapsto \langle model', queue' \rangle$. A chase sequence wrt $T$ for a list $\langle m_0, q_0 \rangle$ of the form above is a sequence of lists $\langle m_0, q_0 \rangle, \langle m_1, q_1 \rangle, \ldots$, where $\langle m_i, q_i \rangle \mapsto \langle m_{i+1}, q_{i+1} \rangle$, for all $i \geq 0$. A chase sequence is terminating if it is of the form $s_0, \ldots, s_k$, for some $k \geq 1$, where $s_k = \langle m_k, \langle\rangle \rangle$.
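The chase operator and chase sequences can be illustrated with a small Python sketch. It is a simplification and not the Update function of T: a model is a dictionary from document identifiers to field dictionaries, instructions are tuples, and the daemons are passed in as callbacks that return the extended queue. Following the way head and tail are applied to instruction lists in Section 3, the sketch assumes that head returns the last element of a list. It further assumes that the trigger receives the updated model, and it omits CreateDoc instructions. The names `chase_step`, `chase`, and `max_steps` are hypothetical; the step bound is there because a terminating chase sequence need not exist.

```python
def chase_step(model, queue, set_field_trigger, exec_trans):
    """One application of the chase operator: process head(queue) without recursion."""
    instr, rest = queue[-1], queue[:-1]   # head = last element, tail = the rest (assumption)
    if instr[-1] == "CreateDoc":
        raise NotImplementedError("document creation is omitted in this sketch")
    if instr[-1] == "SetField":
        value, field, doc_id, _ = instr
        doc = model.get(doc_id)
        if doc is None or field not in doc:
            return model, []              # no such field: the whole queue is skipped
        updated = {**model, doc_id: {**doc, field: value}}
        return updated, set_field_trigger(doc_id, field, value, rest, updated)
    params, doc_id, t_name = instr        # otherwise the instruction calls a transaction
    return model, exec_trans(t_name, doc_id, params, rest, model)


def chase(model, queue, set_field_trigger, exec_trans, max_steps=1000):
    """Return the chase sequence for (model, queue), cut off after max_steps steps."""
    sequence = [(model, queue)]
    while queue and len(sequence) <= max_steps:
        model, queue = chase_step(model, queue, set_field_trigger, exec_trans)
        sequence.append((model, queue))
    return sequence
```

The returned sequence is terminating exactly when the queue in its last element is empty. With a daemon that re-adds the same transaction to the queue on every call, `chase` stops at the step bound with a non-empty queue.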

In the following, we note that a terminating chase sequence may not exist for a given list $\langle m, q \rangle$ and a theory $T$. Then we formulate a sufficient condition on the form of $T$ that guarantees chase termination, and finally we estimate the complexity of computing the chase. Due to space constraints, we provide only proof sketches here. Full proofs can be found in the extended version of the paper, available at https://arxiv.org/abs/2002.05064

Theorem 1 (Termination of Transactions is Undecidable). It is undecidable whether there is a terminating chase sequence for a list $s = \langle model, queue \rangle$ wrt a document theory $T$.

Proof Sketch. The theorem is proved by a reduction from the halting problem for Turing machines. Given a Turing machine $M$, we define a document theory $T$ that encodes $M$. The theory $T$ contains axioms that specify a single document form and a field used for storing the content of the tape of $M$, and axioms for daemons that encode the transitions of $M$. Then we define a list $initqueue$ of instructions, which encode the first symbols of the initial configuration of $M$ and enforce the execution of a transaction that launches a transition of $M$ from the initial state. It can then be shown that there is a terminating chase sequence for $\langle \langle\rangle, initqueue \rangle$ iff $M$ halts. □

In fact, the form of the theory $T$ used in the proof of the theorem shows that non-termination may be caused by the possibility of changing a field value of the same document or executing the same transaction infinitely many times. The definitions of the $SetFieldTrigger$ and $ExecTrans$ functions in $T$ admit cyclic references between instructions and transactions. In the following, we observe that if cycles are forbidden, then chase termination is guaranteed.

Definition 2 (Dependency Graph). A dependency graph over a document theory $T$ is a directed graph with the set of vertices $V$ equal to $FormNames \times (FieldNames \cup TransNames \cup \{CreateDoc\})$ and the set of edges $E$ defined as follows.

For any $(form, name), (form', name') \in V$, there is an edge from $(form, name)$ to $(form', name')$ if there is $[\theta, q]$ in the definition of the $SetFieldTrigger$ or $ExecTrans$ functions in $T$, in which $\theta$ is a $(form, name)$-condition and $q = conc(queue, instr)$, for a list $queue$ and a queue extension $instr$, such that there is a $(form', name')$-instruction in the definition of $instr$.

Definition 3 (Locally Simple Document Theory). A document theory T is called locally simple if the dependency graph over T is acyclic.
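Local simplicity can be checked by a standard cycle test on the dependency graph. The following Python sketch assumes that the edges have already been extracted from the definitions of the daemons and are given as a dictionary from vertices (pairs of a form name and a field, transaction, or CreateDoc name) to lists of successor vertices. The function name `is_locally_simple` and this input format are hypothetical.

```python
def is_locally_simple(edges):
    """Return True iff the directed graph given by `edges` has no cycle.

    `edges` maps a vertex (form, name) to an iterable of successor vertices.
    Depth-first search with three colours; a grey successor closes a cycle.
    """
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}

    def visit(v):
        colour[v] = GREY
        for w in edges.get(v, ()):
            state = colour.get(w, WHITE)
            if state == GREY:
                return False              # back edge: a cycle through w
            if state == WHITE and not visit(w):
                return False
        colour[v] = BLACK
        return True

    vertices = set(edges) | {w for ws in edges.values() for w in ws}
    return all(colour.get(v, WHITE) != WHITE or visit(v) for v in vertices)
```

For instance, a graph in which an `Approve` transaction on one form sets a field whose trigger fires a transaction on another form is acyclic, whereas adding an edge from that second transaction back to `Approve` makes the check fail.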

Theorem 2 (Local Simplicity Implies Termination of Transactions). For any locally simple document theory $T$ and constant lists $model$, $queue$, there is a terminating chase sequence for $\langle model, queue \rangle$ wrt $T$.

Proof Sketch. We show that for any such $model$ and $queue$, there is a finite chase sequence $s_0, \ldots, s_n$, where $s_0 = \langle model, queue \rangle$, $n \geq 1$, such that $s_n = \langle model', tail(queue) \rangle$, where $|model'| = |model| + p$, for some $p \geq 0$. It follows that any instruction from $queue$ can be processed in a finite number of steps, which proves the claim. □

Although the theorem states that local simplicity guarantees termination, it gives no insight into how difficult it is to compute the effects of transactions. The next result indicates that the complexity is high, owing to the possibility of creating exponentially many documents by using recursive instruction terms. For $n \geq 0$, let $1\mathrm{exp}(n)$ denote $2^n$ and, for $k \geq 1$, let $(k+1)\mathrm{exp}(n) = 2^{k\mathrm{exp}(n)}$.

Theorem 3 (Computing Effects of Transactions is Hard). For any $k \geq 1$, $n \geq 0$, there exists a locally simple document theory $T$ and a constant list $queue$, both of size linear in $k$, $n$, such that the terminating chase sequence for $s_0 = \langle \langle\rangle, queue \rangle$ wrt $T$ has the form $s_0, \ldots, s_m$, where $m \geq k\mathrm{exp}(n)$ and $s_m = \langle model, \langle\rangle \rangle$, for a list $model$ such that $|model| \geq k\mathrm{exp}(n)$.
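To give a feeling for how quickly the lower bound of Theorem 3 grows, the iterated exponential can be computed directly from its recursive definition. The Python function below is only an illustration of the notation; its name `kexp` and its argument order are chosen here for convenience.

```python
def kexp(k: int, n: int) -> int:
    """Iterated exponential: 1exp(n) = 2**n and (k+1)exp(n) = 2**kexp(n)."""
    if k < 1 or n < 0:
        raise ValueError("kexp is defined for k >= 1 and n >= 0")
    value = n
    for _ in range(k):        # apply x -> 2**x exactly k times, starting from n
        value = 2 ** value
    return value
```

For example, `kexp(1, 3)` is 8, `kexp(2, 3)` is 256, and `kexp(3, 3)` already has 78 decimal digits, while the theory and the queue in the theorem are only of size linear in k and n.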

Finally, let us formulate a sufficient condition that guarantees polynomial boundedness of the effects of transactions. Let $G$ be the dependency graph over a document theory $T$ and, for $form \in FormNames$, $name \in FieldNames \cup TransNames \cup \{CreateDoc\}$, let $s$ be a $(form, name)$-instruction in a queue extension from the definition of the $SetFieldTrigger$ or $ExecTrans$ functions in $T$. We call the term $s$ document generating if either $name = CreateDoc$, or $s = Rec[\langle\rangle, h, DocFilter]$, where $h = conc(\langle\langle form, CreateDoc \rangle\rangle, g(a))$, or $(form, name)$ has a successor vertex $(form', name')$ in $G$, which is given by a document generating term.

Theorem 4 (Polynomially Bounded Effects of Transactions). Let $T$ be a locally simple document theory such that in any queue extension from the definition of the $SetFieldTrigger$ or $ExecTrans$ functions in $T$, there are no document generating $Rec$-terms.

Then for any constant list $model$ and a list of instructions $queue$, the terminating chase sequence for $s_0 = \langle model, queue \rangle$ has the form $s_0, \ldots, s_n$, where $n$ is exponentially bounded by the size of $T$ and $s_0$, and $s_n = \langle model', \langle\rangle \rangle$, for a list $model'$ of size polynomially bounded by the size of $T$ and $s_0$.

Proof Sketch. Let $N$ be the maximal number of instruction terms in a queue extension from the definition of the $SetFieldTrigger$ or $ExecTrans$ functions in $T$. Clearly, $N$ is bounded by the size of $T$. Let $model$, $queue$ be lists that satisfy the conditions of the theorem, and let $k$ be the rank of the instruction $t = head(queue)$ wrt $T$, $model$. By definition, $k$ is bounded by the number of vertices in the dependency graph over $T$ and thus, it is bounded by the size of $T$. We show by induction on $k$ that there is a chase sequence $s_0, s_1, \ldots, s_n$ such that $s_0 = \langle model, queue \rangle$, $s_n = \langle model', tail(queue) \rangle$, $|model'| \leq |model| + N$, and $n \leq (N \cdot (|model| + N))^k$. Then there is a terminating chase sequence $s_0, \ldots, s_m$ for $s_0$, where $s_m = \langle m, \langle\rangle \rangle$, $|m| \leq |queue| \cdot (|model| + N)$, and $m \leq |queue| \cdot (N \cdot (|model| + N))^k$, which proves the theorem. □
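The two bounds at the end of the proof sketch are easy to evaluate for concrete numbers. The Python helper below simply computes them as stated; the function name `chase_bounds` and the parameter names are hypothetical, and the values of N, the model size, the queue length, and the rank k are assumed to be given.

```python
def chase_bounds(n_instr: int, model_size: int, queue_len: int, rank: int) -> dict:
    """Evaluate the bounds from the proof sketch of Theorem 4.

    n_instr    -- N, the maximal number of instruction terms in a queue extension
    model_size -- |model|
    queue_len  -- |queue|
    rank       -- k, the rank of an instruction
    """
    per_instruction_steps = (n_instr * (model_size + n_instr)) ** rank
    return {
        # bound on the length of the terminating chase sequence
        "steps": queue_len * per_instruction_steps,
        # bound on the size of the resulting model
        "model": queue_len * (model_size + n_instr),
    }
```

With N = 3, a model of size 10, a queue of length 2, and rank 2, the bounds are 2 · (3 · 13)² = 3042 steps and a final model of size at most 2 · 13 = 26. The model bound does not depend on k, whereas the step bound has k in the exponent.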

5. Conclusions

We have shown that document theories (and thus, the Document Modeling approach) implement a Turing-complete computation model even in the presence of a tractable language of arithmetic operations (over document field values) and queries (for selecting collections of documents). This confirms that one of the main sources of computational complexity is the definitions of daemons, which specify transactions and the relationships between them. If the definitions allow for executing the same transaction or changing the value of a document field infinitely many times, then it is possible to implement the computations of any Turing machine. We have shown that disallowing cyclic relationships between transactions guarantees decidability of transaction termination (importantly, cycles can be easily detected by a syntactic analysis of the axioms of a document theory), but that the complexity of computing the effects of transactions is high even in this case if documents can be created in loops. In fact, using loops in transactions is natural, since it allows for performing updates over collections of documents. If documents can only be modified in loops, but not created, then the complexity of computing the effects of transactions decreases, and we have noted a case in which the effects are polynomially bounded. In further research, we plan to carry out a more detailed complexity analysis for various (practical) restrictions on the definition of daemons. In this paper, we did not study the contribution of query languages to the complexity of computing the effects of transactions, and we adopted a relatively simple query language. Since daemons employ document queries to modify collections of documents, it would be important to study the interplay between these two sources of complexity.

References

1. Ershov Yu.L., Goncharov S.S., Sviridenko D.I. Semantic Programming. Information processing 86: Proc. IFIP 10th World Comput. Congress. 1986, vol. 10, Elsevier Sci., Dublin, pp. 1093-1100.

2. Ershov Yu.L., Goncharov S.S., Sviridenko D.I. Semantic Foundations of Programming. Fundamentals of Computation Theory: Proc. Intern. Conf. FCT 87, Kazan. Lect. Notes Comp. Sci., 1987, vol. 278, pp. 116-122. https://doi.org/10.1007/3-540-18740-5_28

3. Goncharov S.S., Sviridenko D.I. Σ-programming. Transl. II. Amer. Math. Soc., 1989, no. 142, pp. 101-121.

4. Goncharov S.S., Sviridenko D.I. Σ-programming and its Semantics. Vychisl. Systemy, 1987, no. 120, pp. 24-51. (in Russian).

5. Goncharov S.S., Sviridenko D.I. Theoretical Aspects of Σ-programming. Lect. Notes Comp. Sci., 1986, vol. 215, pp. 169-179. https://doi.org/10.1007/3-540-16444-8_13

6. Goncharov S.S. Conditional Terms in Semantic Programming. Siberian Mathematical Journal, 2017, vol. 58, no. 5, pp. 794-800. https://doi.org/10.1134/S0037446617050068

7. Goncharov S.S., Sviridenko D.I. The Logic Language of Polynomial Computability. Doklady Mathematics, 2019, vol. 99, no.2, pp. 11-14.

8. Goncharov S.S., Sviridenko D.I. Recursive Terms in Semantic Programming. Siberian Mathematical Journal, 2018, vol. 59, no. 6, pp. 1279-1290.

9. Kazakov I.A., Kustova I.A., Lazebnikova E.N., Mantsivoda A.V. Building locally simple models: theory and practice. The Bulletin of Irkutsk State University. Series Mathematics, 2017, vol. 22, pp. 71-89. (in Russian). https://doi.org/10.26516/1997-7670.2017.22.71

10. Malykh A.A., Mantsivoda A.V. Document modeling. The Bulletin of Irkutsk State University. Series Mathematics, 2017, vol. 21, pp. 89-107. (in Russian). https://doi.org/10.26516/1997-7670.2017.21.89

11. Mantsivoda A.V., Ponomaryov D.K. A Formalization of Document Models with Semantic Modelling. Bulletin of Irkutsk State University, Series Mathematics, 2019. vol. 27, pp. 36-54.

12. Ospichev S., Ponomarev D. On the Complexity of Formulas in Semantic Programming. Siberian Electronic Mathematical Reports, 2018, vol. 15, pp. 987-995.

13. Reiter R. Knowledge in Action: Logical Foundations for Describing and Implementing Dynamical Systems. MIT Press, 2001. https://doi.org/10.7551/mitpress/4074.001.0001

14. Vityaev E.E. Semantic Probabilistic Inference of Predictions. The Bulletin of Irkutsk State University. Series Mathematics, 2017, vol. 21, pp. 33-50. (in Russian).

Andrei Mantsivoda, Doctor of Sciences (Physics and Mathematics), Professor, Irkutsk State University, 1, K. Marx st., Irkutsk, 664003, Russian Federation; Sobolev Institute of Mathematics SB RAS, 4, Koptyug pr., Novosibirsk, 630090, Russian Federation, tel.: +7 (3952) 521241, e-mail: andrei@baikal.ru

Denis Ponomaryov, Candidate of Sciences (Physics and Mathematics), Sobolev Institute of Mathematics SB RAS, 4, Koptyug pr., Novosibirsk, 630090, Russian Federation; Ershov Institute of Informatics Systems SB RAS, 6, Lavrentyev pr., Novosibirsk, 630090, Russian Federation; Novosibirsk State University, 1, Pirogov st., Novosibirsk, 630090, Russian Federation, tel.: +7 (383) 3306660, e-mail: ponom@iis.nsk.su

Received 15.11.19

