On the equivalence-checking problem for polysemantic models of sequential programs
Ivan M. Zakharyaschev, Vladimir A. Zakharov
Abstract. We introduce a new propositional model of computation for sequential computer programs. A distinctive feature of this model is that program runs and the results of computations are defined by means of two independent operational semantics. One of them can be regarded as an internal semantics that is used for routing runs in the control-flow graph of a program. The other one can be viewed as an observational semantics which is used for interpreting the results of a program execution. We show that some conventional models of sequential and recursive programs can be embedded into our model. We consider the equivalence-checking problem for the presented model and develop a uniform approach to the design of efficient equivalence-checking algorithms.
1. Introduction
The investigation of the equivalence-checking problem is of great importance in program optimization, maintenance and understanding. Although it is hardly possible to formalize completely the term “to understand the meaning of a program”, we can nevertheless estimate the extent of our understanding using the following criterion: the meaning of a given program is understood if we can distinguish it from programs with a different meaning. By defining an equivalence relation on programs in such a way that programs with the same meaning are equivalent, we face the need to consider the equivalence-checking problem. Usually, the meaning of a program is specified by its observable behavior. In the case of sequential programs, with every program $\pi$ we can associate the input-output relation $R_\pi$ computed by the program and consider this relation as the observable behavior of the program. Thus, the equivalence-checking problem is to verify whether two given sequential programs compute the same input-output relation.
As follows from the well-known Rice-Uspensky Theorem [21, 26], the equivalence-checking problem defined above is undecidable for any programming system $\mathcal{S}$ satisfying the following conditions:
1) any recursive input-output relation $R$ is computable by some program $\pi$ from $\mathcal{S}$, i.e., $R = R_\pi$;
2) there exists a program $\mathcal{U}$ which can simulate any program from $\mathcal{S}$, i.e., $R_{\mathcal{U}}(\pi, in, out) = R_\pi(in, out)$ for every $\pi \in \mathcal{S}$;
3) any programming system $\mathcal{S}'$ satisfying 1) and 2) above can be effectively translated into $\mathcal{S}$, i.e., there exists a recursive function $f : \mathcal{S}' \to \mathcal{S}$ such that $R_{\pi'} = R_{f(\pi')}$ for every $\pi' \in \mathcal{S}'$.
It is worth noting that all programming systems used in practice comply with these three conditions. To obtain positive results (effective criteria, semidecision procedures, etc.) for the equivalence-checking problem, one can be guided by the following approach. Given a programming system $\mathcal{S}$, consider a model of computation $S$ which has the same syntax but a simpler semantics. Based on this semantics, define an equivalence relation $\sim$ on programs. We say that $S$ approximates $\mathcal{S}$ if $\pi_1 \sim \pi_2$ implies $R_{\pi_1} = R_{\pi_2}$ for every pair $\pi_1, \pi_2$ of programs. If $S$ approximates $\mathcal{S}$, it suffices to check $\pi_1 \sim \pi_2$ to be certain that $R_{\pi_1} = R_{\pi_2}$. If the equivalence-checking problem “$\pi_1 \sim \pi_2$?” is decidable in $S$, then a decision procedure for $S$ can be used for checking program equivalence in $\mathcal{S}$.
This approach goes back to the seminal papers by Lyapunov and Yanov [9, 27]. It was further developed and studied in detail in [4, 8, 13]. A whole spectrum of computational models that can be used for approximating the behavioral equivalence was introduced for sequential programs [10, 13, 19], functional programs [1, 3, 5], and parallel and distributed programs [7]. Lattice-theoretic properties of the approximation relation on such models were studied in [18]. The utility of this approach is strengthened by the design of efficient equivalence-checking algorithms for many approximating models [1, 7, 11, 15, 20, 22, 23, 24, 25, 28, 29, 30].
Usually the semantics of approximating models of computation is defined in terms of transition systems (Kripke models) $M = \langle S, R, p \rangle$ (see [6]), where $S$ is a state space whose elements are associated with data states of programs, $R$ is a transition relation which interprets program statements, and $p$ is an evaluation of predicates in branching statements. A run of a program $\pi$ on $M$ is defined as a double route $(w_\pi, w_S)$, where $w_\pi$ is a route in the control flow graph of $\pi$, and $w_S$ is the corresponding route in the state space $S$. A program builds both routes in the framework of a single operational semantics. If a run terminates then the final data state $s$ reached by $w_S$ is accepted as the result of the run. However, there are cases when such models are not suitable for capturing some features of program computations. For example, we may assume that the program execution allows more than one output along its run. Then the result
of computation is specified not only by its final data state, but also by some intermediate data states traversed by this run in S. Moreover, in functional programming (see [2, 17]), an execution of a program may consist of the interleaving of explicit data transformations (numerical computation steps) and transformations of function terms (symbolic or “lazy” computation steps). The latter requires an alternative operational semantics which is defined in terms of rewriting rules. Thus, in studying the equivalence-checking problem by means of approximating models there are cases when it is suitable to deal with models of programs supplied with different semantics.
In this paper we introduce a new propositional model of computation for sequential computer programs. A distinctive feature of this model is that program runs and the results of computations are defined by means of two independent operational semantics. One of them can be regarded as an internal semantics that is used for routing runs in the control-flow graph of a program. The other one can be viewed as an observational semantics which is used for interpreting the results of a program execution. We show that some conventional models of sequential and recursive programs (Yanov schemes with input-output statements, linear recursive monadic schemes) can be embedded into our model. We consider the equivalence-checking problem for the presented model and develop a uniform approach to the design of efficient equivalence-checking algorithms.
The paper is organized as follows. In Section 2 we introduce the basic concepts of our model, the syntax and the semantics of generalized propositional sequential programs (GPSP). Both syntax and semantics of GPSPs are defined in terms of transition systems. In Section 3 we show that some known models of computation used in studying the equivalence-checking problem can be uniformly embedded into GPSP models. Finally, in Section 4 we present a uniform approach to the equivalence-checking problem for GPSP. This approach extends the criteria system techniques used in [28, 29] for designing efficient equivalence-checking algorithms.
2. Preliminaries
We begin by defining the syntax and the semantics of generalized propositional sequential programs (GPSP).
2.1. Syntax of GPSPs
Fix two finite alphabets $A = \{a_1, \dots, a_N\}$, $B = \{p_1, \dots, p_K\}$ and an infinite alphabet $\mathcal{P} = \{F_1, F_2, \dots\}$.
The elements of $A$ are called basic actions. Intuitively, basic actions stand for elementary built-in procedures. A finite sequence of basic actions is called a basic term. The set of all basic terms is denoted by $A^*$. We write $\lambda$ for the empty sequence of actions and call it the empty term. As usual, we denote by $|t|$ the length of a term $t$, and by $t_1 t_2$ the concatenation of $t_1$ and $t_2$. We also have a basic action stop not included in $A$; it corresponds to the statement that terminates each computation of a program.
The elements of $B$ are called basic predicates. Each basic predicate stands for an elementary built-in relation on program data. A tuple $\langle \alpha_1, \dots, \alpha_K \rangle$ of truth-values of basic predicates is called a condition. The set of all conditions is denoted by $C$. We write $c_1, c_2, \dots$ for generic elements from $C$. Since the set of primitive relations used in programs is finite and fixed, the internal structure of conditions is of no importance.
The elements of $\mathcal{P}$ are called procedures. Depending on the type of programming system (imperative or functional) whose programs are approximated by GPSP, procedures may be thought of either as points in control flow graphs of imperative programs, or as names of procedures and functions defined in recursive programs.
Definition 1. By a (deterministic) generalized propositional sequential program (GPSP, for short) over sets $A$, $C$, $\mathcal{P}$ we mean a finite labeled transition system $\pi = \langle \mathcal{P}_\pi, entry, exit, T, B \rangle$, where
• $\mathcal{P}_\pi$ denotes the set of procedures used in $\pi$;
• entry is the initial point of the program;
• exit is the terminal point of the program;
• $T : (\mathcal{P}_\pi \cup \{entry\}) \times C \to (\mathcal{P}_\pi \cup \{exit\})$ is a transition function;
• $B : (\mathcal{P}_\pi \cup \{entry\}) \times C \to A^*$ is a binding function.
A transition function represents the control flow of a program, whereas a binding function associates with each transition a block of basic actions. Given a sequence of conditions $c_1, c_2, \dots, c_{n-1}$, we say that a sequence of procedures $F_1, F_2, \dots, F_n$ is a $(c_1, c_2, \dots, c_{n-1})$-trace from $F_1$ to $F_n$ in a program $\pi$ if $F_1 \in \mathcal{P}_\pi \cup \{entry\}$ and $F_{i+1} = T(F_i, c_i)$ for every $i$, $1 \le i < n$. This means that $F_1, F_2, \dots, F_n$ is a trace routed by the conditions $c_1, c_2, \dots, c_{n-1}$ in the transition system. If $F_1 = entry$ and $F_n = exit$ then the trace is called complete. We extend the binding function to the traces of a GPSP $\pi$ by assuming that $B(F_1, c_1, c_2, \dots, c_{n-1}) = B(F_1, c_1) B(F_2, c_2) \cdots B(F_{n-1}, c_{n-1})$. By the size $|\pi|$ of $\pi$ we mean the number $|\mathcal{P}_\pi| + \sum_{F \in \mathcal{P}_\pi,\, c \in C} |B(F, c)|$.
Given a GPSP $\pi$ and two procedures $F' \in \mathcal{P}_\pi \cup \{entry\}$, $F'' \in \mathcal{P}_\pi \cup \{exit\}$, we say that $F'$ refers to $F''$ if there exists a trace from $F'$ to $F''$. A procedure $F$ in $\mathcal{P}_\pi$ is called
• self-referenced if $F$ refers to itself;
• marginal if $F$ does not refer to any self-referenced procedure in $\pi$;
• pre-marginal if $F$ is non-marginal, but there exists a condition $c$ such that $T(F, c)$ is a marginal procedure;
• terminated if $F$ refers to exit.
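The four classes above can be computed by plain graph reachability. The following is a minimal sketch, not taken from the paper: the dictionary encoding of $T$, the names `refers_to` and `classify`, and the example program are illustrative assumptions. It also assumes that "refers to" requires a trace with at least one transition, and that a transition to exit counts as a transition to a marginal point.

```python
from collections import deque

EXIT = "exit"

def refers_to(T, start):
    """Points reachable from `start` by a trace with at least one transition."""
    seen, queue = set(), deque([start])
    while queue:
        f = queue.popleft()
        for (g, _c), h in T.items():
            if g == f and h not in seen:
                seen.add(h)
                queue.append(h)
    return seen

def classify(T, procedures):
    """Return (self_referenced, marginal, pre_marginal, terminated)."""
    refs = {f: refers_to(T, f) for f in procedures}
    self_referenced = {f for f in procedures if f in refs[f]}
    marginal = {f for f in procedures if not (refs[f] & self_referenced)}
    # Assumption: exit is treated like a marginal target.
    marginal_targets = marginal | {EXIT}
    pre_marginal = {
        f for f in procedures
        if f not in marginal
        and any(h in marginal_targets for (g, _c), h in T.items() if g == f)
    }
    terminated = {f for f in procedures if EXIT in refs[f]}
    return self_referenced, marginal, pre_marginal, terminated

# Hypothetical program over conditions {0, 1}: F loops, G leaves, H never leaves.
T_example = {
    ("entry", 0): "F", ("entry", 1): "F",
    ("F", 0): "F", ("F", 1): "G",
    ("G", 0): EXIT, ("G", 1): EXIT,
    ("H", 0): "H", ("H", 1): "H",
}
classes = classify(T_example, {"F", "G", "H"})
```

In this example `F` is self-referenced and pre-marginal, `G` is marginal, and `H` is self-referenced but not terminated.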
2.2. Dynamic frames and models
The semantics of programs is defined by means of dynamic Kripke structures (frames and models) (see [6]).
Definition 2. A dynamic deterministic frame (or simply a frame) over the set of basic actions $A$ is a triple of the form $\mathcal{F} = \langle S, s_0, Q \rangle$, where
• $S$ is a non-empty set of data states,
• $s_0$ is the initial state, $s_0 \in S$,
• $Q : S \times (A \cup \{stop\}) \to S$ is an updating function.
For all $a \in A$, $s \in S$, the state $Q(s, a)$ is interpreted as the result of application of the action $a$ to the data state $s$. The updating function $Q$ can be naturally extended to the set $A^*$ of basic terms: $Q^*(s, \lambda) = s$, $Q^*(s, ta) = Q(Q^*(s, t), a)$. A state $s''$ is said to be reachable from $s'$ if $s'' = Q^*(s', t)$ for some $t \in A^*$ (notation: $s' \sqsubseteq s''$). We also write $s' \sqsubset s''$ if $s'' = Q^*(s', t)$ for some $t \in A^*$, $t \ne \lambda$. If $\sqsubseteq$ is a partial order on $S$, then the frame $\mathcal{F}$ is called ordered.
Denote by $[t]_{\mathcal{F}}$ the state $s = Q^*(s_0, t)$ reachable from the initial state by means of a basic term $t$. As usual, the subscript $\mathcal{F}$ will be omitted when the frame is understood. Since we will deal only with data states reachable from the initial state, it is assumed that every state $s \in S$ is reachable from the initial state $s_0$, i.e., $S = \{[t] : t \in A^*\}$.
A frame $\mathcal{F}_s = \langle S', s, Q' \rangle$ is said to be a subframe of $\mathcal{F} = \langle S, s_0, Q \rangle$ induced by a state $s \in S$ if $S' = \{Q^*(s, t) : t \in A^*\}$ and $Q'$ is the restriction of $Q$ to $S'$. A frame $\mathcal{F}$ is called
• a semigroup if $\mathcal{F}$ can be mapped homomorphically onto every subframe $\mathcal{F}_s$;
• universal if $[t_1] = [t_2]$ implies $t_1 = t_2$ for every pair $t_1, t_2 \in A^*$.
Taking the initial state $s_0 = [\lambda]$ for the unit, one may regard a semigroup frame $\mathcal{F}$ as a finitely generated monoid $\langle S, * \rangle$ such that $[t_1] * [t_2] = [t_1 t_2]$. In what follows we will say that the frame $\mathcal{F}$ is associated with this monoid. Clearly, the universal frame $\mathcal{U}$ is associated with the free monoid on $A$. If $\mathcal{F}$ is an ordered semigroup frame then the unit element $[\lambda]$ is irreducible, i.e., $[\lambda] = [t_1 t_2]$ implies $t_1 = t_2 = \lambda$.
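To make the notions of frame, extended updating function and reachability concrete, here is a small sketch of two semigroup frames: the universal frame (free monoid) and the frame associated with the free commutative monoid. The state encodings and the names `q_star`, `Q_free`, `Q_comm` and `reaches_comm` are illustrative assumptions, and the action stop is left out of the sketch.

```python
from collections import Counter

def q_star(Q, s, term):
    """Extend a one-step update Q to a basic term (a sequence of actions)."""
    for a in term:
        s = Q(s, a)
    return s

def Q_free(s, a):
    """Universal frame: a state is the tuple of actions applied so far."""
    return s + (a,)

def Q_comm(s, a):
    """Free commutative monoid: a state is the sorted multiset of actions."""
    return tuple(sorted(s + (a,)))

def reaches_comm(s1, s2):
    """In the commutative frame, s2 is reachable from s1 iff s1 is a sub-multiset of s2."""
    return not (Counter(s1) - Counter(s2))

s0 = ()
ab_free = q_star(Q_free, s0, "ab")   # the state [ab] in the universal frame
ba_free = q_star(Q_free, s0, "ba")   # the state [ba] in the universal frame
ab_comm = q_star(Q_comm, s0, "ab")   # the state [ab] in the commutative frame
ba_comm = q_star(Q_comm, s0, "ba")   # the state [ba] in the commutative frame
```

In the universal frame `ab_free` and `ba_free` differ, whereas in the commutative frame the two terms lead to the same state.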
Definition 3. A dynamic deterministic model for GPSP (or simply a GPSP-model) over the sets of basic actions $A$ and conditions $C$ is a triple $M_G = \langle \mathcal{F}, \mathcal{E}, \xi \rangle$ such that
• $\mathcal{F} = \langle S, s_0, Q \rangle$ and $\mathcal{E} = \langle R, r_0, P \rangle$ are frames over $A$,
• $\xi : S \to C$ is a valuation function indicating for every data state $s \in S$ a condition $c \in C$ which is satisfied at $s$.
2.3. Equivalence-checking problem for GPSPs
Definition 4. Let $\pi = \langle \mathcal{P}_\pi, entry, exit, T, B \rangle$ be some GPSP over the sets of basic actions $A$ and conditions $C$, and $M_G = \langle \mathcal{F}, \mathcal{E}, \xi \rangle$ be a GPSP-model based on frames $\mathcal{F} = \langle S, s_0, Q \rangle$ and $\mathcal{E} = \langle R, r_0, P \rangle$. Then a finite or infinite sequence of quadruples
$\rho = (F_1, c_1, s_1, r_1), (F_2, c_2, s_2, r_2), \dots, (F_i, c_i, s_i, r_i), \dots$ , (1)
where for every $i$, $i \ge 1$, $F_i \in \mathcal{P}_\pi \cup \{entry\}$, $c_i \in C$, $s_i \in S$, $r_i \in R$, is called a run of $\pi$ on $M_G$ if
1. $F_1 = entry$, $s_1 = [\lambda]_{\mathcal{F}}$, $r_1 = [\lambda]_{\mathcal{E}}$, $c_1 = \xi(s_1)$;
2. for every $i$, $i \ge 2$, one of the following alternatives holds:
• either $F_i = exit$ and $(F_i, c_i, s_i, r_i)$ is the last quadruple in (1),
• or $F_i \ne exit$ and
$F_{i+1} = T(F_i, c_i)$,
$s_{i+1} = [B(F_1, c_1, c_2, \dots, c_i)]_{\mathcal{F}}$, $r_{i+1} = [B(F_1, c_1, c_2, \dots, c_i)]_{\mathcal{E}}$, $c_{i+1} = \xi(s_{i+1})$.
If $\rho$ is finite and $(F_n, c_n, s_n, r_n)$ is its last element, we say that $\rho$ terminates having the state $r = P(r_n, stop) \in R$ as the result of the run $\rho$. If $\rho$ is an infinite sequence, we say that $\rho$ loops and has no results. Since GPSPs and frames under consideration are deterministic, every program $\pi$ has a unique run $\rho(\pi, M_G)$ on a given model $M_G$. We denote the result of $\rho(\pi, M_G)$ by $[\rho(\pi, M_G)]$, assuming that $[\rho(\pi, M_G)]$ is undefined if $\rho(\pi, M_G)$ loops.
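The interplay of the two semantics in a run can be illustrated with a small simulator. This is a sketch under stated assumptions, not the authors' construction: the function name `run_gpsp`, the dictionary encodings of $T$ and $B$, the step budget `max_steps` (used here only to cut off looping runs) and the example program are all hypothetical. The sketch applies the transition step from the first quadruple onward and updates both states incrementally, which gives the same states as applying $B(F_1, c_1, \dots, c_i)$ to the initial states. The internal frame routes the run, while the observational frame only accumulates the result.

```python
def run_gpsp(T, B, Q, P, xi, s0, r0, max_steps=1000):
    """Return P(r_n, stop) if the run reaches exit within max_steps, else None."""
    F, s, r = "entry", s0, r0
    for _ in range(max_steps):
        if F == "exit":
            return P(r, "stop")
        c = xi(s)                      # routing uses only the internal state s
        for a in B[(F, c)]:            # the same block of actions updates both frames
            s = Q(s, a)
            r = P(r, a)
        F = T[(F, c)]
    return None                        # budget exhausted: treated as a looping run

# Hypothetical example. Internal frame: free monoid on {a, b}.
Q = lambda s, a: s + (a,)
xi = lambda s: "lt" if len(s) < 3 else "ge"
# Observational frame: count the occurrences of 'a'; stop changes nothing.
P = lambda r, a: r + 1 if a == "a" else r

T = {("entry", "lt"): "F", ("entry", "ge"): "exit",
     ("F", "lt"): "F", ("F", "ge"): "exit"}
B = {("entry", "lt"): "a", ("entry", "ge"): "",
     ("F", "lt"): "a", ("F", "ge"): "b"}

result = run_gpsp(T, B, Q, P, xi, s0=(), r0=0)
```

Two programs with different internal behavior may still be equivalent as long as the observational states they produce coincide on every model.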
Let $\pi'$ and $\pi''$ be some GPSPs, $M_G$ a GPSP-model, and $\mathcal{F}$, $\mathcal{E}$ be frames. Then $\pi'$ and $\pi''$ are called
• equivalent on $M_G$ ($\pi' \sim_{M_G} \pi''$, in symbols) if $[\rho(\pi', M_G)] = [\rho(\pi'', M_G)]$, i.e., either both runs $\rho(\pi', M_G)$ and $\rho(\pi'', M_G)$ loop (and hence have no results) or both of them terminate with the same state $r$ as their results;
• equivalent on $\mathcal{F}$, $\mathcal{E}$ ($\pi' \sim_{\mathcal{F},\mathcal{E}} \pi''$, in symbols) if $\pi' \sim_{M_G} \pi''$ for every model $M_G = \langle \mathcal{F}, \mathcal{E}, \xi \rangle$ based on $\mathcal{F}$ and $\mathcal{E}$.
The equivalence-checking problem w.r.t. frames $\mathcal{F}$, $\mathcal{E}$ is to check, given an arbitrary pair $\pi'$, $\pi''$ of GPSPs, whether $\pi' \sim_{\mathcal{F},\mathcal{E}} \pi''$ holds. When the decidability and complexity aspects of the equivalence problem are concerned, the frames $\mathcal{F}$, $\mathcal{E}$ under consideration are assumed to be effectively characterized in logical or algebraic terms.
3. Embedding sequential and recursive programs into GPSPs
In this section we show that the computational model of generalized propositional sequential programs is sufficiently expressive for presenting uniformly the equivalence-checking problem for various models of computer programs. We consider two models of computer programs (sequential imperative programs with multiple outputs and linear recursive programs) and demonstrate the embedding of these models into GPSPs.
3.1. Sequential programs with multiple outputs
As was observed in Section 1, if more than one output statement is executed along a run of a sequential program, the result of computation is specified not by the final data state when the program terminates, but by the sequence of data states reached by the program after successive executions of output statements. Formally, the equivalence-checking problem for this class of programs can be defined as follows.
Suppose that the set of basic actions $A$ is split into disjoint sets $A_1$ and $A_2$. The actions from $A_1$ are used just to output current intermediate results, whereas those from $A_2$ are conventional actions whose execution is invisible to the external observer.
Definition 5. A deterministic propositional sequential program (PSP) over sets $A$, $C$, $\mathcal{P}$ is a finite labeled transition system $\pi = \langle \mathcal{P}_\pi, entry, exit, T, B \rangle$, where
• $\mathcal{P}_\pi$ denotes the set of program points in $\pi$;
• entry is the initial point of the program;
• exit is the terminal point of the program;
• $T : (\mathcal{P}_\pi \cup \{entry\}) \times C \to (\mathcal{P}_\pi \cup \{exit\})$ is a transition function;
• $B : \mathcal{P}_\pi \cup \{entry\} \to A$ is a binding function.
Let $\mathcal{F} = \langle S, s_0, Q \rangle$ be a frame and $\xi$ a valuation function on $\mathcal{F}$. Then the run of a PSP $\pi$ on the PSP-model $M = \langle \mathcal{F}, \xi \rangle$ is a finite or infinite sequence of triples
$\rho = (F_1, c_1, s_1), (F_2, c_2, s_2), \dots, (F_i, c_i, s_i), \dots$ , (2)
such that
1. $F_i \in \mathcal{P}_\pi \cup \{entry\}$, $c_i \in C$, $s_i \in S$ for every $i$, $i \ge 1$,
2. $F_1 = entry$, $c_1 = \xi(s_0)$, $s_1 = [\lambda]_{\mathcal{F}}$;
3. for every $i$, $i \ge 2$, one of the following alternatives holds:
• either $F_i = exit$ and $(F_i, c_i, s_i)$ is the last triple in (2),
• or $T(F_i, c_i) \ne exit$ and
$F_{i+1} = T(F_i, c_i)$, $s_{i+1} = Q(s_i, B(F_i))$, $c_{i+1} = \xi(s_{i+1})$.
If $\rho$ is finite and $(F_n, c_n, s_n)$ is its last element, we say that $\rho$ terminates. The result $r(\pi, M)$ of a terminating run $\rho$ of a PSP $\pi$ on a model $M$ is defined as follows. Let $i_1, i_2, \dots, i_k$ be the sequence of all indices such that $B(F_{i_j}) \in A_1$. Then $r(\pi, M) = (s_{i_1}, s_{i_2}, \dots, s_{i_k})$. If $\rho$ does not terminate then the result $r(\pi, M)$ is undefined.
The equivalence of PSPs $\pi'$ and $\pi''$ on models and frames is defined analogously to that of GPSPs.
Now we show that the equivalence-checking problem for PSPs with multiple outputs can be reduced to the equivalence-checking problem for GPSPs.
With every PSP $\pi = \langle \mathcal{P}_\pi, entry, exit, T, B \rangle$ we associate a GPSP $\widehat{\pi} = \langle \mathcal{P}_\pi, entry, exit, T, \widehat{B} \rangle$ such that $\widehat{B}(F, c) = B(F)$ for every procedure $F \in \mathcal{P}_\pi$ and every condition $c \in C$.
Given a PSP-model $M = \langle \mathcal{F}, \xi \rangle$ based on the frame $\mathcal{F} = \langle S, s_0, Q \rangle$, we consider a GPSP-model $M_G = \langle \mathcal{F}, \mathcal{E}, \xi \rangle$, where the frame $\mathcal{E} = \langle R, r_0, P \rangle$ is defined as follows:
1. $R$ is the set of all finite sequences $(s_1, s_2, \dots, s_k)$ of data states from $\mathcal{F}$;
2. the initial state $r_0$ is the sequence $(s_0)$;
3. for every state $r = (s_1, \dots, s_k)$ in $R$ and every action $a$ in $A \cup \{stop\}$
(a) $P(r, a) = (s_1, s_2, \dots, s_{k-1}, Q(s_k, a), Q(s_k, a))$ if $a \in A_1$;
(b) $P(r, a) = (s_1, s_2, \dots, s_{k-1}, Q(s_k, a))$ if $a \in A_2$;
(c) $P(r, stop) = (s_1, s_2, \dots, s_{k-1}, s_0)$.
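The updating function $P$ of the observational frame can be transcribed almost literally. The sketch below assumes that states of $R$ are Python tuples, that the data frame is the universal one (states are tuples of actions), and that `make_output_frame` and the action names are hypothetical. An output action freezes a copy of the current data state, a conventional action only advances the last component, and stop replaces the last component by $s_0$.

```python
def make_output_frame(Q, s0, output_actions):
    """Build (r0, P) for the frame of finite sequences of data states."""
    r0 = (s0,)

    def P(r, a):
        *prefix, last = r
        if a == "stop":                      # rule (c)
            return (*prefix, s0)
        nxt = Q(last, a)
        if a in output_actions:              # rule (a): a belongs to A_1
            return (*prefix, nxt, nxt)
        return (*prefix, nxt)                # rule (b): a belongs to A_2

    return r0, P

# Hypothetical data frame: the universal frame, states are tuples of actions.
Q = lambda s, a: s + (a,)
r0, P = make_output_frame(Q, s0=(), output_actions={"out"})

r = r0
for action in ["x", "out", "y", "stop"]:
    r = P(r, action)
```

After the four actions, `r` holds the data state recorded at the single output followed by `s0` in the last position, so actions executed after the last output leave no trace in the result.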
The following theorem shows that PSPs with multiple outputs can be embedded into GPSPs:
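Theorem 1. Let $\pi_1$ and $\pi_2$ be PSPs and $M = \langle \mathcal{F}, \xi \rangle$ a PSP-model. Consider the GPSPs $\widehat{\pi}_1$, $\widehat{\pi}_2$ and the GPSP-model $M_G = \langle \mathcal{F}, \mathcal{E}, \xi \rangle$ as defined above. Then
1. $\pi_1 \sim_M \pi_2 \iff \widehat{\pi}_1 \sim_{M_G} \widehat{\pi}_2$;
2. $\pi_1 \sim_{\mathcal{F}} \pi_2 \iff \widehat{\pi}_1 \sim_{\mathcal{F},\mathcal{E}} \widehat{\pi}_2$.
Moreover, if $\mathcal{F}$ is a semigroup frame then $\mathcal{E}$ is a semigroup frame as well.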
Thus, the equivalence-checking problem for sequential programs with multiple outputs can be reduced to the equivalence-checking problem for GPSPs without loss of specific algebraic features of semantics.
3.2. Linear recursive programs
Let $A$ and $\mathcal{P}$ be the sets of basic actions and procedures, respectively. By a term we mean any finite sequence of basic actions and procedures. A term $t$ is called linear if at most one procedure occurs in $t$ and the rightmost element of $t$ is a basic action. The set of all linear terms over $A \cup \mathcal{P}$ is denoted by LinTerm. We write $F \in t$ to indicate that a procedure $F$ occurs in $t$. If $t = a_1 a_2 \dots a_n$ then the term $a_n \dots a_2 a_1$ is called the reverse of $t$ and denoted by $t^{-1}$.
A definition of a procedure $F$ is an expression $D$ of the form
F = if $c^1$ then $t_1$ else
    if $c^2$ then $t_2$ else
    ...
    if $c^{l-1}$ then $t_{l-1}$ else $t_l$
where $t_i \in$ LinTerm, $c^i \in C = \{c^1, c^2, \dots, c^l\}$, $1 \le i \le l$. The definition above will also be written as
$F : (c^1, t_1), (c^2, t_2), \dots, (c^l, t_l)$ . (3)
The first occurrence of $F$ in $D$ is called the head of $D$, and the list of pairs $(c^1, t_1), (c^2, t_2), \dots, (c^l, t_l)$ is the body of $D$. For every pair $(c^i, t_i)$ in the body of $D$, the term $t_i$ is called a $c^i$-variant of the definition $D$.
Definition 6. A (deterministic) linear recursive program (LRP) over the sets $A$, $C$, $\mathcal{P}$ is a tuple $\pi = \langle G, D_1, D_2, \dots, D_n \rangle$, where
• $G \in$ LinTerm is the goal of the program;
• $D_1, D_2, \dots, D_n$ are definitions of pairwise different procedures $F_1, \dots, F_n$.
The set of procedures $\{F_1, \dots, F_n\}$ defined in an LRP $\pi$ is denoted by $\mathcal{P}_\pi$. Given a procedure $F$ in $\mathcal{P}_\pi$, we write $D_\pi(F)$ for the definition of $F$ in $\pi$, and $D_\pi(F, c)$ for the $c$-variant of $D_\pi(F)$. If a program is understood, the subscript $\pi$ will be omitted. It is also assumed that every procedure occurring in $\pi$ is defined in $\pi$.
The semantics of LRPs is defined by means of dynamic frames and models.
Definition 7. Let $\pi = \langle G, D_1, D_2, \dots, D_n \rangle$ be some LRP and $M = \langle \mathcal{F}, \xi \rangle$ a model based on a frame $\mathcal{F} = \langle S, s_0, Q \rangle$. A finite or infinite sequence of triples
$\rho = (t_1, s_1, c_1), (t_2, s_2, c_2), \dots, (t_i, s_i, c_i), \dots$ , (4)
where $t_i \in$ LinTerm, $s_i \in S$, $c_i \in C$, $i \ge 1$, is called a run of $\pi$ on $M$ if $t_1 = G$ and for every $i$, $i \ge 1$, one of the following conditions holds:
1. if $t_i$ is a basic term, then $s_i = Q^*(s_{i-1}, t_i^{-1})$, $c_i = \xi(s_i)$, and the triple $(t_i, s_i, c_i)$ is the last element of (4);
2. if $t_i$ is a non-basic term of the form $t_i = \tau F t$, where $F \in \mathcal{P}_\pi$, $t \in A^*$, then $s_i = Q^*(s_{i-1}, t^{-1})$, $c_i = \xi(s_i)$, $t_{i+1} = \tau D(F, c_i)$.
If $\rho$ is finite and the triple $(t_m, s_m, c_m)$ is its last element, we say that $\rho$ terminates with result $s_m$. If $\rho$ is an infinite sequence, we say that $\rho$ loops. Since LRPs and frames under consideration are deterministic, every program $\pi$ has a unique run $\rho(\pi, M)$ on a given model $M$. We denote by $[\rho(\pi, M)]$ the result of $\rho(\pi, M)$, assuming that $[\rho(\pi, M)]$ is undefined if $\rho(\pi, M)$ loops.
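The run of an LRP can be sketched as a loop that repeatedly evaluates the basic suffix of the current term and then unfolds the single procedure call. The encoding below is an assumption made for illustration: a linear term $\tau F t$ is stored as a triple `(tau, F, t)` of a string, a procedure name (or `None` for a basic term, in which case the whole term is `tau + t`) and a string. The names `run_lrp` and `max_steps`, and the example program, are hypothetical.

```python
def q_star(Q, s, term):
    """Apply the actions of a basic term one by one."""
    for a in term:
        s = Q(s, a)
    return s

def run_lrp(goal, D, Q, xi, s0, max_steps=1000):
    """Return the final data state, or None if the step budget runs out."""
    term, s = goal, s0
    for _ in range(max_steps):
        tau, F, t = term
        if F is None:                          # case 1: basic term, apply its reverse
            return q_star(Q, s, reversed(tau + t))
        s = q_star(Q, s, reversed(t))          # case 2: evaluate the suffix t reversed
        c = xi(s)
        tau2, F2, t2 = D[(F, c)]               # the c-variant of the definition of F
        term = (tau + tau2, F2, t2)            # next term: tau followed by D(F, c)
    return None

# Hypothetical example over the universal frame.
Q = lambda s, a: s + (a,)
xi = lambda s: "lt" if len(s) < 3 else "ge"
D = {("F", "lt"): ("a", "F", "b"),             # F = if lt then a F b else c
     ("F", "ge"): ("", None, "c")}
final_state = run_lrp(("x", "F", "g"), D, Q, xi, s0=())
```

The prefix `tau` grows with every unfolding and is only evaluated once the term becomes basic, which is the behavior the observational frame of the translation has to reproduce.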
The equivalence of LRPs $\pi'$ and $\pi''$ on models and frames is defined similarly to that of GPSPs.
Now we show that the equivalence-checking problem for LRPs can be reduced to the equivalence-checking problem for GPSPs.
First, we define a translation from LRPs into GPSPs. Given the set $A$ of basic actions for LRPs, we introduce a set $\widehat{A}$ of basic actions for GPSPs by taking $\widehat{A} = \{(\lambda, a) : a \in A\} \cup \{(a, \lambda) : a \in A\}$. For any pair of basic terms $t_1 = a_1 \dots a_k$ and $t_2 = a'_1 \dots a'_m$ over $A$ we denote by $(t_2, t_1)$ the term
$(\lambda, a_1) \dots (\lambda, a_k)(a'_1, \lambda) \dots (a'_m, \lambda)$.
Let $\pi = \langle G, D_1, D_2, \dots, D_n \rangle$ be an LRP over $A$, $C$, and $\mathcal{P}$. The corresponding GPSP $\widehat{\pi} = \langle \mathcal{P}_{\widehat{\pi}}, entry, exit, T, B \rangle$ is defined as follows:
1. $\mathcal{P}_{\widehat{\pi}} = \mathcal{P}_\pi$, $entry = G$;
2. for every procedure $F \in \mathcal{P}_\pi$ and every condition $c \in C$,
(a) if $D_\pi(F, c) = t$, where $t$ is a basic term in $A^*$, then $T(F, c) = exit$ and $B(F, c) = (\lambda, t)$;
(b) if $D_\pi(F, c) = t' F' t$, where $t'$, $t$ are basic terms in $A^*$ and $F'$ is a procedure in $\mathcal{P}_\pi$, then $T(F, c) = F'$ and $B(F, c) = (t', t)$.
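Rule 2 of the translation is purely syntactic and can be sketched in a few lines. The encoding is an assumption: a variant $t' F' t$ is stored as `(t_prime, F_prime, t)`, a basic variant $t$ as `("", None, t)`, and the empty term as the empty string. The names `pair_term` and `translate` are hypothetical, and the treatment of the goal term (the entry point) is not covered by this sketch.

```python
LAMBDA = ""     # stands for the empty term
EXIT = "exit"

def pair_term(t2, t1):
    """The term (t2, t1): first (lambda, a) for each a in t1, then (a', lambda) for each a' in t2."""
    return [(LAMBDA, a) for a in t1] + [(a, LAMBDA) for a in t2]

def translate(D):
    """Map LRP definitions to the transition function T and binding function B of a GPSP."""
    T, B = {}, {}
    for (F, c), (t_prime, F_prime, t) in D.items():
        T[(F, c)] = EXIT if F_prime is None else F_prime   # rules (a) and (b)
        B[(F, c)] = pair_term(t_prime, t)
    return T, B

# Hypothetical definition: F = if c1 then a F b else ab
D = {("F", "c1"): ("a", "F", "b"),
     ("F", "c2"): ("", None, "ab")}
T, B = translate(D)
```

For the recursive variant the binding lists the suffix action as `(lambda, b)` and the pending prefix action as `(a, lambda)`; for the basic variant all actions appear in the form `(lambda, a)`.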
Next we relate dynamic models for LRP with GPSP-models. Let $M = \langle \mathcal{F}, \xi \rangle$ be a dynamic model over the set of basic actions $A$ and conditions $C$. Then the corresponding GPSP-model $M_G = \langle \widehat{\mathcal{F}}, \mathcal{E}, \xi \rangle$ is obtained from $M$ by adapting the updating function $Q$ to the basic actions from $\widehat{A}$:
$\widehat{Q}(s, (t', t)) = Q^*(s, t)$
and by adding to $M$ a frame $\mathcal{E} = \langle R, r_0, P \rangle$ such that
1. $R = (A^* \cup \{stop\}) \times S$;
2. $r_0 = (\lambda, s_0)$ is the initial state;
3. the updating function $P : R \times (\widehat{A} \cup \{stop\}) \to R$ is defined for each $r = (t, s)$ in $R$ by the following equalities:
(a) $P(r, stop) = (stop, Q^*(s, t))$;
(b) $P(r, (a, \lambda)) = (at, s)$;
(c) $P(r, (\lambda, a)) = (t, Q(s, a))$.
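The adapted updating function and the three equalities for $P$ can be transcribed as follows. This is a sketch under assumptions: pending terms are Python tuples, paired actions are 2-tuples with the empty string standing for $\lambda$, the data frame in the example is the universal one, and `make_lrp_frames` is a hypothetical name. The observational state keeps the pending prefix actions in its first component and applies them to the data state only when stop arrives.

```python
LAMBDA, STOP = "", "stop"

def q_star(Q, s, term):
    """Apply the actions of a basic term one by one."""
    for a in term:
        s = Q(s, a)
    return s

def make_lrp_frames(Q, s0):
    """Return (Q_hat, r0, P) for the GPSP-model built from the frame (S, s0, Q)."""
    def Q_hat(s, action):
        _left, right = action
        return s if right == LAMBDA else Q(s, right)   # only (lambda, a) changes the data state

    r0 = ((), s0)

    def P(r, action):
        t, s = r
        if action == STOP:                   # equality (a)
            return (STOP, q_star(Q, s, t))
        left, right = action
        if right == LAMBDA:                  # equality (b): action (a, lambda)
            return ((left,) + t, s)
        return (t, Q(s, right))              # equality (c): action (lambda, a)

    return Q_hat, r0, P

# Hypothetical data frame: the universal frame.
Q = lambda s, a: s + (a,)
Q_hat, r0, P = make_lrp_frames(Q, s0=())

r = r0
for action in [(LAMBDA, "b"), ("a", LAMBDA), ("c", LAMBDA), STOP]:
    r = P(r, action)
```

Because each new prefix action is placed in front of the pending term, the actions pushed later are applied earlier when stop is processed.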
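The following theorem shows that LRPs can be embedded into GPSPs:
Theorem 2. Let $\pi_1$ and $\pi_2$ be LRPs, and $M = \langle \mathcal{F}, \xi \rangle$ a dynamic model. Consider the GPSPs $\widehat{\pi}_1$, $\widehat{\pi}_2$ and the GPSP-model $M_G = \langle \widehat{\mathcal{F}}, \mathcal{E}, \xi \rangle$ as defined above. Then
1. $\pi_1 \sim_M \pi_2 \iff \widehat{\pi}_1 \sim_{M_G} \widehat{\pi}_2$;
2. $\pi_1 \sim_{\mathcal{F}} \pi_2 \iff \widehat{\pi}_1 \sim_{\widehat{\mathcal{F}},\mathcal{E}} \widehat{\pi}_2$.
Moreover, if $\mathcal{F}$ is a semigroup frame then $\mathcal{E}$ is a semigroup frame as well.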
Thus, the equivalence-checking problem for linear recursive programs can be reduced to the equivalence-checking problem for GPSPs without loss of specific semantic features.
4. How to design polynomial-time equivalence-checking algorithms for GPSPs
In this section we present an approach to the design of efficient equivalence-checking algorithms for GPSPs w.r.t. some ordered semigroup frames. Its key idea is as follows. Given frames $\mathcal{F}$, $\mathcal{E}$ and a pair of programs $\pi_1$, $\pi_2$, we first choose some specific semigroups $U$ and $V$ to encode all pairs of states $(s', s'')$ in $\mathcal{F}$ and $(r', r'')$ in $\mathcal{E}$. This encoding is intended to estimate the extent to which the intermediate data states of program runs have diverged up to a given moment. Then, using this encoding, we construct a graph structure $\Gamma_{\pi_1,\pi_2}$ to represent all pairs of runs $\rho(\pi_1, M)$, $\rho(\pi_2, M)$ of the programs $\pi_1$, $\pi_2$ on the models based on the frames $\mathcal{F}$ and $\mathcal{E}$. We show that to check the equivalence of $\pi_1$ and $\pi_2$ we only need to analyze a fragment of $\Gamma_{\pi_1,\pi_2}$ whose size is polynomial in $|\pi_1|$ and $|\pi_2|$. The construction of $\Gamma_{\pi_1,\pi_2}$ involves solutions to the reachability problem “$s' \sqsubseteq s''$?” for the frame $\mathcal{F}$ and the identity problem “$w' = w''$?” for the semigroups $U$ and $V$. If these problems are decidable in polynomial time, the equivalence-checking problem for GPSPs w.r.t. $\mathcal{F}$, $\mathcal{E}$ is decidable in polynomial time as well. Using this technique, we demonstrate that the equivalence-checking problem for LRPs w.r.t. the frames associated with free commutative monoids is decidable in polynomial time.
Suppose that $U$ is a finitely generated monoid, and $u^+$, $u^*$ are distinguished elements in $U$. Denote by $\circ$ and $e$ the binary operation on $U$ and the unit of $U$, respectively.
Definition 8. The triple $K = \langle U, u^+, u^* \rangle$ is said to be a criteria system for a semigroup frame $\mathcal{F} = \langle S, s_0, Q \rangle$ if $K$ and $\mathcal{F}$ meet the following requirements:
(R1) there exists a homomorphism $\varphi$ of $S \times S$ into $U$ such that $[t_1] = [t_2] \iff u^+ \circ \varphi(\langle [t_1], [t_2] \rangle) \circ u^* = e$ holds for every pair $t_1$, $t_2$ in $A^*$;
(R2) for every element $u$ in $U \circ u^*$ the equation $X \circ u = e$ has at most one solution $X$ in the coset $u^+ \circ U$.
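Let $\mathcal{F} = \langle S, s_0, Q \rangle$ and $\mathcal{E} = \langle R, r_0, P \rangle$ be semigroup frames, and $\mathcal{F}$ an ordered frame. Suppose that $K_{\mathcal{F}} = \langle U, u^+, u^* \rangle$ and $K_{\mathcal{E}} = \langle V, v^+, v^* \rangle$ are criteria systems for these frames such that $\varphi : S \times S \to U$ and $\psi : R \times R \to V$ are the required homomorphisms. We assume that the coset $u^+ \circ U$ is divided into four disjoint sets $U_= = \{u^+ \circ \varphi(\langle s, s \rangle)\}$, $U_< = \{u^+ \circ \varphi(\langle s', s'' \rangle) : s' \sqsubset s''\}$, $U_> = \{u^+ \circ \varphi(\langle s', s'' \rangle) : s'' \sqsubset s'\}$ and $U_\emptyset = (u^+ \circ U) - (U_= \cup U_< \cup U_>)$. Since $\mathcal{F}$ is an ordered semigroup frame, checking the reachabilities $s' \sqsubseteq s''$ and $s'' \sqsubseteq s'$ suffices to decide which of these classes contains $u^+ \circ \varphi(\langle s', s'' \rangle)$.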
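The four-way classification depends only on two reachability checks. The following sketch illustrates this for the frame associated with a free commutative monoid, where states are encoded (by assumption) as sorted tuples of actions and reachability is multiset inclusion. The function names and the string labels for the classes are hypothetical.

```python
from collections import Counter

def reaches_comm(s1, s2):
    """Reachability in the free commutative frame: s1 is a sub-multiset of s2."""
    return not (Counter(s1) - Counter(s2))

def classify_pair(s1, s2, reaches):
    """Decide which class the code of the pair (s1, s2) falls into."""
    forward, backward = reaches(s1, s2), reaches(s2, s1)
    if forward and backward:
        return "="              # equal states, since reachability is a partial order
    if forward:
        return "<"              # s2 is strictly reachable from s1
    if backward:
        return ">"              # s1 is strictly reachable from s2
    return "incomparable"       # the remaining class of the coset

labels = [
    classify_pair(("a",), ("a",), reaches_comm),
    classify_pair(("a",), ("a", "b"), reaches_comm),
    classify_pair(("a", "b"), ("a",), reaches_comm),
    classify_pair(("a",), ("b",), reaches_comm),
]
```

With a polynomial-time reachability test, each node of the graph can therefore be assigned to its class in polynomial time.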
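Given a pair of GPSPs $\pi_1$, $\pi_2$ such that $\mathcal{P}_{\pi_1} \cap \mathcal{P}_{\pi_2} = \emptyset$, we define a rooted labeled directed graph $\Gamma_{\pi_1,\pi_2}$ as follows.
The nodes of $\Gamma_{\pi_1,\pi_2}$ are quadruples $(F_1, F_2, u, v)$ such that $F_1$ and $F_2$ are procedures from $\pi_1$ and $\pi_2$, respectively, and $u$, $v$ are elements from the cosets $u^+ \circ U$ and $v^+ \circ V$, respectively.
The root of $\Gamma_{\pi_1,\pi_2}$ is the node $w_0 = (entry, entry, u^+, v^+)$.
The arcs of $\Gamma_{\pi_1,\pi_2}$ are marked with pairs $(c_1, c_2)$ in $(C \cup \{\varepsilon\}) \times (C \cup \{\varepsilon\})$. The arcs connect nodes in $\Gamma_{\pi_1,\pi_2}$ according to the following rules. Consider an arbitrary node $w = (F_1, F_2, u, v)$ in $\Gamma_{\pi_1,\pi_2}$.
1. If $u \in U_\emptyset$ and $F_1, F_2 \ne exit$ then for every pair $(c_1, c_2) \in C \times C$ the arc marked with $(c_1, c_2)$ leads from $w$ to $w' = (F_1', F_2', u', v')$ such that $F_1' = T_{\pi_1}(F_1, c_1)$, $F_2' = T_{\pi_2}(F_2, c_2)$, $u' = u \circ \varphi(\langle [t_1]_{\mathcal{F}}, [t_2]_{\mathcal{F}} \rangle)$, $v' = v \circ \psi(\langle [t_1]_{\mathcal{E}}, [t_2]_{\mathcal{E}} \rangle)$, where $t_1 = B_{\pi_1}(F_1, c_1)$ and $t_2 = B_{\pi_2}(F_2, c_2)$.
2. If $u \in U_>$ or $F_1 = exit$ then for every $c \in C$ the arc marked with $(\varepsilon, c)$ leads from $w$ to the node $w' = (F_1, F_2', u', v')$ such that $F_2' = T_{\pi_2}(F_2, c)$, $u' = u \circ \varphi(\langle [\lambda]_{\mathcal{F}}, [t_2]_{\mathcal{F}} \rangle)$, $v' = v \circ \psi(\langle [\lambda]_{\mathcal{E}}, [t_2]_{\mathcal{E}} \rangle)$, where $t_2 = B_{\pi_2}(F_2, c)$.
3. If $u \in U_<$ or $F_2 = exit$ then for every $c \in C$ the arc marked with $(c, \varepsilon)$ leads from $w$ to the node $w' = (F_1', F_2, u', v')$ such that $F_1' = T_{\pi_1}(F_1, c)$, $u' = u \circ \varphi(\langle [t_1]_{\mathcal{F}}, [\lambda]_{\mathcal{F}} \rangle)$, $v' = v \circ \psi(\langle [t_1]_{\mathcal{E}}, [\lambda]_{\mathcal{E}} \rangle)$, where $t_1 = B_{\pi_1}(F_1, c)$.
4. If $u \in U_=$ and $F_1, F_2 \ne exit$ then for every $c \in C$ the arc marked with $(c, c)$ leads from $w$ to the node $w' = (F_1', F_2', u', v')$ such that $F_1' = T_{\pi_1}(F_1, c)$, $F_2' = T_{\pi_2}(F_2, c)$, $u' = u \circ \varphi(\langle [t_1]_{\mathcal{F}}, [t_2]_{\mathcal{F}} \rangle)$, $v' = v \circ \psi(\langle [t_1]_{\mathcal{E}}, [t_2]_{\mathcal{E}} \rangle)$, where $t_1 = B_{\pi_1}(F_1, c)$ and $t_2 = B_{\pi_2}(F_2, c)$.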
The directed paths in $\Gamma_{\pi_1,\pi_2}$ encode all possible pairs of runs $\rho(\pi_1, M)$, $\rho(\pi_2, M)$ of the GPSPs $\pi_1$ and $\pi_2$ on the models $M$ based on the frames $\mathcal{F}$ and $\mathcal{E}$. The main characteristic feature of the graph $\Gamma_{\pi_1,\pi_2}$ is presented in the following lemma:
Lemma 1. Suppose $w_0, w_1, \dots, w_m$, $m \ge 1$, is a finite sequence of nodes in $\Gamma_{\pi_1,\pi_2}$ such that $w_0$ is the root of $\Gamma_{\pi_1,\pi_2}$ and $w_j = (F_1^j, F_2^j, u^j, v^j)$, $1 \le j \le m$. Then this sequence of nodes forms a path in $\Gamma_{\pi_1,\pi_2}$ whose arcs are marked with the pairs $(c_1^1, c_2^1), \dots, (c_1^m, c_2^m)$ iff there exists a GPSP-model $M$ based on the frames $\mathcal{F}$ and $\mathcal{E}$ such that each of the runs $\rho(\pi_i, M)$, $i = 1, 2$, of the programs $\pi_1$ and $\pi_2$ has a prefix of the form
$(entry, c_i^{j_1}, s_i^{j_1}, r_i^{j_1}), (F_i^{j_1}, c_i^{j_2}, s_i^{j_2}, r_i^{j_2}), \dots, (F_i^{j_{k-1}}, c_i^{j_k}, s_i^{j_k}, r_i^{j_k})$,
where $c_i^{j_1}, c_i^{j_2}, \dots, c_i^{j_k}$ is the subsequence of all those elements in $c_i^1, c_i^2, \dots, c_i^m$ that are different from $\varepsilon$. Moreover, for every $j$, $1 \le j \le m$, these prefixes satisfy the following requirements:
$u^j = u^+ \circ \varphi(\langle s_1^{l_1}, s_2^{l_2} \rangle)$,
$v^j = v^+ \circ \psi(\langle r_1^{l_1}, r_2^{l_2} \rangle)$,
where $l_i = \max\{l : l \le j,\ l \in \{j_1, j_2, \dots, j_k\}\}$, $i = 1, 2$.
Proof. By induction on $m$, using the definition of $\Gamma_{\pi_1,\pi_2}$. □
A node $w$ in $\Gamma_{\pi_1,\pi_2}$ is said to be a 0-rejecting node if it satisfies one of the following conditions:
1. $w = (exit, exit, u, v)$ and $v \circ v^* \ne e$;
2. $w = (F_1, F_2, u, v)$ is such that one of the procedures $F_1$, $F_2$ is marginal, whereas the other one is non-marginal.
Clearly, given a decision procedure for the identity problem “$v' = v''$?” on $V$, it is easy to check whether $w$ is a 0-rejecting node.
A node $w_0$ in $\Gamma_{\pi_1,\pi_2}$ is said to be a 1-rejecting node if there exists an infinite path $w_0, w_1, \dots, w_n, \dots$ in $\Gamma_{\pi_1,\pi_2}$ which starts at $w_0$ and satisfies one of the following conditions:
1. almost all nodes $w_n = (F_1, F_2, u, v)$ in this path are such that $u \in U_<$ and $F_2$ is a terminated procedure;
2. almost all nodes $w_n = (F_1, F_2, u, v)$ in this path are such that $u \in U_>$ and $F_1$ is a terminated procedure.
Lemma 2. $\pi_1 \sim_{\mathcal{F},\mathcal{E}} \pi_2$ iff no rejecting nodes are accessible from the root of $\Gamma_{\pi_1,\pi_2}$.
Proof. Follows from Lemma 1 and requirement (R1). If a 1-rejecting node is accessible from the root of $\Gamma_{\pi_1,\pi_2}$ then there is a GPSP-model $M$ such that one of the runs $\rho(\pi_1, M)$, $\rho(\pi_2, M)$ terminates, whereas the other loops. If a node $w = (F_1, F_2, u, v)$ is accessible from the root of $\Gamma_{\pi_1,\pi_2}$ and one of the procedures, say $F_1$, is marginal, whereas the other ($F_2$) is non-marginal, then there is a GPSP-model $M$ such that the run $\rho(\pi_1, M)$ terminates and the run $\rho(\pi_2, M)$ loops. If a node $w = (exit, exit, u, v)$ is accessible from the root of $\Gamma_{\pi_1,\pi_2}$ and $v \circ v^* \ne e$ then there is a GPSP-model $M$ such that both runs $\rho(\pi_1, M)$, $\rho(\pi_2, M)$ terminate but $[\rho(\pi_1, M)] \ne [\rho(\pi_2, M)]$. □
Lemma 3. Suppose that both procedures $F_1$, $F_2$ are terminated and two different nodes $w' = (F_1, F_2, u, v')$, $w'' = (F_1, F_2, u, v'')$ are accessible from the root of $\Gamma_{\pi_1,\pi_2}$. Suppose also that neither $w'$ nor $w''$ is a 1-rejecting node. Then some 0-rejecting node is accessible from the root of $\Gamma_{\pi_1,\pi_2}$.
Proof. By Lemma 1, each path in $\Gamma_{\pi_1,\pi_2}$ is associated with the pair of (prefixes of) runs $\rho(\pi_1, M)$, $\rho(\pi_2, M)$. Since $F_1$ is a terminated procedure and $\mathcal{F}$ is an ordered frame, we may assume that $\rho(\pi_1, M)$ terminates. Since $w'$ and $w''$ are not 1-rejecting nodes, this means that two different nodes of the form $w_1' = (exit, G_2, u_1, v_1')$ and $w_1'' = (exit, G_2, u_1, v_1'')$ are accessible from $w'$ and $w''$ respectively. The requirement (R2) of the criteria system $K_{\mathcal{E}}$ guarantees that $v_1' \ne v_1''$. If $G_2$ is non-marginal then each of the nodes $w_1'$ and $w_1''$ is 0-rejecting. Otherwise, by applying Lemma 1, we may assume that $\rho(\pi_2, M)$ also terminates. Then two different nodes of the form $w_2' = (exit, exit, u_2, v_2')$ and $w_2'' = (exit, exit, u_2, v_2'')$ are reachable from $w_1'$ and $w_1''$. But, by the requirement (R2) of the criteria system $K_{\mathcal{E}}$, at most one of the elements $v_2' \circ v^*$, $v_2'' \circ v^*$ may be equal to $e$. Hence, at least one of the nodes $w_2'$, $w_2''$ is 0-rejecting. □
Lemma 4. Suppose both procedures $F_1$, $F_2$ are pre-marginal and the node $w = (F_1, F_2, u, v)$ is accessible from the root of $\Gamma_{\pi_1,\pi_2}$. Suppose also that $u \notin U_=$ and $w$ is not a 1-rejecting node. Then some 0-rejecting node is accessible from the root of $\Gamma_{\pi_1,\pi_2}$.
Proof. If $F_1$, $F_2$ are pre-marginal procedures and $u \notin U_=$ then we may find a pair $(c_1, c_2)$ of conditions such that the arc marked with $(c_1, c_2)$ leads from $w$ to a node $w' = (F_1', F_2', u', v')$, where one of the procedures $F_1'$, $F_2'$ is marginal, whereas the other is non-marginal. □
Lemma 5. Let $N = (\max(|\pi_1|, |\pi_2|))^2 + 1$, and $F_1$, $F_2$ be a pair of procedures such that one of them is non-marginal, whereas the other is terminated. Suppose that at least $N$ pairwise different nodes $w_1 = (F_1, F_2, u^1, v^1), \dots, w_N = (F_1, F_2, u^N, v^N)$ are accessible from the root of $\Gamma_{\pi_1,\pi_2}$ and all these nodes are not 1-rejecting. Then some 0-rejecting node is accessible from the root of $\Gamma_{\pi_1,\pi_2}$.
Proof. If exactly one of the procedures $F_1$, $F_2$ is non-terminated or marginal then, by Lemma 1, a 0-rejecting node is accessible from any $w_i$. If $u^i = u^j$ holds for some pair $i, j$, then $v^i \ne v^j$ and, hence, by Lemma 3, some 0-rejecting node is also accessible from the root. Thus, it suffices to consider the case when (1) all elements $u^1, \dots, u^N$ are pairwise different and (2) both procedures $F_1$, $F_2$ are non-marginal and terminated. It follows from (2) that from any node $w_i$ it is possible to reach a node $w_i' = (F_1', F_2', u_i', v_i')$ such that $u_i' \in U_\emptyset \cup U_=$ and one of the procedures, say $F_1'$, is pre-marginal. If $F_2'$ is not pre-marginal then at least one of the successors of $w_i'$ in $\Gamma_{\pi_1,\pi_2}$ is a 0-rejecting node. Suppose that both $F_1'$ and $F_2'$ are pre-marginal. Then a consequence of (1) and the requirement (R2) for the criteria system $K_{\mathcal{F}}$ is the fact that a node of the form $w_j' = (F_1', F_2', u_j', v_j')$ is also reachable from another node $w_j$ (where $i \ne j$), and, moreover, $u_i' \ne u_j'$. By the requirement (R2) of the criteria system $K_{\mathcal{F}}$, at most one of the elements $u_i'$, $u_j'$ is in $U_=$. Hence, by Lemma 4, a 0-rejecting node is accessible from the root of $\Gamma_{\pi_1,\pi_2}$. □
Lemma 6. Let $N = (\max(|\pi_1|, |\pi_2|))^2 + 1$, and let $F_1$, $F_2$ be a pair of marginal procedures. Suppose that $N + 1$ pairwise different nodes $w_0 = (F_1, F_2, u_0, v_0)$, $w_1 = (F_1, F_2, u_1, v_1), \ldots, w_N = (F_1, F_2, u_N, v_N)$ are accessible from the root of $\Gamma_{\pi_1,\pi_2}$ and $v_0 \neq v_i$ for all $i$, $1 \le i \le N$. Then some 0-rejecting node is accessible from the root of $\Gamma_{\pi_1,\pi_2}$.
Proof. By combining the arguments used in the proofs of Lemmas 4 and 5. □
Lemma 7. Let $N = (\max(|\pi_1|, |\pi_2|))^2 + 1$, and let $F_1$, $F_2$ be a pair of marginal procedures. Suppose that $N + 1$ pairwise different nodes $w_0 = (F_1, F_2, u_0, v)$, $w_1 = (F_1, F_2, u_1, v), \ldots, w_N = (F_1, F_2, u_N, v)$ are accessible from the root of $\Gamma_{\pi_1,\pi_2}$. Then a 0-rejecting node is accessible from $w_0$ only if a 0-rejecting node is accessible from some $w_i$, $1 \le i \le N$.
Proof. By combining the arguments used in the proofs of Lemmas 4 and 5. □
Theorem 3. Suppose that $\mathcal{F} = (S, s_0, Q)$ and $\mathcal{E} = (R, r_0, P)$ are semigroup frames, and $K_{\mathcal{F}} = (U, u^+, u^*)$ and $K_{\mathcal{E}} = (V, v^+, v^*)$ are criteria systems for these frames such that the identity problem “$x = y$?” is decidable in both semigroups $U$ and $V$ in time $T_1(n)$. Suppose also that $\mathcal{F}$ is in addition an ordered frame such that the reachability problem “$[t'] \preceq [t'']$?” is decidable in time $T_2(n)$. Then the equivalence-checking problem “$\pi_1 \sim \pi_2$?” is decidable in time $O(n^6 (T_1(O(n^4)) + T_2(O(n^4))))$, where $n = \max(|\pi_1|, |\pi_2|)$.
Proof. Let $\pi_1$ and $\pi_2$ be GPSPs, and $n = \max(|\pi_1|, |\pi_2|)$. By Lemma 2, the equivalence-checking problem for $\pi_1$ and $\pi_2$ is reduced to the accessibility-checking of rejecting nodes in $\Gamma_{\pi_1,\pi_2}$. Consider an arbitrary pair of procedures $F_1 \in \mathcal{P}_{\pi_1}$ and $F_2 \in \mathcal{P}_{\pi_2}$. We will show that to check the accessibility of a rejecting node from the root of $\Gamma_{\pi_1,\pi_2}$ it suffices to analyze only a bounded number of the nodes $(F_1, F_2, u, v)$ for every pair of procedures $F_1$, $F_2$.
If both procedures $F_1$, $F_2$ are non-terminated then it is clear that no rejecting nodes are accessible from any node of the form $(F_1, F_2, u, v)$.
If one of the procedures $F_1$, $F_2$ is terminated, whereas the other is non-terminated, then Lemma 1 guarantees that some rejecting node is accessible from any node of the form $(F_1, F_2, u, v)$.
Now consider the case when one of the procedures $F_1$, $F_2$ is non-marginal and the other is terminated. As evidenced by Lemmas 3-5, if $n^2 + 1$ nodes of the form $w = (F_1, F_2, u, v)$ are accessible from the root of $\Gamma_{\pi_1,\pi_2}$ then either one of these nodes is 1-rejecting, or some 0-rejecting node is accessible from the root of $\Gamma_{\pi_1,\pi_2}$.
Finally, suppose that both procedures $F_1$, $F_2$ are marginal. Then, by Lemmas 6 and 7, it suffices to consider only $2n^2 + 1$ nodes of the form $(F_1, F_2, u, v)$ to check the accessibility of any rejecting node via some node of the form $(F_1, F_2, u, v)$. Thus, to check the equivalence $\pi_1 \sim \pi_2$ one needs only to check the rooted fragment of $\Gamma_{\pi_1,\pi_2}$ which includes at most $2n^4 + n^2$ nodes. When constructing such a rooted fragment of size $m$ we are forced to check inequalities $[t'] \preceq [t'']$ and identities $u^+ \circ \varphi(\langle [t'_1], [t'_2] \rangle) = u^+ \circ \varphi(\langle [t''_1], [t''_2] \rangle)$, $v^+ \circ \psi(\langle [t'_1], [t'_2] \rangle) = v^+ \circ \psi(\langle [t''_1], [t''_2] \rangle)$, where the size of terms $t'$, $t''$, $t'_1$, $t'_2$, $t''_1$, $t''_2$ is $O(m)$. □
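The counting argument in this proof can be illustrated by a small sketch. It is not the authors' algorithm: the `successors` function and the `is_rejecting` predicate are hypothetical stand-ins for the construction of the graph, and the code only shows how a per-pair cap keeps the explored fragment finite. With at most $n^2$ pairs of procedures and a cap of $2n^2 + 1$ nodes per pair, the fragment has at most $n^2(2n^2 + 1) = 2n^4 + n^2$ nodes. That a search cut off this way loses no rejecting node rests on Lemmas 3-7 and is not established by the code itself.

```python
from collections import defaultdict, deque


def fragment_bound(n):
    """Upper bound on the fragment size: n^2 pairs, 2n^2 + 1 nodes per pair."""
    return n * n * (2 * n * n + 1)


def bounded_search(root, successors, is_rejecting, cap):
    """Breadth-first search keeping at most `cap` nodes per pair (F1, F2).

    Nodes are tuples (F1, F2, u, v). `successors` and `is_rejecting` are
    assumed to be supplied by the caller (hypothetical interfaces).
    Returns (found, explored): the first rejecting node met, or None,
    and the number of nodes placed in the fragment so far.
    """
    seen = {root}
    per_pair = defaultdict(int)
    per_pair[root[:2]] = 1
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if is_rejecting(node):
            return node, len(seen)
        for nxt in successors(node):
            pair = nxt[:2]
            # Skip nodes already seen and nodes of a pair that hit its cap.
            if nxt in seen or per_pair[pair] >= cap:
                continue
            seen.add(nxt)
            per_pair[pair] += 1
            queue.append(nxt)
    return None, len(seen)
```

For programs of size $n$, the cap suggested by the proof would be `2 * n * n + 1`, and `fragment_bound(n)` gives the resulting limit on the number of explored nodes.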
To demonstrate the use of Theorem 3, we consider the equivalence-checking problem for linear recursive programs w.r.t. commutative frames. Let $\mathcal{F}_c$ be a frame associated with a free commutative monoid. Suppose $\mathcal{A} = \{a_1, \ldots, a_N\}$ and denote by $Z$ a free Abelian group of rank $N$ generated by some elements $q_1, \ldots, q_N$. Then $K = (Z, Z, e, e)$ is a criteria system for $\mathcal{F}_c$, assuming $\varphi(\langle [a_i], [\lambda] \rangle) = q_i$ and $\varphi(\langle [\lambda], [a_j] \rangle) = q_j^{-1}$ for every pair of actions $a_i$, $a_j$. It should be noted that the reachability problem in $\mathcal{F}_c$ and the identity problem in $Z$ are decidable in linear time. Hence, by Theorem 3, the equivalence-checking problem for GPSPs w.r.t. $\mathcal{F}_c$ is decidable in polynomial time.
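To make the commutative example concrete, here is a minimal sketch under stated assumptions. An element of the free commutative monoid is represented by its vector of action counts, and an element of the free Abelian group $Z$ by a vector of integer exponents of the generators $q_1, \ldots, q_N$. Reachability is read as "the second element is obtained from the first by applying further actions", and the map sending $\langle [a_i], [\lambda] \rangle$ to $q_i$ and $\langle [\lambda], [a_j] \rangle$ to $q_j^{-1}$ is extended to pairs of terms multiplicatively. The function names are hypothetical and not taken from the paper.

```python
def monoid_elem(term, n_actions):
    """Element [t] of the free commutative monoid as a vector of action counts.

    `term` is a sequence of action indices 0..n_actions-1 (assumed encoding).
    """
    counts = [0] * n_actions
    for a in term:
        counts[a] += 1
    return tuple(counts)


def reachable(s1, s2):
    """Assumed reading of reachability: s2 = s1 * h for some h (componentwise <=)."""
    return all(x <= y for x, y in zip(s1, s2))


def phi(s1, s2):
    """Exponent vector assigned to the pair <[t1], [t2]>.

    Each action in t1 contributes q_i, each action in t2 contributes q_j^{-1}.
    """
    return tuple(x - y for x, y in zip(s1, s2))


def compose(g, h):
    """Group operation in Z: exponents of the generators add up."""
    return tuple(x + y for x, y in zip(g, h))


# Usage with N = 3 actions a_0, a_1, a_2.
N = 3
empty = monoid_elem([], N)
a0 = monoid_elem([0], N)
identity_holds = compose(phi(a0, empty), phi(empty, a0)) == (0, 0, 0)
```

In this representation both tests are a single pass over vectors of length $N$ after one pass over the terms, which matches the linear-time claim for this particular encoding.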
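As in Section 3, given the set $\mathcal{A}$ of basic actions for LRPs, we introduce the set $\overline{\mathcal{A}}$ of basic actions for GPSPs: $\overline{\mathcal{A}} = \{(\lambda, a) : a \in \mathcal{A}\} \cup \{(a, \lambda) : a \in \mathcal{A}\}$ and translate every LRP $\pi$ into a GPSP $\overline{\pi}$. Given a free commutative frame $\mathcal{F}_c = (S, s_0, Q)$, we introduce a pair of frames $\mathcal{F} = (S, s_0, Q_1)$, $\mathcal{E} = (S, s_0, Q_2)$ such that
$Q_1(s, (\lambda, a)) = Q(s, a)$;
$Q_1(s, (a, \lambda)) = s$;
$Q_2(s, (\lambda, a)) = Q(s, a)$;
$Q_2(s, (a, \lambda)) = Q(s, a)$.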
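The four defining equations can be written out as a short sketch. It assumes that states of the free commutative frame are vectors of action counts, that $Q(s, a)$ increments the count of $a$, and that the fourth equation defines $Q_2$ on pairs of the form $(a, \lambda)$, since $Q_1$ on such pairs is already fixed by the second equation. The constant `LAMBDA` and the function names are illustrative only.

```python
LAMBDA = None  # stands for the empty component of a pair (lambda, a) or (a, lambda)


def Q(s, a):
    """Free commutative frame (assumed encoding): action a increments its count."""
    counts = list(s)
    counts[a] += 1
    return tuple(counts)


def Q1(s, action):
    """First semantics: pairs (lambda, a) act as a, pairs (a, lambda) leave s unchanged."""
    left, right = action
    if left is LAMBDA:
        return Q(s, right)
    return s


def Q2(s, action):
    """Second semantics: both (lambda, a) and (a, lambda) act as a."""
    left, right = action
    a = right if left is LAMBDA else left
    return Q(s, a)


# Usage with two actions, starting from the initial state s0.
s0 = (0, 0)
after_q1 = Q1(Q1(s0, (LAMBDA, 1)), (0, LAMBDA))
after_q2 = Q2(Q2(s0, (LAMBDA, 1)), (0, LAMBDA))
```

The two frames share the state set and differ only in how they treat actions of the form $(a, \lambda)$, which is what lets the same run be routed by one semantics and observed by the other.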
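Theorem 4. Let $\pi_1$ and $\pi_2$ be a pair of LRPs, and let the frame $\mathcal{F}_c$ be associated with a free commutative monoid. Then the frames $\mathcal{F}$, $\mathcal{E}$ defined above are also associated with free commutative monoids and
$\pi_1 \sim \pi_2 \iff \overline{\pi}_1 \sim \overline{\pi}_2$,
where $\overline{\pi}_i$ denotes the GPSP obtained by translating $\pi_i$.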
By combining Theorems 3 and 4 we arrive at
Corollary 1. The equivalence-checking problem for linear recursive programs w.r.t. frames associated with free commutative monoids is decidable in polynomial time.
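As a worked instance of the bound in Theorem 3, suppose, as noted above for the commutative case, that both the identity problem and the reachability problem are decidable in linear time, say $T_1(m) \le c\,m$ and $T_2(m) \le c\,m$ for some constant $c$ (a symbol introduced here only for illustration). Substituting into the bound of Theorem 3 gives:

```latex
\[
O\bigl(n^6\,(T_1(O(n^4)) + T_2(O(n^4)))\bigr)
  = O\bigl(n^6\,(c \cdot O(n^4) + c \cdot O(n^4))\bigr)
  = O(n^{10}).
\]
```

This is a bound in the size $n$ of the GPSPs. For LRPs the bound additionally depends on the size of the translated programs, which is not restated in this section.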
5. Conclusions
We introduce a new model of computation, a polysemantic model of propositional sequential programs (GPSPs), into which both sequential and recursive models of programs can be embedded. This gives a uniform framework for studying the equivalence-checking problem for various classes of programs. This framework substantially extends the algebraic formalism of propositional models of computer programs developed in [27, 10, 18, 19]. An attempt to introduce program semantics in which the intermediate and final results of computations are separated was made in [16]. That paper considers the first-order model of sequential programs and defines the final results of computations as a projection of intermediate results onto some subset of program variables. Unlike our approach, however, this type of semantics for final results does not preserve the composition of program statements.
Theorems 3 and 4 demonstrate that some equivalence-checking techniques initially developed for propositional models of sequential programs [28] can be readily adapted to a more general model of computation, namely generalized propositional sequential programs. This gives hope that some new decidable cases of the equivalence-checking problem can still be found.
References
[1] E. Ashcroft, Z. Manna, A. Pnueli, Decidable properties of monadic functional schemas, J. ACM, 20, 1973, N 3, p.489-499.
[2] R. Bird, P. Wadler, Introduction to Functional Programming, 1988, Prentice-Hall, Englewood Cliffs, NJ.
[3] J.W. De Bakker, D.A. Scott, A theory of programs. Unpublished notes, Vienna:IBM Seminar, 1969.
[4] A.P. Ershov, Theory of program schemata. In Proc. of IFIP Congress 71, Ljubljana, 1971, p.93-124.
[5] S.J. Garland, D.C. Luckham, Program schemes, recursion schemes and formal languages, J. Comput. and Syst. Sci., 7, 1973, p.119-160.
[6] D. Harel, Dynamic logic. In Handbook of Philosophical Logic, D. Gabbay and F. Guenthner (eds.), 1984, p.497-604.
[7] Y. Hirshfeld, F. Moller, Decidability results in automata and process theory. LNCS, 1043, 1996, p.102-148.
[8] V.E. Kotov, V.K. Sabelfeld, Theory of program schemes, 1991.
[9] A.A. Lapunov, Yu.I. Yanov, On logical program schemata, In Proc. Conf. Perspectives of the Soviet Mathematical Machinery, Moscow, March 12-17, 1956, Part III.
[10] A.A. Letichevsky, On the equivalence of automata over semigroup, Theoretic Cybernetics, 6, 1970, p.3-71 (in Russian).
[11] A.A. Letichevsky, Equivalence and optimization of programs. In Programming theory, Part 1, Novosibirsk, 1973, p. 166-180 (in Russian).
[12] A.A. Letichevsky, L.B. Smikun, On a class of groups with solvable problem of automata equivalence, Sov. Math. Dokl., 17, 1976, N 2, p.341-344.
[13] D.C. Luckham, D.M. Park, M.S. Paterson, On formalized computer programs, J. Comput. and Syst. Sci., 4, 1970, N 3, p.220-249.
[14] M.S. Paterson, Program schemata, Machine Intelligence, Edinburgh: Univ. Press, 3, 1968, p.19-31.
[15] M.S. Paterson, Decision problems in computational models, SIGPLAN Notices, 7, 1972, p.74-82.
[16] G.N. Petrosyan. On the decidable cases of the inclusion problem for sequential program schemes, in System Informatics and Theory of Programming. Novosibirsk, 1974, p.130-151 (in Russian).
[17] S. Peyton Jones, The Implementation of Functional Programming Languages, 1987, Prentice-Hall, Englewood Cliffs, NJ.
[18] R.I. Podlovchenko, Hierarchy of program models, Programming and Software Engineering, 1981, N 2, p.3-14 (in Russian).
[19] R.I. Podlovchenko, Semigroup program models, Programming and Software Engineering, 1981, N 4, p.3-13 (in Russian).
[20] R.I. Podlovchenko, V.A. Zakharov, On the polynomial-time algorithm deciding the commutative equivalence of program schemata, Reports of the Soviet Academy of Science, 362, 1998, N 6 (in Russian).
[21] H.G. Rice, Classes of recursively enumerable sets and their decision problems, Trans. Amer. Math. Soc., 74, 1953, p.358-366.
[22] V.K. Sabelfeld, Logic-term equivalence is checkable in polynomial time. Reports of the Soviet Academy of Science, 249, 1979, N 4, p.793-796 (in Russian).
[23] V.K. Sabelfeld, Tree equivalence of linear recursive schemata is polynomial-time decidable, Information Processing Letters, 13, 1981, N 4, p.147-153.
[24] V.K. Sabelfeld, An algorithm deciding functional equivalence in a new class of program schemata, Theoret. Comput. Sci., 71, 1990, p.265-279.
[25] M.A. Taiclin, The equivalence of automata w.r.t. commutative semigroups, Algebra and Logic, 8, 1969, p.553-600 (in Russian).
[26] V.A. Uspensky, A.L. Semenov, What are the gains of the theory of algorithms: basic developments connected with the concept of algorithm and with its application in mathematics. LNCS, 122, 1981, p.100-234.
[27] J.I. Yanov, To the equivalence and transformations of program schemata, Reports of the Soviet Academy of Science, 113, 1957, N 1, p.39-42 (in Russian).
[28] V.A. Zakharov, The efficient and unified approach to the decidability of the equivalence of propositional programs. In LNCS, 1443, 1998, p. 246-258.
[29] V.A. Zakharov, On the decidability of the equivalence problem for orthogonal sequential programs, Grammars, 2, 1999, p.271-281.
[30] V.A. Zakharov, On the decidability of the equivalence problem for monadic recursive programs, Theoretical Informatics and Applications, 34, 2000, p. 157-171.