The star-height of a finite automaton and some related questions
B.F. Melnikov
Abstract—The paper is related to the star-height for non-deterministic finite automata, not for the star-height problem for regular languages. We describe an alternative proof of Kleene's theorem, and then our version of the reduction of some problems related to the star-height to nondeterministic finite automata. For a given regular language, the corresponding finite automaton is constructed; the method we are considering has the property that the reverse construction gives the original regular expression.
The publication of this not-so-complicated problem has two goals, both related to the future development of the topic under consideration. First, we intend to describe a similar approach for generalized regular expressions. Secondly, another generalization is proposed, namely, the consideration of the structures we describe for the so-called extended automata. Thus, the material of this article is necessary for the definition of extended generalized pseudo-automata, which the author proposes to introduce in the next publication.
Keywords—nondeterministic finite automata, regular languages, Kleene's theorem, star-height problem.
I. INTRODUCTION AND MOTIVATION
This paper summarizes some previous publications of the author. Among these publications, we note papers [1], [2], [3] (in chronological order).
In [1], only some schemes of algorithms were published.
In [3], an algorithm was given that relates Kleene's theorem and the star-height. This material has so far been published in Russian only; besides, since Samara State University ceased to exist in 2015, the website of the journal is now unavailable.
In the current paper, the author gives the English version of [3], in which some noticed misprints are also corrected and simpler proofs are given for some statements.
The paper [2], published long ago, contained another algorithm that also describes the connection between Kleene's theorem and the star-height.
The current paper is related to the star-height for nondeterministic finite automata; its subject is practically unrelated to much more complex problems, i.e., to both the "ordinary" star-height problem and the generalized star-height problem for regular languages.
For these problems for languages, let us only make some remarks. The first of the mentioned problems, i.e., the "ordinary" star-height problem, was posed in 1963 in [4] and solved in 1988 in [5]; in [6], however, this solution was called "extremely difficult".
In 2005, the much more understandable solution of D. Kirsten was published, see [7]. In 2015, this solution was somewhat improved using a game-theoretic approach, see [8]. Unlike the solution mentioned in the previous comment, the last proof was called "elegant". Although it does improve the understanding of D. Kirsten's proof, this "elegance" does not decrease the number of automata being examined.
Received May 12, 2018.
Boris F. Melnikov, Russian State Social University (email: bf-melnikov@yandex.ru).
The second of the mentioned problems is the generalized star-height problem. It is still unknown whether or not this problem is decidable. The author is also not aware of significant papers on this subject published after [9] (1992). The connection between the generalized star-height problem and generalized pseudo-automata (see [10] for the latter formalism) is to be considered in one of the following publications.
In this paper we describe, firstly, an alternative proof of Kleene's theorem, and secondly, our version of the reduction of some problems related to the star-height to nondeterministic finite automata. Both topics represent an alternative to the particular version of the reduction proposed back in 1963 in the above-mentioned paper [4]. As we already noted, the subject of this paper is practically unrelated to the complex star-height problems. However, the publication of this not-so-complicated problem has two goals, both related to the future development of the topic under consideration. First, we intend to describe a similar approach for generalized regular expressions (see [11] etc.) and generalized automata (see [10]). Secondly, another generalization is proposed; namely, we shall consider our structures for so-called extended automata (see [12]) and, accordingly, extended pseudo-automata. Let us note that all concepts needed for such extensions are present in this paper.
The structure of this paper is as follows. Section II briefly describes the notation used. In Section III, we consider a special version of the proof of Kleene's theorem.
In Section IV, we define the main object of this paper, i.e., the star-height of a finite automaton. This definition is built on the basis of its transition graph only, and is not associated with the marks of its edges.
In Section V, for a given regular language, the corresponding finite automaton is constructed. Of course, this problem was solved more than 60 years ago, but the method we are considering has the property that the "reverse construction" gives the original regular expression.
In the Conclusion (Section VI), we formulate directions for further research.
II. Preliminaries
We shall not describe in detail the notation used in this paper; it is exactly the same as in [13]. As there, the "main" automaton under consideration will be denoted by
$$K = (Q, \Sigma, \delta, S, F). \qquad (1)$$
We shall use the classical definition of the star-height of a regular expression, see [7], [11]. We define it (denoting it by $\mathrm{SH}(r)$ for an expression $r$) by induction in the following way:
• $\mathrm{SH}(\emptyset) = \mathrm{SH}(\emptyset^*) = \mathrm{SH}(a) = 0$ for each $a \in \Sigma$;
• for all regular expressions $r$ and $s$, we set
$$\mathrm{SH}((r + s)) = \mathrm{SH}((r \cdot s)) = \max(\mathrm{SH}(r), \mathrm{SH}(s));$$
• for each regular expression $r$ (where $r \neq \emptyset$), we set
$$\mathrm{SH}((r^*)) = \mathrm{SH}(r) + 1.$$
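The inductive definition above can be sketched in a few lines of Python. This is only an illustration: the class names `Empty`, `Letter`, `Union`, `Concat`, `Star` and the function `star_height` are hypothetical and are not part of the paper's notation. The convention $\mathrm{SH}(\emptyset^*) = 0$ is applied literally.

```python
from dataclasses import dataclass

# Hypothetical syntax tree for regular expressions (names are illustrative).
@dataclass(frozen=True)
class Empty:            # the expression denoting the empty language
    pass

@dataclass(frozen=True)
class Letter:           # a single letter of the alphabet
    symbol: str

@dataclass(frozen=True)
class Union:            # (r + s)
    left: object
    right: object

@dataclass(frozen=True)
class Concat:           # (r . s)
    left: object
    right: object

@dataclass(frozen=True)
class Star:             # (r*)
    inner: object

def star_height(r) -> int:
    """SH(r) by structural induction, with SH(empty*) = 0 as in the text."""
    if isinstance(r, (Empty, Letter)):
        return 0
    if isinstance(r, (Union, Concat)):
        return max(star_height(r.left), star_height(r.right))
    if isinstance(r, Star):
        if isinstance(r.inner, Empty):
            return 0                      # special case: SH(empty*) = 0
        return star_height(r.inner) + 1
    raise TypeError(f"not a regular expression: {r!r}")

# (a b* c)* has a star nested inside a star.
ab_star_c = Concat(Letter("a"), Concat(Star(Letter("b")), Letter("c")))
print(star_height(Star(ab_star_c)))
```

Under the alternative convention mentioned in the text, only the branch for `Star(Empty())` would change.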
Remark that sometimes another definition is considered, in which $\mathrm{SH}(\emptyset^*)$ is defined as 1. It is easy to prove that this difference matters (i.e., the two definitions for expressions give two different values of the star-height of a regular language) for some finite languages only.
Besides, some new notation will be described as necessary; most of these notations are associated with paths in the transition graph of the considered automaton.
III. A SPECIAL VERSION OF THE PROOF OF KLEENE'S THEOREM
Let us consider a special version of the proof of Kleene's theorem. For the given automaton (1), let us consider some injective "ordering" function
$$\tau : Q \to \mathbb{R}_+.$$
We shall also write, e.g., $p < r$ meaning $\tau(p) < \tau(r)$; we shall also use the notation "max", meaning
$$\max(p, r) = \begin{cases} p, & \text{if } \tau(p) > \tau(r); \\ r, & \text{if } \tau(p) < \tau(r), \end{cases}$$
etc. Let us fix $K$ and $\tau$ in this section.
For some states $q, p, r \in Q$, where $p \ge q$ and $r \ge q$, let us consider some simple path from $p$ to $r$, whose sequence of vertices is
$$(p, q_1, q_2, \ldots, q_s, r),$$
such that
$$s \ge 0 \ \text{ and } \ (\forall i \in \{1, \ldots, s\})\,(q_i > q).$$
(We allow $p = r$; in this case it is a simple loop.) We shall denote the set of all such paths by $A_q(p, r)$. We also set
$$A_q = \{\, q' \mid q' \text{ is a state of a path of } A_q(q, q) \text{ and } q' \neq q \,\}.$$
Some more notation:
• if $A_q(p, r) \neq \emptyset$, then we shall write $V_q(p, r)$;
• otherwise, $\overline{V}_q(p, r)$;
• if both $V_q(p, r)$ and $V_q(r, p)$, then we shall write $W_q(p, r)$.
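The path sets introduced above can be enumerated directly. The following Python sketch is an illustration under stated assumptions: the function names `simple_paths` and `loop_states`, the dictionary encoding of the transition graph, and the two-state test graph are all hypothetical. Paths consisting of a single edge (no intermediate vertices) are assumed to be allowed.

```python
def simple_paths(edges, tau, q, p, r):
    """All simple paths from p to r whose intermediate states q_i satisfy tau[q_i] > tau[q].

    edges maps a state to the set of its successors; edge labels are ignored.
    The case p == r yields simple loops through p.
    """
    found = []

    def extend(path):
        last = path[-1]
        # Sorting by tau only makes the output order deterministic.
        for nxt in sorted(edges.get(last, ()), key=tau.get):
            if nxt == r:
                found.append(path + [nxt])        # endpoint reached, stop here
            elif tau[nxt] > tau[q] and nxt not in path:
                extend(path + [nxt])              # admissible intermediate state

    extend([p])
    return found

def loop_states(edges, tau, q):
    """The set A_q: states other than q lying on some path of A_q(q, q)."""
    return {x for path in simple_paths(edges, tau, q, q, q) for x in path if x != q}

# Hypothetical two-state graph: q1 -> q2, q2 -> q2, q2 -> q1.
edges = {"q1": {"q2"}, "q2": {"q1", "q2"}}
tau = {"q1": 1, "q2": 2}
print(simple_paths(edges, tau, "q1", "q1", "q1"))
print(loop_states(edges, tau, "q1"))
```

The predicate written $V_q(p, r)$ in the list above then corresponds to `bool(simple_paths(edges, tau, q, p, r))`.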
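For any states $s, f \in Q$ (we allow $s = f$), we shall consider the automaton $K_{s \to f}$ of the form (2), i.e., $(Q_{s \to f}, \Sigma, \delta_{s \to f}, \{s\}, \{f\})$, where
$$Q_{s \to f} = \{s, f\} \cup Q_{sf}, \qquad Q_{sf} = \{\, q \in Q \mid q > \max(s, f) \,\},$$
and $\delta_{s \to f}$ is constructed in the following way:
$$p \xrightarrow[\delta_{s \to f}]{a} r$$
if and only if
$$p \xrightarrow[\delta]{a} r, \qquad p \in \{s\} \cup Q_{sf}, \qquad r \in \{f\} \cup Q_{sf}.$$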
We shall construct regular expressions corresponding to the languages of automata of the type (2) by induction on the value $\tau(\min(s, f))$. We shall denote these expressions by $\rho_{s \to f}$; if $s = f$, we shall denote the automaton and the expression by $K_s$ and $\rho_s$.
(Let us remark that we consider only expressions obtained in the way described below. Certainly, each automaton also has infinitely many other corresponding regular expressions, but we shall not consider them in this section.)
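Thus, let us consider the induction formulated before. Its basis is the following. If
$$\min(s, f) = q_{\max} = \max(\{\, q \mid q \in Q \,\}),$$
then automaton (2) (i.e., $K_{q_{\max}}$) defines the language that is also defined by the regular expression
$$\rho_{q_{\max}} = \bigl(\{\, a \in \Sigma \mid q_{\max} \xrightarrow[\delta]{a} q_{\max} \,\}\bigr)^*; \qquad (3)$$
remark that in any case $\rho_{q_{\max}} \ni \varepsilon$.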
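Step of induction. If $s \neq f$, then we write the expression defining the language of automaton $K_{s \to f}$ in the following way:
$$\rho_{s \to f} = \{\, a \in \Sigma \mid s \xrightarrow[\delta]{a} f \,\} + \bigcup_{q > \max(s, f)} \rho_{s \to q} \cdot \rho_q \cdot \rho_{q \to f} \qquad (4)$$
(for expressions, $\bigcup$ symbolizes a sum of $+$'s). And if $s = f$, then we write the expression defining the language of automaton $K_s$ in the following way:
$$\rho_s = \Bigl( \{\, a \in \Sigma \mid s \xrightarrow[\delta]{a} s \,\} + \bigcup_{q > s} \rho_{s \to q} \cdot \rho_q \cdot \rho_{q \to s} \Bigr)^*. \qquad (5)$$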
By the induction hypothesis, all the expressions in the right-hand sides of (4) and (5) are already constructed. For a pair $s \in S$ and $f \in F$, denote
$$L_{s \to f} = L(K_{s \to f}) \cup \bigcup_{q < \min(s, f)} L(K_{s \to q}) \cdot L(K_q) \cdot L(K_{q \to f}).$$
Then we can write the language of the given automaton (1) (i.e., $L(K)$) in the following way:
$$L(K) = \bigcup_{s \in S,\, f \in F} L(K_s) \cdot L_{s \to f} \cdot L(K_f).$$
Therefore, the corresponding regular expression is
$$\bigcup_{s \in S,\, f \in F} \rho_s \cdot \Bigl( \rho_{s \to f} + \bigcup_{q < \min(s, f)} \rho_{s \to q} \cdot \rho_q \cdot \rho_{q \to f} \Bigr) \cdot \rho_f. \qquad (6)$$
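Recall that automaton (2), used throughout this construction, is
$$K_{s \to f} = (Q_{s \to f}, \Sigma, \delta_{s \to f}, \{s\}, \{f\}). \qquad (2)$$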
We shall use formula (6) everywhere (i.e., both in this paper and in the future). We note, however, that there also exists a simpler expression
$$\bigcup_{s \in S,\, f \in F} \rho_s \cdot \Bigl( \bigcup_{q \in Q} \rho_{s \to q} \cdot \rho_q \cdot \rho_{q \to f} \Bigr) \cdot \rho_f,$$
where, for instance, the language defined by $\rho_{s \to f}$ is a subset of the language defined by
$$\bigcup_{q \in Q} \rho_{s \to q} \cdot \rho_q \cdot \rho_{q \to f}.$$
However, the last expression, as is easy to verify, usually gives a much larger number of components ("terms") in the obtained regular expression.
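The construction of this section can be sketched as a short Python program. This is a minimal illustration under several assumptions, not the author's implementation: expressions are encoded as nested tuples; the helper names (`alt`, `cat`, `star`, `show`, `kleene_expression`) are hypothetical; and the "simplest equivalent transformations" are assumed to be only the removal of $\emptyset$ and repeated terms from sums, the absorption of $\emptyset$ and removal of $\varepsilon$ in products, and $\emptyset^* = \varepsilon^* = \varepsilon$. The test automaton (transitions $q_1 \xrightarrow{a} q_2$, $q_2 \xrightarrow{b} q_2$, $q_2 \xrightarrow{c} q_1$, with $q_1$ both initial and final) is also an assumption.

```python
EMPTY, EPS = ("empty",), ("eps",)

def alt(parts):
    """Sum of expressions, dropping the empty set and repeated terms."""
    parts = [p for p in dict.fromkeys(parts) if p != EMPTY]
    if not parts:
        return EMPTY
    return parts[0] if len(parts) == 1 else ("alt", tuple(parts))

def cat(*parts):
    """Product of expressions; the empty set absorbs, epsilon is dropped."""
    if EMPTY in parts:
        return EMPTY
    parts = tuple(p for p in parts if p != EPS)
    if not parts:
        return EPS
    return parts[0] if len(parts) == 1 else ("cat", parts)

def star(p):
    return EPS if p in (EMPTY, EPS) else ("star", p)

def show(e):
    kind = e[0]
    if kind == "empty":
        return "∅"
    if kind == "eps":
        return "ε"
    if kind == "sym":
        return e[1]
    if kind == "alt":
        return "(" + "+".join(show(x) for x in e[1]) + ")"
    if kind == "cat":
        return "".join(show(x) for x in e[1])
    inner = show(e[1])                      # kind == "star"
    return inner + "*" if e[1][0] in ("sym", "alt") else "(" + inner + ")*"

def kleene_expression(states, delta, starts, finals, tau):
    """Expression (6) for the ordering tau; delta is a set of (p, a, r) triples."""
    order = sorted(states, key=tau.get)
    memo = {}

    def letters(p, r):
        return [("sym", a) for (x, a, y) in sorted(delta) if x == p and y == r]

    def via(s, q, f):                       # rho_{s->q} . rho_q . rho_{q->f}
        return cat(rho(s, q), rho(q, q), rho(q, f))

    def rho(s, f):                          # formulas (3)-(5); rho(s, s) is rho_s
        if (s, f) not in memo:
            above = [q for q in order if tau[q] > max(tau[s], tau[f])]
            body = alt(letters(s, f) + [via(s, q, f) for q in above])
            memo[(s, f)] = star(body) if s == f else body
        return memo[(s, f)]

    terms = []
    for s in sorted(starts, key=tau.get):
        for f in sorted(finals, key=tau.get):
            below = [q for q in order if tau[q] < min(tau[s], tau[f])]
            middle = alt([rho(s, f)] + [via(s, q, f) for q in below])
            terms.append(cat(rho(s, s), middle, rho(f, f)))
    return alt(terms)

delta = {("q1", "a", "q2"), ("q2", "b", "q2"), ("q2", "c", "q1")}
for tau in ({"q1": 1, "q2": 2}, {"q1": 2, "q2": 1}):
    print(show(kleene_expression(["q1", "q2"], delta, {"q1"}, {"q1"}, tau)))
```

The recursion terminates because every recursive call of `rho` strictly increases the larger of the two $\tau$-values of its arguments.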
IV. The star-height of a finite automaton
In this section, we define the main object of this paper, i.e., the star-height of a finite automaton. It is important to note that this definition is built on the basis of its transition graph only, and is not associated with the marks of its edges, i.e., the letters.
Firstly, let us consider some propositions; they are simply corollaries of the definitions of the regular expressions $\rho$ given above (i.e., (3)-(5)).
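Proposition 1: $\mathrm{SH}(\rho_{q_{\max}}) \le 1$. □
Proposition 2:
$$\mathrm{SH}(\rho_{s \to f}) = \max_{q \in A_{\max(s,f)}(s,f)} \mathrm{SH}(\rho_q) \le \max_{q \in Q_{s \to f}} \mathrm{SH}(\rho_q).$$ □
Proposition 3: If $A_s \neq \emptyset$, then
$$\mathrm{SH}(\rho_s) = \max_{q \in A_s} \mathrm{SH}(\rho_q) + 1 \le \max_{q > s} \mathrm{SH}(\rho_q) + 1.$$ □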
In the previous section, we fixed $K$ and $\tau$. Below, the obtained regular expression defining $L(K)$ for the ordering function $\tau$ (i.e., (6)) will be denoted by $R(K, \tau)$.
Let the set $Q$ of automaton (1) consist of $n$ states. We can consider $n!$ different injective functions $\tau$ (because for each pair of states $q, r \in Q$, only the value of the predicate $\tau(q) < \tau(r)$ is important). These $n!$ different functions give, generally speaking, different regular expressions defined by (4) and (5).
Definition 1: The star-height of automaton (1) is defined in the following way:
$$\mathrm{SH}(K) = \min_{\tau \in T} \mathrm{SH}(R(K, \tau)),$$
where $T$ is the set of all bijective functions of the type $\tau : Q \to \{1, \ldots, n\}$. □
It is important to remark that for defining $\mathrm{SH}(K)$, we have used exactly the way of constructing regular expressions which was given in the previous section. (Certainly, there exist other ways of constructing regular expressions for a given automaton.)
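Definition 1 can be turned into a brute-force procedure over all $n!$ orderings. The Python sketch below is only an illustration under stated assumptions. Instead of building expressions, it propagates star-heights through formulas (3)-(6), with `None` standing for $\emptyset$ and the string `"eps"` for $\varepsilon$. The simplifications assumed are $\emptyset^* = \varepsilon^* = \varepsilon$, together with dropping $\emptyset$ from sums and $\varepsilon$ from products. All function names are hypothetical. In line with the remark that only the transition graph matters, edges carry no letters.

```python
from itertools import permutations

EMPTY, EPS = None, "eps"    # markers; every other value is an integer star-height

def h_alt(parts):
    parts = [p for p in parts if p is not EMPTY]
    if not parts:
        return EMPTY
    nums = [p for p in parts if p != EPS]
    return max(nums) if nums else EPS

def h_cat(*parts):
    if any(p is EMPTY for p in parts):
        return EMPTY
    nums = [p for p in parts if p != EPS]
    return max(nums) if nums else EPS

def h_star(p):
    return EPS if p is EMPTY or p == EPS else p + 1

def height_for_ordering(states, edges, starts, finals, tau):
    """Star-height of the expression (6) for one ordering; edges is a set of (p, r) pairs."""
    memo = {}

    def via(s, q, f):
        return h_cat(rho(s, q), rho(q, q), rho(q, f))

    def rho(s, f):                          # heights of (3)-(5); rho(s, s) is rho_s
        if (s, f) not in memo:
            above = [q for q in states if tau[q] > max(tau[s], tau[f])]
            direct = [0] if (s, f) in edges else []
            body = h_alt(direct + [via(s, q, f) for q in above])
            memo[(s, f)] = h_star(body) if s == f else body
        return memo[(s, f)]

    terms = []
    for s in starts:
        for f in finals:
            below = [q for q in states if tau[q] < min(tau[s], tau[f])]
            middle = h_alt([rho(s, f)] + [via(s, q, f) for q in below])
            terms.append(h_cat(rho(s, s), middle, rho(f, f)))
    total = h_alt(terms)
    return 0 if total is EMPTY or total == EPS else total

def automaton_star_height(states, edges, starts, finals):
    """Minimum of height_for_ordering over all bijections Q -> {1, ..., n}."""
    states = list(states)
    return min(
        height_for_ordering(states, edges, starts, finals, dict(zip(states, perm)))
        for perm in permutations(range(1, len(states) + 1))
    )

# Hypothetical two-state graph: q1 -> q2, q2 -> q2, q2 -> q1; q1 initial and final.
edges = {("q1", "q2"), ("q2", "q2"), ("q2", "q1")}
print(automaton_star_height(["q1", "q2"], edges, {"q1"}, {"q1"}))
```

The enumeration is factorial in the number of states, so this is meant for small examples only.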
Let us consider a simple well-known example; the transition graph for the language of the regular expression $(ab^*c)^*$ is given in Fig. 1. (The convention on the use of single and double circles to designate states of the transition graph was given in [10].)
To determine the star-height of this automaton, we have to consider 2 functions $\tau$; let us call them $\tau_1$ and $\tau_2$. For the function
$$\{\, \tau_1(q_1) = 1,\ \tau_1(q_2) = 2 \,\},$$
we obtain, using (3)-(6) and some simplest equivalent transformations ([14] etc.), the following expressions:
$$\rho_{q_2} = b^*, \qquad \rho_{q_2 \to q_1} = \{c\}, \qquad \rho_{q_1 \to q_2} = \{a\}$$
and
$$\rho_{q_1} = (ab^*c)^*,$$
and, therefore,
$$R(K, \tau_1) = (ab^*c)^* \cdot (ab^*c)^* \cdot (ab^*c)^*. \qquad (7)$$
Similarly, considering the function
$$\{\, \tau_2(q_1) = 2,\ \tau_2(q_2) = 1 \,\},$$
we obtain the following expressions:
$$\rho_{q_1} = \{\varepsilon\}, \qquad \rho_{q_1 \to q_2} = \{a\}, \qquad \rho_{q_2 \to q_1} = \{c\}$$
and
$$\rho_{q_2} = (ca + b)^*,$$
and, therefore,
$$R(K, \tau_2) = \{\varepsilon\} \cdot (\{\varepsilon\} + a \cdot (ca + b)^* \cdot c) \cdot \{\varepsilon\}. \qquad (8)$$
Counting the star-height of regular expressions (7) and (8), we obtain that the star-height of the considered automaton is equal to 1.
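The count behind this conclusion can be written out step by step with the inductive definition of $\mathrm{SH}$ from Section II. In this worked derivation, $\{\varepsilon\}$ is assumed to be read as $\emptyset^*$, so its star-height is 0.

```latex
\begin{align*}
\mathrm{SH}(ab^*c) &= \max(\mathrm{SH}(a),\, \mathrm{SH}(b^*),\, \mathrm{SH}(c)) = \max(0, 1, 0) = 1, \\
\mathrm{SH}((ab^*c)^*) &= \mathrm{SH}(ab^*c) + 1 = 2, \\
\mathrm{SH}(R(K, \tau_1)) &= \max(2, 2, 2) = 2, \\
\mathrm{SH}((ca + b)^*) &= \max(\mathrm{SH}(ca),\, \mathrm{SH}(b)) + 1 = 0 + 1 = 1, \\
\mathrm{SH}(R(K, \tau_2)) &= \max(\mathrm{SH}(\{\varepsilon\}),\, \mathrm{SH}(a \cdot (ca + b)^* \cdot c)) = \max(0, 1) = 1, \\
\mathrm{SH}(K) &= \min(\mathrm{SH}(R(K, \tau_1)),\, \mathrm{SH}(R(K, \tau_2))) = \min(2, 1) = 1.
\end{align*}
```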
Proposition 4: For an automaton without useless and inaccessible states and a given ordering function $\tau$,
$$\mathrm{SH}(R(K, \tau)) = \max_{q \in Q} \mathrm{SH}(\rho_q).$$ □
V. The finite automaton for the given language
In this section, for a given regular language, the corresponding finite automaton is constructed. As we already said, this problem was solved more than 60 years ago, but the method we are considering has the property that the "reverse construction" (that is, the standard algorithm for constructing a regular expression based on a given finite automaton) gives (after some simplest equivalent transformation) exactly the original regular expression.
We shall not formulate these transformations strictly, since this is simply not necessary. See, e.g., the example above, where we obtained 3 factors $(ab^*c)^*$ in (7), but we can consider, e.g., $r = (ab^*c)^*$.
Thus, we shall formulate the main proposition of the paper as follows.
Proposition 5: For each regular expression $r$, there exist an automaton
$$K_r = (Q_r, \Sigma, \delta_r, S_r, F_r)$$
and a function $\tau_r$ for its states, such that:
(r1) $L(K_r) = L(r)$;
(r2) $\mathrm{SH}(K_r) \le \mathrm{SH}(R(K_r, \tau_r)) = \mathrm{SH}(r)$.
(Let us note once again that we do not assert that $r = R(K, \tau)$.)
Proof. We shall prove this proposition in the usual way, i.e., following the usual process of constructing the given regular expression; at the same time, we shall construct the equivalent automaton and the corresponding function $\tau$.
In addition to conditions (r1) and (r2) formulated above, we shall construct an automaton (let it be $K_r$) and a corresponding function (let it be $\tau_r$) for which two additional requirements are also met:
(r3) there exists a value $T_r \in \mathbb{R}_+$ such that:
- $\tau_r(s) < T_r$ for each $s \in S_r$;
- $\tau_r(q) > T_r$ for each $q \in Q_r \setminus S_r$;
(r4) there is no edge of the type
$$s_1 \xrightarrow[\delta_r]{a} s_2,$$
where $s_1, s_2 \in S_r$.
Possible automata for the regular expressions $\emptyset$, $\varepsilon$ and $a$ (for each $a \in \Sigma$) are given in Fig. 2-4 respectively. It is evident that for each of these expressions, we have $\mathrm{SH}(K_r) = 0$. Conditions (r3) and (r4) also hold.
Fig. 2
Fig. 3
Fig. 4
Now, let us suppose that we already have automata corresponding to the given regular expressions $p$ and $r$; that is, we suppose that we already have automata which define the same regular languages and have the same star-height. Certainly, we can also suppose that we have functions $\tau$ for these automata yielding regular expressions whose star-height is equal to the star-height of the given expressions (i.e., giving the minimum possible star-height). I.e., we can suppose that conditions (r1)-(r4) hold. Let these automata be
$$K_p = (Q_p, \Sigma, \delta_p, S_p, F_p) \quad \text{and} \quad K_r = (Q_r, \Sigma, \delta_r, S_r, F_r),$$
and let the corresponding functions be
$$\tau_p : Q_p \to \{1, \ldots, |Q_p|\} \quad \text{and} \quad \tau_r : Q_r \to \{1, \ldots, |Q_r|\}.$$
Then we can use the following automata and ordering functions; let us remark in advance that the transition functions are considered as sets, and the facts that these automata define the required regular expressions are evident.
For expression $(p + r)$, we can consider automaton
$$K_{(p+r)} = (Q_p \cup Q_r, \Sigma, \delta_p \cup \delta_r, S_p \cup S_r, F_p \cup F_r).$$
Considering the values $T_p$, $T_r$ and
$$M_p = \max_{q_p \in Q_p} \tau_p(q_p),$$
we obtain that the ordering function can be the following one:
$$\tau_{(p+r)}(q) = \begin{cases} \tau_p(s_p), & \text{if } q = s_p \in S_p; \\ \tau_r(s_r) + T_p, & \text{if } q = s_r \in S_r; \\ \tau_p(q_p) + T_p + T_r, & \text{if } q = q_p \in Q_p \setminus S_p; \\ \tau_r(q_r) + T_p + T_r + M_p, & \text{if } q = q_r \in Q_r \setminus S_r. \end{cases}$$
Evidently,
$$\mathrm{SH}(R(K_{(p+r)}, \tau_{(p+r)})) = \max(\mathrm{SH}(R(K_p, \tau_p)), \mathrm{SH}(R(K_r, \tau_r))).$$
Conditions (r1), (r3) and (r4) also hold.
For expression $(p \cdot r)$, we can consider automaton
$$K_{(p \cdot r)} = (Q_p \cup Q_r, \Sigma, \delta_p \cup \delta_r \cup \delta', S_p, F_r),$$
where $\delta'(q, a) \ni s_r$ if and only if $\delta_p(q, a) \ni f_p$, for all possible $f_p \in F_p$, $s_r \in S_r$, $a \in \Sigma$ ($\delta'$ contains no other elements). The ordering function $\tau_{(p \cdot r)}$ can be defined in the following way:
$$\tau_{(p \cdot r)}(q) = \begin{cases} \tau_p(q_p), & \text{if } q = q_p \in Q_p; \\ \tau_r(q_r) + M_p, & \text{if } q = q_r \in Q_r. \end{cases}$$
As in the previous case,
$$\mathrm{SH}(R(K_{(p \cdot r)}, \tau_{(p \cdot r)})) = \max(\mathrm{SH}(R(K_p, \tau_p)), \mathrm{SH}(R(K_r, \tau_r))),$$
and conditions (r1), (r3) and (r4) also hold.
And for expression $(r^*)$, we can consider automaton
$$K_{(r^*)} = (Q_r \cup \{q'\}, \Sigma, \delta_r \cup \delta', S_r \cup \{q'\}, F_r \cup \{q'\}),$$
where $\delta'(q, a) \ni s_r$ if and only if $\delta_r(q, a) \ni f_r$, for all possible $f_r \in F_r$, $s_r \in S_r$, $a \in \Sigma$ ($\delta'$ contains no other elements). The ordering function $\tau_{(r^*)}$ can be defined in the following way:
$$\tau_{(r^*)}(q) = \begin{cases} \tau_r(q), & \text{if } q \in Q_r; \\ \bigl(\min_{\tilde{q} \in Q_r} \tau_r(\tilde{q})\bigr) / 2, & \text{if } q = q'. \end{cases}$$
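Two of the constructions from this proof, for $(p + r)$ and for $(r^*)$, can be sketched in Python as follows. This is an illustration under assumptions, not the author's code. The `NFA` class, its `accepts` method and the function names are hypothetical. The state sets of the two automata are assumed to be disjoint, transition functions are stored as sets of triples $(p, a, r)$, and the new state is given the hypothetical name `"q'"`.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NFA:
    states: frozenset
    delta: frozenset      # set of (p, a, r) triples, i.e. p --a--> r
    starts: frozenset
    finals: frozenset

    def accepts(self, word):
        """Standard subset simulation (no epsilon-transitions)."""
        current = set(self.starts)
        for a in word:
            current = {r for (p, x, r) in self.delta if p in current and x == a}
        return bool(current & self.finals)

def union(kp, kr):
    """K_(p+r): componentwise union of two automata with disjoint state sets."""
    return NFA(kp.states | kr.states, kp.delta | kr.delta,
               kp.starts | kr.starts, kp.finals | kr.finals)

def star(kr, new_state="q'"):
    """K_(r*): every edge entering a final state is copied to each initial state;
    the new state is both initial and final, so the empty word is accepted."""
    extra = frozenset((q, a, s) for (q, a, f) in kr.delta
                      if f in kr.finals for s in kr.starts)
    return NFA(kr.states | {new_state}, kr.delta | extra,
               kr.starts | {new_state}, kr.finals | {new_state})

def star_ordering(kr, tau_r, new_state="q'"):
    """Ordering for K_(r*): old values are kept, the new state gets half the minimum."""
    tau = dict(tau_r)
    tau[new_state] = min(tau_r[q] for q in kr.states) / 2
    return tau

def letter(a, tag):
    """A two-state automaton for the single letter a (state names are illustrative)."""
    s, f = tag + "0", tag + "1"
    return NFA(frozenset({s, f}), frozenset({(s, a, f)}), frozenset({s}), frozenset({f}))

a_or_b = union(letter("a", "x"), letter("b", "y"))
any_ab = star(a_or_b)
print(a_or_b.accepts("a"), a_or_b.accepts("ab"), any_ab.accepts("abba"), any_ab.accepts(""))
```

Halving the minimum places the new state below every old state in the ordering while keeping the function injective.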
Conditions (r1), (r3) and (r4) are evident; let us prove (r2).
By the construction of $K_{(r^*)}$, we obtain the following fact: each path of its transition graph which belongs to $K_{(r^*)}$ and does not belong to $K_r$ has to contain a state $s_r \in S_r$. Then for any states $q, q', q'' \in Q_r$, where $q \notin S_r$, the sets $A_q(q', q'')$ for automata $K_{(r^*)}$ and $K_r$ coincide. And only for states $s_r \in S_r$ can we obtain some new paths of the sets $A_{s_r}(s_r, q)$ and $A_{s_r}(q, s_r)$; we mean the paths which belong to automaton $K_{(r^*)}$ and do not belong to $K_r$.
For the following three objects:
• automaton $K_r$;
• the corresponding function $\tau_r$ (defined in this section);
• and some state $q$ of it,
we denote the value $\mathrm{SH}(\rho_q)$ defined in the previous section by $\mathrm{SH}_r(q)$. Then by Proposition 3 and condition (r3) for automaton $K_{(r^*)}$, we obtain that:
$$\begin{cases} \mathrm{SH}_{(r^*)}(q) = \mathrm{SH}_r(q), & \text{if } q \in Q_r \setminus S_r; \\ \mathrm{SH}_{(r^*)}(s) \le \mathrm{SH}_r(q) + 1, & \text{if } s \in S_r; \\ \mathrm{SH}_{(r^*)}(q') = 0. \end{cases}$$
Then by condition (r4) for automaton $K_{(r^*)}$, we obtain condition (r2). □
VI. Conclusion
Using Proposition 5, we can reformulate the star-height problem for regular languages in the following way: for the given regular language, we have to construct an equivalent finite automaton $K$ having the minimum possible star-height. After that, considering the $n!$ bijective functions of the type $\tau : Q \to \{1, \ldots, n\}$ (where $n = |Q|$), we construct the regular expressions $R(K, \tau)$ and choose an expression having the minimum possible star-height.
In addition to this ("natural") continuation of the topic of this paper, there are others, not so obvious. As we said before, the publication of this not-so-complicated problem has several goals, which are related to the future development of the topic under consideration here.
• The first topic is the consideration of the generalized star-height problem. (As we said before, the connection between the generalized star-height problem and generalized pseudo-automata is to be considered in one of the following publications.)
• The second topic is the consideration of extended automata, see [15].
• And the third topic is precisely the transition to the star-height problem for the languages.
For all such topics, problems close to those considered in this article are possible. Besides, it is very important to note that the possible developments of the topic described here can be treated completely independently of each other ("movements" towards complication: "vertically", "horizontally" and "upwards"): for example, it is possible to consider the generalized star-height problem (i.e., the star-height problem for generalized regular expressions) with an approach using extended pseudo-automata.
The problems considered in this paper are also related to a topic that stands apart from the ones described here: we should describe an algorithm answering the question whether or not a proper subset of the set of edges forms an equivalent automaton.
(See some related questions in [16], [17]. Certainly, we do not mean an exhaustive algorithm. The brute force method consists in this case in the complete application of the algorithm for constructing two canonical automata and their subsequent comparison.)
The author hopes that the joint application of the algorithms mentioned above can give a faster way of determining the star-height of a given regular language than those currently available.
References
[1] Melnikov B. Ob odnoy klassifikacii kontekstno-svobodnyh ... [On a classification of sequentional context-free languages and grammars]. Vestnik of Moscow University. Series 15: Computational Mathematics and Cybernetics. 1993, no. 3, pp. 64-69. (in Russian, https://elibrary.ru/title_about.asp?id=8373)
[2] Melnikov B. and Vakhitova A. Some more on the finite automata. The Korean Journal of Computational and Applied Mathematics (Journal of Applied Mathematics and Computing). 1998, vol.5, no.3. pp.495-506.
[3] Melnikov B. O zvyozdnoy vysote regulyarnogo yazyka . . . [On the star-height of a regular language. Part III: The star-height of an automaton and the scheme of the transformation algorithm]. Heuristic algorithms and distributed computations. 2014, no. 3, pp. 60-76. (in Russian, https://elibrary.ru/item.asp?id=22376180)
[4] Eggan L. Transitions graphs and the star height of regular events. Michigan Mathematical Journal. 1963, vol. 10, pp. 385-397.
[5] Hashiguchi K. Algorithms for determining relative star height and star height. Information and Computation. 1988, vol.78, pp. 124-169.
[6] Perrin D. Finite Automata. Handbook of theoretical computer science, Vol. A. MIT Press Cambridge, MA, USA, 1990, 57 p.
[7] Kirsten D. Distance desert automata and the star height problem. Informatique Théorique et Applications. 2005, vol. 39, no. 3, pp. 455-509.
[8] Bojanczyk M. Star height via games. ACM/IEEE Symposium on Logic in Computer Science (LICS). 2015, pp. 214-219.
[9] Pin J.-E., Straubing H., Thérien D. Some results on the generalized star-height problem. Information and Computation. 1992, vol. 101, no. 2, pp. 219-250.
[10] Melnikov B. and Melnikova A. Pseudo-automata for generalized regular expressions. International Journal of Open Information Technologies. 2018, vol.6, no. 1. pp. 1-8.
[11] Salomaa A. Jewels of Formal Language Theory. Rockville (Maryland): Computer Science Press, Inc. 1981, 144 p.
[12] Melnikov B. Extended nondeterministic finite automata. Fundamenta Informaticae. 2010, vol. 104, no.3. pp. 255-265.
[13] Melnikov B. The complete finite automaton. International Journal of Open Information Technologies. 2017, vol.5, no. 10. pp.9-17.
[14] Melnikov B. and Sayfullina M. O nekotoryh algoritmah ekvivalentnogo preobrazovaniya nedeterminirovannyh konechnyh avtomatov [On some algorithms for the equivalent transformation of nondeterministic finite automata]. Izvestiya of Higher Educational Institutions. Mathematics. 2009, no. 4, pp. 67-72. (in Russian, https://elibrary.ru/item.asp?id=11749888)
[15] Melnikov B. Extended nondeterministic finite automata. Fundamenta Informaticae. 2010, vol. 104, no.3. pp.255-265.
[16] Dolgov V. and Melnikov B. Postroenie universal'nogo konechnogo avtomata . . . [The construction of a universal finite automaton. Part I: From theory to practical algorithms]. Vestnik of Voronezh State University. 2013, no. 2, pp. 173-181. (in Russian, https://elibrary.ru/item.asp?id=20267924)
[17] Dolgov V., Melnikov B. and Melnikova A. Cikly grafa perehodov bazisnogo avtomata . . . [Cycles of the transition graph of a basic automaton and related questions]. Vestnik of Voronezh State University. 2016, no. 4, pp. 95-111. (in Russian, https://elibrary.ru/item.asp?id=27257800)