Techreports: Finite automata construction and minimization algorithms (Bruce W. Watson)
Fri, 10 Jun 1994 23:55:47 GMT

          From comp.compilers

Related articles
Techreports: Finite automata construction and minimization algorithms (1994-06-10)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Bruce W. Watson)
Keywords: report, DFA
Organization: Eindhoven University of Technology, the Netherlands
Date: Fri, 10 Jun 1994 23:55:47 GMT

Hi All,

I would like to announce a reprinting (with corrections) of a couple of
technical reports on finite automata.
The full citations of the reports are:
        ``A taxonomy of finite automata construction algorithms'' by Bruce W.
        Watson, Computing Science Note 93/43, Eindhoven University of
        Technology, The Netherlands, 1993.

        ``A taxonomy of finite automata minimization algorithms'' by Bruce W.
        Watson, Computing Science Note 93/44, Eindhoven University of
        Technology, The Netherlands, 1993.

The algorithms given in these reports have been implemented in a C++
toolkit of finite automata algorithms. The papers relating to the toolkit
(which is known as the ``FIRE engine'') appear in another news posting in
this newsgroup.

The abstracts are (respectively):
This paper presents a taxonomy of finite automata construction algorithms.
Each algorithm is classified into one of two families: those based upon
the structure of regular expressions, and those based upon the
automata-theoretic work of Myhill and Nerode.

Many of the algorithms appearing in the literature are based upon the
structure of regular expressions. In this paper, we make this term precise
by defining regular expressions as a $\Sigma$-term algebra, and automata
constructions as various $\Sigma$-algebras of automata. Each
construction algorithm is then presented as the unique natural
homomorphism from the $\Sigma$-term algebra of regular expressions to the
appropriate $\Sigma$-algebra of automata. The concept of duality is
introduced and used to derive more practical construction algorithms. In
this way, we successfully present (and relate) algorithms given by
Thompson, Berry and Sethi, McNaughton and Yamada, Glushkov, and Aho,
Sethi, and Ullman. Efficient implementations (including those due to Chang
and Paige, and Br\"uggemann-Klein) are also treated. As a side-effect we
derive several new algorithms.

A pair of impractical, but theoretically interesting, construction
algorithms were presented by Myhill and Nerode. Some encoding techniques
are used to make the algorithms practical --- giving Brzozowski's
algorithm based upon derivatives. DeRemer's algorithm is derived as an
encoding of Brzozowski's algorithm. Two new algorithms, related to
DeRemer's, are derived. Lastly, this family of algorithms is related to
the first family.

In addition to classifying the algorithms, we identify (and abstract from)
the coding tricks and implementation details present in many of the
published algorithms. This paper also presents an introduction to finite
automata, $\Sigma$-algebras, and their properties.

This paper presents a taxonomy of finite automata minimization algorithms.
Brzozowski's elegant minimization algorithm differs from all other known
minimization algorithms, and is derived separately. All of the remaining
algorithms depend upon computing an equivalence relation on states. We
define the equivalence relation, the partition that it induces, and its
complement. Additionally, some useful properties are derived. It is shown
that the equivalence relation is the greatest fixed point of an equation,
providing a useful characterization of the required computation. We derive
an upperbound on the number of approximation steps required to compute the
fixed point. Algorithms computing the equivalence relation (or the
partition, or its complement) are derived systematically in the same
framework. The algorithms include Hopcroft's, several algorithms from
text-books (including Hopcroft and Ullman's \cite{HopcroftUllman}, Wood's
\cite{Wood}, and Aho, Sethi, and Ullman's \cite{AhoSethiUllman}), and
several new algorithms or variants of existing algorithms.

The reports may be obtained by anonymous ftp from: (also known as
in directory:
Files constax...

(Files with 600dpi as a substring in their name are for use with
high-resolution printers. No dvi files are available due to the inclusion
of PostScript figures.
These reports first appeared in December of 1993. Those of you who
received the first printing need not print the entire reports again, but
can fetch the files constax.err and mintax.err.)
If you ftp the reports, would you kindly drop me some email telling me
what you fetched. The numbers turn out to be interesting to the Faculty
evaluation people here.

Paper copies of the reports can also be obtained by emailing me at:
or paper mailing me at:
        Bruce W. Watson
        HG 7.39
        Faculty of Computing Science
        Eindhoven University of Technology
        P.O. Box 513
        5600 MB Eindhoven
        The Netherlands

Bruce Watson

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.