Responses to "Jay Earley and his algorithm" (Mark Smotherman)
Fri, 9 Jun 89 17:00:01 -0400

          From comp.compilers

Related articles
Responses to "Jay Earley and his algorithm" (1989-06-09)
| List of all articles for this month |

Date: Fri, 9 Jun 89 17:00:01 -0400
From: (Mark Smotherman)

Summary of responses to "Jay Earley and his algorithm"

The three questions were:
1) What are significant improvements that have been made to Earley's
      parsing algorithm since 1968?
2) What compilers use Earley's algorithm or variant thereof? (I am
      aware of the Western Digital Ada compiler.)
3) Where is Earley these days, and what is he doing? (OK, make that
      four questions!)

Many thanks to the following responders (who have been a great help!): (our fearless moderator) (Bernard Lang) (Dale Worley) (Jon Mauney) (Peter C. Damron)
Dick Grune <>
Ron Rasmussen <> (Joop Leo)

Edited responses follow:


[It's my impression that for most purposes LR argorithms have supplanted
Earley's, because they are so much faster and are adequate for the grammars
of most current languages. -John]


From: (Bernard Lang)

1) significant improvements

    There is much literature on this. A paper is coming up in the next
SIGPLAN'89 conference in Portland. Much of the work on this topic,
though not all of it has taken place in the Computational Linguistics
community (which to some extent rediscovered the algorithm).
Look for "chart parsing" in that community. The main conferences
are the ACL yearly meeting, and the COLING conference (every other year).
I contributed myself several papers.
The topics are the usual ones: optimization, formal analysis, error
detection and correction, incrementality, ...
  The algorithm is related to others used in Database query evaluation,
and in logic programming.

2) compilers using Earley's algorithm or variant

Earley parsers are used in interfaces for programming environments.

3) Where is Earley

Last I heard of him (Early seventies), he was becoming something like a
psychologist. I have no idea where.


From: (Dale Worley)

There is a paper on some new Earley-like parsers, "A parser generator
for finitely ambiguous context-free grammars" by J. Rekers (report
CS-R8712 from the Center for Mathematics and Computer Science,
Mathematisch Centrum, Amsterdam), which discusses Tomita's algorithm.
Tomita's algorithm acts like an LR parser until you get to a situation
that is not LR parsable, then it (reasonably efficiently) clones the
LR parser to keep track of each possible parse. It has almost the
generality of Earley's parser (the grammar is required to be "finitely
ambiguous"), but runs much faster for ordinary grammars. The paper
has a reasonably good bibliography.


From: (Jon Mauney)

The most important theoretical improvement to Earley's parser is
described in
    S.L. Graham, M.A. Harrison, W.L. Ruzzo
    "An improved context-free recognizer"
    TOPLAS 2(3), pp415-462, July 1980

I don't have any information on use of Earley's algorithm in actual
compilers. The Graham-Harrison-Ruzzo algorithm was used in my
syntax-error repair algorithm
    (SIGPLAN Symposium on Compiler Construction, June 1982)
    (Mauney and Fischer, TOPLAS 10(3), July 1988)

From: (Peter C. Damron)

Try reading:
S. L. Graham, M. A. Harrison, W. L. Ruzzo
"An Improved Context-Free Recognizer"
ACM Trans. on Programming Languages and Systems
v2, n3, July 1980, pp. 415-462

The Earley algorithm has been used for parsing, code generation,
and natural language processing, but I think there are better ways
to do all three.


From: Dick Grune <>

The original Earley algorithm, which featured a 1-token reduction look-ahead
was extended to k-token reduction and t-token prdecition look-ahead by
%A M. Bouckaert
%A A. Pirotte
%A M. Snelling
%T Efficient parsing algorithms for general context-free parsers
%J Inf. Sci.
%V 8
%N 1
%D Jan. 1975
%P 1-26
They recommend using a 1-token prdecition look-ahead rather than follow
Earley's suggestion and have simulations to prove their point.

The Earley and CYK algorithms were bent toward each other and combined in to
an "almost" encompassing algorithm in
%A S.L. Graham
%A M.A. Harrison
%A W.L. Ruzzo
%T An improved context-free recognizer
%V 2
%N 3
%D July 1980
%P 415-462

None of these beat the n^3 barrier for the general case. There are impractical
algorithms using Strassen matrix multiplication that break this barrier.

The algorithm was extended for context-sensitive grammars in
%A G.Sh. Vold'man
%T A parsing algorithm for context-sensitive grammars
%J Program. Comput. Software
%V 7
%D 1981
%P 302-307

I know it is sometimes used in natural language parsers, and there is at least
one attempt to use it in a Graham-Glanville code generator, in SIGPLAN
Notices, June 1984.


From: Ron Rasmussen <>

I talked to some people from Interactive Software Engineering, the producers
of Eiffel (Bertrand Meyers), and their language sensitive editor and Eiffel
parser is based upon Earleys' algorithms. The person I talked to mentioned
some improvements etc. Try:

              Jean-Marc Nerson @
              270 Storke Road, Suite 7
              Goleta, Ca.

He mentioned Earley update papers by "Charles Anres" and Jean-Marc Julia
in France.


From: (Joop Leo)

In 1987 I wrote a paper titled

"A general CF parsing algorithm running in linear time on
every LR(k) grammar without using lookahead"

It appeared in the Proc. of the SION congress CSN 87 (Amsterdam,
2/3 november 1987) p.53-57.
I also submitted it to Theoretical Computer Science. Within a few month
I will have finished a revised version.
If you are interested I can send you a preliminary version.


From: (Bernard Lang)

Here is a partial bibliography on Earley like algorithms.
(quickly extracted from my papers).
It is in LaTeX format.
I believe all these papers deal in some form with some variant of
Earley's algorithm, though I did not check and my memory may have
betrayed me in one or two cases.
I know of several others, but I do not really have time to computerize now
the complete bibbliography I have.
I hope this will help you.

Billot, S. 1988
{\em Analyseurs Syntaxiques et Non-D\'{e}terminisme}.
Th\`{e}se de Doctorat, Universit\'{e} d'Orl\'{e}ans la Source,
Orl\'{e}ans (France).

Billot, S.; and Lang, B. 1989 % june 26-29
The structure of Shared Forests in Ambiguous Parsing.
INRIA Research Report 1038. % may 1989
To appear in {\em
Proc. of the $\rm 27^{th}$ Annual Meeting of the Association for Computational
Vancouver (British Columbia).

Bouckaert, M.; Pirotte, A.; and Snelling, M. 1975
Efficient Parsing Algorithms for General Context-Free Grammars.
{\em Information Sciences} 8(1): 1-26%january 1975

Cocke, J.; and Schwartz, J.T. 1970
{\em Programming Languages and Their Compilers}.
Courant Institute of Mathematical Sciences, New York University,
New York.

Earley, J. 1968
{\em An Efficient Context-Free Parsing Algorithm.}
Ph.D. thesis, Carnegie-Mellon University,
Pittsburgh, Pennsylvania.

Earley, J. 1970
An Efficient Context-Free Parsing Algorithm.
{\em Communications ACM}
13(2): 94-102.

Graham, S.L.; Harrison, M.A.; and Ruzzo W.L. 1980 %july
An Improved Context-Free Recognizer.
{\em ACM Transactions on Programming Languages and Systems} 2(3): 415-462.

Hays, D.G. 1962
Automatic Language-Data Processing.
In {\em Computer Applications in the Behavioral Sciences},
(H. Borko ed.), Prentice-Hall, pp. 394-423.

Kasami, J. 1965
{\em An Efficient Recognition and Syntax Analysis Algorithm for Context-Free
Report of Univ. of Hawaii,
%from Valiant's paper
AFCRL-65-758, Air Force Cambridge Research Laboratory,
Bedford (Massachusetts),
%from 2nd dragon book
also 1966,
University of Illinois Coordinated Science Lab. Report, No. R-257.
%from Younger's paper

Kay, M. 1980
Algorithm Schemata and Data Structures in Syntactic Processing.
{\em Proceedings of the Nobel Symposium on Text Processing},

Knuth, D.E. 1965
On the Translation of Languages from Left to Right.
{\em Information and Control}, 8: 607-639.

Lang, B. 1971
Parallel Non-deterministic Bottom-up Parsing.
Proc. of International Symposium on Extensible Languages, Grenoble,
{\em SIGPLAN Notices} 6(12): 56-57.

Lang, B. 1974
Deterministic Techniques for Efficient Non-deterministic Parsers.
{\em Proc. of the $2^{nd}$ Colloquium on Automata, Languages and Programming},
J. Loeckx (ed.), Saarbr\"{u}cken,
Springer Lecture Notes in Computer Science 14: 255-269.
Also: Rapport de Recherche 72, IRIA-Laboria, Rocquencourt (France).

Lang, B. 1988 %22-27 August
Parsing Incomplete Sentences.
{\em Proc. of the $12^{th}$ Internat. Conf. on Computational Linguistics
(COLING'88)} Vol. 1 :365-371, D. Vargha (ed.), % De'nes Vargha
Budapest (Hungary).

Lang, B. 1988 %June 28-30
Datalog Automata.
{\em Proc. of the $3^{rd}$ Internat. Conf. on Data and Knowledge Bases},
C. Beeri, J.W. Schmidt, U. Dayal (eds.), Morgan Kaufmann Pub.,
pp. 389-404,
Jerusalem (Israel).

Lang, B. 1988
{\em Complete Evaluation of Horn Clauses, an Automata Theoretic Approach}.
INRIA Research Report 913.

Lang, B. 1988 %
{\em The Systematic Construction of Earley Parsers:
Application to the Production of ${\cal O}(n^6)$ Earley Parsers
for Tree Adjoining Grammars}.
In preparation.

Li, T.; and Chun, H.W. 1987 %june 23-27
A Massively Parallel Network-Based Natural Language Parsing System.
{\em Proc. of 2nd Int. Conf. on Computers and Applications}
Beijing (Peking), : 401-408.

Nakagawa, S. 1986
{\em Speaker-Independent Sentence Recognition by Phoneme-Based Word Spotting
and Time-Synchronous Context-Free Parsing}.
Technical Report CMU-CS-86-109, Carnegie-Mellon University,
Pittsburgh, Pennsylvania.

Nakagawa, S. 1987
Spoken Sentence Recognition by Time-Synchronous Parsing Algorithm of
Context-Free Grammar.
{\em Proc. ICASSP 87}, Dallas (Texas), Vol. 2 : 829-832.

Pereira, F.C.N.; and Warren, D.H.D. 1983 %june 15-17
Parsing as Deduction.
{\em Proceedings of the 21st Annual Meeting of the Association for
Computational Linguistics}: 137-144,
Cambridge (Massachusetts).

Phillips, J.D. 1986 %automn
A Simple Efficient Parser for Phrase-Structure Grammars.
{\em Quarterly Newsletter of the Soc. for the Study of Artificial
Intelligence (AISBQ)} 59: 14-19.

Pratt, V.R. 1975 %august
LINGOL --- A Progress Report.
In {\em Proceedings of the 4th IJCAI}: 422-428.

Rekers, J. 1987 %march
{\em A Parser Generator for Finitely Ambiguous Context-Free Grammars}.
Report CS-R8712,
Computer Science/Dpt. of Software Technology,
Centrum voor Wiskunde en Informatica,
Amsterdam (The Netherlands).

Sheil, B.A. 1976
%june 76 for tech report
Observations on Context Free Parsing.
%page numbers are from Tomita's PhD, I have not checked.
in {\em Statistical Methods in Linguistics}: 71-109, Stockholm (Sweden),
Proc. of Internat. Conf. on Computational Linguistics (COLING-76),
Ottawa (Canada).
Also: Technical Report TR 12-76,
Center for Research in Computing Technology,
Aiken Computation Laboratory,
Harvard Univ., Cambridge (Massachusetts).

Shieber, S.M. 1985
Using Restriction to Extend Parsing Algorithms for Complex-Feature-Based
{\em Proceedings of the 23rd Annual Meeting of the Association for
Computational Linguistics}: 145-152.

Thompson, H. 1983 %August 22-26
MCHART: A Flexible, Modular Chart Parsing System.
{Proc. of the National Conf. on Artificial Intelligence (AAAI-83)},
Washington (D.C.), pp. 408-410.

Tomita, M. 1985
{\em An Efficient Context-free Parsing Algorithm for
Natural Languages and Its Applications}.
Ph.D. thesis, Carnegie-Mellon University,
Pittsburgh, Pennsylvania.

Tomita, M. 1986
An Efficient Word Lattice Parsing Algorithm for Continuous Speech Recognition.
In {\em Proceedings of IEEE-IECE-ASJ International Conference on Acoustics,
Speech, and Signal Processing (ICASSP 86)}, Vol. 3: 1569-1572.
%Tokyo, April 1986

Tomita, M. 1987
An Efficient Augmented-Context-Free Parsing Algorithm.
{\em Computational Linguistics}
13(1-2): 31-46.

Tomita, M. 1988
Graph-structured Stack and Natural Language Parsing.
{\em Proceedings of the $26^{th}$ Annual Meeting of the Association for
Computational Linguistics}: 249-257.

%Kuniaki Uehara;
%Ryo Ochitani;
%Osamu Kakusho;
%Junichi Toyoda
Uehara, K.;
Ochitani, R.;
Kakusho, O.;
Toyoda, J.
1984 %february
A Bottom-Up Parser based on Predicate Logic:
A Survey of the Formalism and its Implementation Technique.
{\em 1984 Internat. Symp. on Logic Programming},
Atlantic City (New Jersey), : 220-227.

Valiant, L.G. 1975
General Context-Free Recognition in Less than Cubic Time.
{\em Journal of Computer and System Sciences}, 10: 308-315.

Villemonte de la Clergerie, E.; and Zanchetta, A. 1988
{\em Evaluateur de Clauses de Horn.}
Rapport de Stage d'Option, Ecole Polytechnique, Palaiseau (France).

Ward, W.H.; Hauptmann, A.G.; Stern, R.M.; and Chanak, T. 1988 % april 11-14
% Wayne Alexander Richard Thomas
Parsing Spoken Phrases Despite Missing Words.
In {\em Proceedings of the 1988 International Conference on Acoustics,
Speech, and Signal Processing (ICASSP 88)}, Vol. 1: 275-278.
%New York City (New York), April 11-14 1988

% Context-Free Language Processing in Time $n^{3}$.
%G.E. Res. Develop. Cr., Schenectady, N.Y. 1966.
% reference found in Cocke & Schwartz 1970 (Prog. Lang. and their Compilers).
Younger, D.H. 1967
Recognition and Parsing of Context-Free Languages in Time $n^{3}$.
{\em Information and Control}, 10(2): 189-208
[From (Mark Smotherman)]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.