Fri, 9 Jun 89 17:00:01 -0400

Related articles |
---|

Responses to "Jay Earley and his algorithm" mark@hubcap.clemson.edu (1989-06-09) |

Date: | Fri, 9 Jun 89 17:00:01 -0400 |

From: | mark@hubcap.clemson.edu (Mark Smotherman) |

Summary of responses to "Jay Earley and his algorithm"

The three questions were:

1) What are significant improvements that have been made to Earley's

parsing algorithm since 1968?

2) What compilers use Earley's algorithm or variant thereof? (I am

aware of the Western Digital Ada compiler.)

3) Where is Earley these days, and what is he doing? (OK, make that

four questions!)

Many thanks to the following responders (who have been a great help!):

compilers@ima.isc.com (our fearless moderator)

lang@inria.inria.fr (Bernard Lang)

worley@compass.com (Dale Worley)

mauney@cscadm.ncsu.edu (Jon Mauney)

peterd@june.cs.washington.edu (Peter C. Damron)

Dick Grune <dick@cs.vu.nl>

Ron Rasmussen <ras%hpclras@sde.hp.com>

joop@cs.kun.nl (Joop Leo)

Edited responses follow:

------------------------

From: compilers@ima.isc.com

[It's my impression that for most purposes LR argorithms have supplanted

Earley's, because they are so much faster and are adequate for the grammars

of most current languages. -John]

------------------------

From: lang@inria.inria.fr (Bernard Lang)

1) significant improvements

There is much literature on this. A paper is coming up in the next

SIGPLAN'89 conference in Portland. Much of the work on this topic,

though not all of it has taken place in the Computational Linguistics

community (which to some extent rediscovered the algorithm).

Look for "chart parsing" in that community. The main conferences

are the ACL yearly meeting, and the COLING conference (every other year).

I contributed myself several papers.

The topics are the usual ones: optimization, formal analysis, error

detection and correction, incrementality, ...

The algorithm is related to others used in Database query evaluation,

and in logic programming.

2) compilers using Earley's algorithm or variant

Earley parsers are used in interfaces for programming environments.

3) Where is Earley

Last I heard of him (Early seventies), he was becoming something like a

psychologist. I have no idea where.

------------------------

From: worley@compass.com (Dale Worley)

There is a paper on some new Earley-like parsers, "A parser generator

for finitely ambiguous context-free grammars" by J. Rekers (report

CS-R8712 from the Center for Mathematics and Computer Science,

Mathematisch Centrum, Amsterdam), which discusses Tomita's algorithm.

Tomita's algorithm acts like an LR parser until you get to a situation

that is not LR parsable, then it (reasonably efficiently) clones the

LR parser to keep track of each possible parse. It has almost the

generality of Earley's parser (the grammar is required to be "finitely

ambiguous"), but runs much faster for ordinary grammars. The paper

has a reasonably good bibliography.

------------------------

From: mauney@cscadm.ncsu.edu (Jon Mauney)

The most important theoretical improvement to Earley's parser is

described in

S.L. Graham, M.A. Harrison, W.L. Ruzzo

"An improved context-free recognizer"

TOPLAS 2(3), pp415-462, July 1980

I don't have any information on use of Earley's algorithm in actual

compilers. The Graham-Harrison-Ruzzo algorithm was used in my

syntax-error repair algorithm

(SIGPLAN Symposium on Compiler Construction, June 1982)

(Mauney and Fischer, TOPLAS 10(3), July 1988)

------------------------

From: peterd@june.cs.washington.edu (Peter C. Damron)

Try reading:

S. L. Graham, M. A. Harrison, W. L. Ruzzo

"An Improved Context-Free Recognizer"

ACM Trans. on Programming Languages and Systems

v2, n3, July 1980, pp. 415-462

The Earley algorithm has been used for parsing, code generation,

and natural language processing, but I think there are better ways

to do all three.

------------------------

From: Dick Grune <dick@cs.vu.nl>

The original Earley algorithm, which featured a 1-token reduction look-ahead

was extended to k-token reduction and t-token prdecition look-ahead by

%A M. Bouckaert

%A A. Pirotte

%A M. Snelling

%T Efficient parsing algorithms for general context-free parsers

%J Inf. Sci.

%V 8

%N 1

%D Jan. 1975

%P 1-26

They recommend using a 1-token prdecition look-ahead rather than follow

Earley's suggestion and have simulations to prove their point.

The Earley and CYK algorithms were bent toward each other and combined in to

an "almost" encompassing algorithm in

%A S.L. Graham

%A M.A. Harrison

%A W.L. Ruzzo

%T An improved context-free recognizer

%J TOPLAS

%V 2

%N 3

%D July 1980

%P 415-462

None of these beat the n^3 barrier for the general case. There are impractical

algorithms using Strassen matrix multiplication that break this barrier.

The algorithm was extended for context-sensitive grammars in

%A G.Sh. Vold'man

%T A parsing algorithm for context-sensitive grammars

%J Program. Comput. Software

%V 7

%D 1981

%P 302-307

I know it is sometimes used in natural language parsers, and there is at least

one attempt to use it in a Graham-Glanville code generator, in SIGPLAN

Notices, June 1984.

------------------------

From: Ron Rasmussen <ras%hpclras@sde.hp.com>

I talked to some people from Interactive Software Engineering, the producers

of Eiffel (Bertrand Meyers), and their language sensitive editor and Eiffel

parser is based upon Earleys' algorithms. The person I talked to mentioned

some improvements etc. Try:

Jean-Marc Nerson @

ISE

270 Storke Road, Suite 7

Goleta, Ca.

93117

He mentioned Earley update papers by "Charles Anres" and Jean-Marc Julia

in France.

------------------------

From: joop@cs.kun.nl (Joop Leo)

In 1987 I wrote a paper titled

"A general CF parsing algorithm running in linear time on

every LR(k) grammar without using lookahead"

It appeared in the Proc. of the SION congress CSN 87 (Amsterdam,

2/3 november 1987) p.53-57.

I also submitted it to Theoretical Computer Science. Within a few month

I will have finished a revised version.

If you are interested I can send you a preliminary version.

------------------------

From: lang@inria.inria.fr (Bernard Lang)

Here is a partial bibliography on Earley like algorithms.

(quickly extracted from my papers).

It is in LaTeX format.

I believe all these papers deal in some form with some variant of

Earley's algorithm, though I did not check and my memory may have

betrayed me in one or two cases.

I know of several others, but I do not really have time to computerize now

the complete bibbliography I have.

I hope this will help you.

\cfbibitem

{billot-dea}

Billot, S. 1988

{\em Analyseurs Syntaxiques et Non-D\'{e}terminisme}.

Th\`{e}se de Doctorat, Universit\'{e} d'Orl\'{e}ans la Source,

Orl\'{e}ans (France).

\cfbibitem

{lang-shared}

Billot, S.; and Lang, B. 1989 % june 26-29

The structure of Shared Forests in Ambiguous Parsing.

INRIA Research Report 1038. % may 1989

To appear in {\em

Proc. of the $\rm 27^{th}$ Annual Meeting of the Association for Computational

Linguistics},

Vancouver (British Columbia).

\cfbibitem

{bps}

Bouckaert, M.; Pirotte, A.; and Snelling, M. 1975

Efficient Parsing Algorithms for General Context-Free Grammars.

{\em Information Sciences} 8(1): 1-26%january 1975

\cfbibitem

{nodal-span}

Cocke, J.; and Schwartz, J.T. 1970

{\em Programming Languages and Their Compilers}.

Courant Institute of Mathematical Sciences, New York University,

New York.

\cfbibitem

{earley-th}

Earley, J. 1968

{\em An Efficient Context-Free Parsing Algorithm.}

Ph.D. thesis, Carnegie-Mellon University,

Pittsburgh, Pennsylvania.

\cfbibitem

{earley-pap}

Earley, J. 1970

An Efficient Context-Free Parsing Algorithm.

{\em Communications ACM}

13(2): 94-102.

\cfbibitem

{graham}

Graham, S.L.; Harrison, M.A.; and Ruzzo W.L. 1980 %july

An Improved Context-Free Recognizer.

{\em ACM Transactions on Programming Languages and Systems} 2(3): 415-462.

\cfbibitem

{hays}

Hays, D.G. 1962

Automatic Language-Data Processing.

In {\em Computer Applications in the Behavioral Sciences},

(H. Borko ed.), Prentice-Hall, pp. 394-423.

\cfbibitem

{kasami}

Kasami, J. 1965

{\em An Efficient Recognition and Syntax Analysis Algorithm for Context-Free

Languages}.

Report of Univ. of Hawaii,

%from Valiant's paper

also

AFCRL-65-758, Air Force Cambridge Research Laboratory,

Bedford (Massachusetts),

%from 2nd dragon book

also 1966,

University of Illinois Coordinated Science Lab. Report, No. R-257.

%from Younger's paper

\cfbibitem

{kay}

Kay, M. 1980

Algorithm Schemata and Data Structures in Syntactic Processing.

{\em Proceedings of the Nobel Symposium on Text Processing},

Gothenburg.

\cfbibitem

{lrk}

Knuth, D.E. 1965

On the Translation of Languages from Left to Right.

{\em Information and Control}, 8: 607-639.

\cfbibitem

{lang-weak-prec}

Lang, B. 1971

Parallel Non-deterministic Bottom-up Parsing.

Proc. of International Symposium on Extensible Languages, Grenoble,

{\em SIGPLAN Notices} 6(12): 56-57.

\cfbibitem

{lang-pap}

Lang, B. 1974

Deterministic Techniques for Efficient Non-deterministic Parsers.

{\em Proc. of the $2^{nd}$ Colloquium on Automata, Languages and Programming},

J. Loeckx (ed.), Saarbr\"{u}cken,

Springer Lecture Notes in Computer Science 14: 255-269.

\newline

Also: Rapport de Recherche 72, IRIA-Laboria, Rocquencourt (France).

\cfbibitem

{lang-incomplete}

Lang, B. 1988 %22-27 August

Parsing Incomplete Sentences.

{\em Proc. of the $12^{th}$ Internat. Conf. on Computational Linguistics

(COLING'88)} Vol. 1 :365-371, D. Vargha (ed.), % De'nes Vargha

Budapest (Hungary).

\cfbibitem

{lang-datalog}

Lang, B. 1988 %June 28-30

Datalog Automata.

{\em Proc. of the $3^{rd}$ Internat. Conf. on Data and Knowledge Bases},

C. Beeri, J.W. Schmidt, U. Dayal (eds.), Morgan Kaufmann Pub.,

pp. 389-404,

Jerusalem (Israel).

\cfbibitem

{lang-lpda}

Lang, B. 1988

{\em Complete Evaluation of Horn Clauses, an Automata Theoretic Approach}.

INRIA Research Report 913.

\cfbibitem

{lang-tag}

%{Lan}{88c}

Lang, B. 1988 %

{\em The Systematic Construction of Earley Parsers:

Application to the Production of ${\cal O}(n^6)$ Earley Parsers

for Tree Adjoining Grammars}.

In preparation.

\cfbibitem

{china}

Li, T.; and Chun, H.W. 1987 %june 23-27

A Massively Parallel Network-Based Natural Language Parsing System.

{\em Proc. of 2nd Int. Conf. on Computers and Applications}

Beijing (Peking), : 401-408.

\cfbibitem

{nakagawa-rep}

Nakagawa, S. 1986

{\em Speaker-Independent Sentence Recognition by Phoneme-Based Word Spotting

and Time-Synchronous Context-Free Parsing}.

Technical Report CMU-CS-86-109, Carnegie-Mellon University,

Pittsburgh, Pennsylvania.

\cfbibitem

{nakagawa}

Nakagawa, S. 1987

Spoken Sentence Recognition by Time-Synchronous Parsing Algorithm of

Context-Free Grammar.

{\em Proc. ICASSP 87}, Dallas (Texas), Vol. 2 : 829-832.

\cfbibitem

{parsing-deduc}

Pereira, F.C.N.; and Warren, D.H.D. 1983 %june 15-17

Parsing as Deduction.

{\em Proceedings of the 21st Annual Meeting of the Association for

Computational Linguistics}: 137-144,

Cambridge (Massachusetts).

\cfbibitem

{phillips}

Phillips, J.D. 1986 %automn

A Simple Efficient Parser for Phrase-Structure Grammars.

{\em Quarterly Newsletter of the Soc. for the Study of Artificial

Intelligence (AISBQ)} 59: 14-19.

\cfbibitem

{lingol}

Pratt, V.R. 1975 %august

LINGOL --- A Progress Report.

In {\em Proceedings of the 4th IJCAI}: 422-428.

\cfbibitem

{rekers}

Rekers, J. 1987 %march

{\em A Parser Generator for Finitely Ambiguous Context-Free Grammars}.

Report CS-R8712,

Computer Science/Dpt. of Software Technology,

Centrum voor Wiskunde en Informatica,

Amsterdam (The Netherlands).

\cfbibitem

{sheil}

Sheil, B.A. 1976

%june 76 for tech report

Observations on Context Free Parsing.

%page numbers are from Tomita's PhD, I have not checked.

in {\em Statistical Methods in Linguistics}: 71-109, Stockholm (Sweden),

Proc. of Internat. Conf. on Computational Linguistics (COLING-76),

Ottawa (Canada).

\newline

Also: Technical Report TR 12-76,

Center for Research in Computing Technology,

Aiken Computation Laboratory,

Harvard Univ., Cambridge (Massachusetts).

\cfbibitem

{restriction}

Shieber, S.M. 1985

Using Restriction to Extend Parsing Algorithms for Complex-Feature-Based

Formalisms.

{\em Proceedings of the 23rd Annual Meeting of the Association for

Computational Linguistics}: 145-152.

\cfbibitem

{mchart}

Thompson, H. 1983 %August 22-26

MCHART: A Flexible, Modular Chart Parsing System.

{Proc. of the National Conf. on Artificial Intelligence (AAAI-83)},

Washington (D.C.), pp. 408-410.

\cfbibitem

{tom-th}

Tomita, M. 1985

{\em An Efficient Context-free Parsing Algorithm for

Natural Languages and Its Applications}.

Ph.D. thesis, Carnegie-Mellon University,

Pittsburgh, Pennsylvania.

\cfbibitem

{tom-lattice}

Tomita, M. 1986

An Efficient Word Lattice Parsing Algorithm for Continuous Speech Recognition.

In {\em Proceedings of IEEE-IECE-ASJ International Conference on Acoustics,

Speech, and Signal Processing (ICASSP 86)}, Vol. 3: 1569-1572.

%Tokyo, April 1986

\cfbibitem

{tom-journal}

Tomita, M. 1987

An Efficient Augmented-Context-Free Parsing Algorithm.

{\em Computational Linguistics}

13(1-2): 31-46.

\cfbibitem

{tom-pp}

Tomita, M. 1988

Graph-structured Stack and Natural Language Parsing.

{\em Proceedings of the $26^{th}$ Annual Meeting of the Association for

Computational Linguistics}: 249-257.

\cfbibitem

{jap-lingol}

%Kuniaki Uehara;

%Ryo Ochitani;

%Osamu Kakusho;

%Junichi Toyoda

Uehara, K.;

Ochitani, R.;

Kakusho, O.;

Toyoda, J.

1984 %february

A Bottom-Up Parser based on Predicate Logic:

A Survey of the Formalism and its Implementation Technique.

{\em 1984 Internat. Symp. on Logic Programming},

Atlantic City (New Jersey), : 220-227.

\cfbibitem

{valiant}

Valiant, L.G. 1975

General Context-Free Recognition in Less than Cubic Time.

{\em Journal of Computer and System Sciences}, 10: 308-315.

\cfbibitem

{stagex}

%{VilZ}{88}

Villemonte de la Clergerie, E.; and Zanchetta, A. 1988

{\em Evaluateur de Clauses de Horn.}

Rapport de Stage d'Option, Ecole Polytechnique, Palaiseau (France).

\cfbibitem

{island-miss}

%{WarHSC}{88}

Ward, W.H.; Hauptmann, A.G.; Stern, R.M.; and Chanak, T. 1988 % april 11-14

% Wayne Alexander Richard Thomas

Parsing Spoken Phrases Despite Missing Words.

In {\em Proceedings of the 1988 International Conference on Acoustics,

Speech, and Signal Processing (ICASSP 88)}, Vol. 1: 275-278.

%New York City (New York), April 11-14 1988

\cfbibitem

{younger}

%1966

% Context-Free Language Processing in Time $n^{3}$.

%G.E. Res. Develop. Cr., Schenectady, N.Y. 1966.

% reference found in Cocke & Schwartz 1970 (Prog. Lang. and their Compilers).

Younger, D.H. 1967

Recognition and Parsing of Context-Free Languages in Time $n^{3}$.

{\em Information and Control}, 10(2): 189-208

[From mark@hubcap.clemson.edu (Mark Smotherman)]

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.