Has lexing and parsing theory advanced since the 1970's?

Roger L Costello <costello@mitre.org>
Tue, 14 Sep 2021 13:16:01 +0000

          From comp.compilers

Related articles
Has lexing and parsing theory advanced since the 1970's? costello@mitre.org (Roger L Costello) (2021-09-14)
Re: Has lexing and parsing theory advanced since the 1970's? anton@mips.complang.tuwien.ac.at (2021-09-16)
Re: Has lexing and parsing theory advanced since the 1970's? 480-992-1380@kylheku.com (Kaz Kylheku) (2021-09-17)
Has lexing and parsing theory advanced since the 1970's? christopher.f.clark@compiler-resources.com (Christopher F Clark) (2021-09-18)
Re: Has lexing and parsing theory advanced since the 1970's? drikosev@gmail.com (Ev Drikos) (2021-09-29)
Newsgroups: comp.compilers
Organization: Compilers Central
Keywords: parse, history, question
Posted-Date: 16 Sep 2021 12:56:21 EDT

Lately I have been learning to use Flex & Bison. As I understand it, Flex &
Bison (and other parser generators) are built on solid theory. As a result,
those parser generators could be created quickly and cheaply using structured,
well-known techniques. Contrast that with parsers developed before the theory
existed: the earliest parsers, back in the 1950s, used utterly ad hoc
techniques to analyze the syntax of the source code of the programs they were
parsing. During the 1960s the field got a lot of academic attention, and by
the early 1970s parsing was no longer an arcane art. In the 1970s Aho, Ullman,
Knuth, and many others put parsing techniques solidly on their theoretical
feet.


One of the first techniques they (Aho, Ullman, Knuth, and others) espoused was
to separate lexing from parsing. Lexing was built upon regular expressions,
which in turn rest on finite automata theory -- deterministic (DFA) and
nondeterministic (NFA) -- with Kleene's famous theorem establishing that
regular expressions and finite automata describe exactly the same languages.
Parsing was built on top of a rich theory of grammars -- context-free
grammars, context-sensitive grammars, etc. -- that Chomsky formulated. Neat!
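The lexer/parser split can be sketched in miniature. The following is a
hypothetical illustration (not Flex & Bison themselves): the token table
plays the role of a Flex specification, and the recursive-descent functions
play the role of a Bison grammar for a tiny context-free language of sums
and products. All names here are invented for the example.

```python
import re

# Token definitions -- the "lex" side. Each pattern is a regular
# expression; the master pattern unions them into one automaton.
TOKEN_SPEC = [
    ("NUM",  r"\d+"),
    ("PLUS", r"\+"),
    ("STAR", r"\*"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

def lex(text):
    """Regular-expression lexing: text -> list of (kind, value) tokens."""
    tokens = []
    for m in MASTER.finditer(text):
        if m.lastgroup != "SKIP":          # discard whitespace
            tokens.append((m.lastgroup, m.group()))
    tokens.append(("EOF", ""))
    return tokens

def parse(tokens):
    """Recursive-descent parsing of the CFG:
         expr -> term ('+' term)*
         term -> NUM  ('*' NUM)*
    """
    pos = 0
    def peek():
        return tokens[pos][0]
    def term():
        nonlocal pos
        value = int(tokens[pos][1]); pos += 1
        while peek() == "STAR":
            pos += 1
            value *= int(tokens[pos][1]); pos += 1
        return value
    def expr():
        nonlocal pos
        value = term()
        while peek() == "PLUS":
            pos += 1
            value += term()
        return value
    return expr()

print(parse(lex("2 + 3 * 4")))  # prints 14: '*' binds tighter than '+'
```

The point of the separation is visible even at this scale: the lexer worries
only about regular languages (character-level structure), while the parser
works over a token stream and worries only about the grammar.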


That said, Flex & Bison are old. Has lexing/parsing theory advanced since the
1970's? If so, are there parser generators available today that are based on
those advances? Or do Flex & Bison still represent the state of the art in
terms of the underlying theory they use?


[Having been there in the 1970s I can report that the linguists and the
computer scientists were dimly aware of each other but didn't work together
and used different terms, e.g., linguists say phrase structure grammars where
we would say CFG.
Flex is a rewrite of lex, which was a Bell Labs summer project by Eric
Schmidt to make it easier to turn regular expressions into DFAs (not NFAs).
The code was awful; flex was a cleaner rewrite that avoided the AT&T software
license. Bison was originally a GPL rewrite of yacc, but it has since added
reentrant parsers and GLR parsing for ambiguous grammars. They knew about GLR
in the 1970s, but Yacc had to run on a 16-bit PDP-11 and the data structures
were too big. -John]


