Re: is lex useful? (James Kanze US/ESC 60/3/141 #40763)
27 Jun 1996 11:40:33 -0400

          From comp.compilers


From: (James Kanze US/ESC 60/3/141 #40763)
Newsgroups: comp.compilers
Date: 27 Jun 1996 11:40:33 -0400
Organization: GABI Software, Sarl.
References: 96-06-073 96-06-105 96-06-111
Keywords: lex

(Scott Nicol) writes:
>- Parser-scanner interactions can get really hairy (a common way to fix
> difficult parsing problems is to have the parser fiddle with the scanner,
> so the scanner will handle it).

(Mandeep S Dhami) writes:
|> IMHO, this is by far the most important reason to write your own lexer.
|> You can often simplify the yacc grammar by doing this. To some this may be
|> "bad" design (wrong use of tools/inappropriate coupling); to others, it is
|> "smart" design! Anyway, I feel the latter.

And the moderator adds:

|> [I've never had any trouble putting gross hacks into my flex lexers using
|> start states. I'd be interested in examples of scanner hackery that's hard
|> in flex but easy in a hand-coded lexer. -John]

I'll second this. I've never had any problem adding parser-state
dependencies to lex-generated scanners (although I haven't used start
states to do it).

The technique I've generally used in recent years is simplicity itself.
The yacc parser state is a global variable, and can be accessed by lex.
Yacc can also be requested to generate an ASCII dump (in y.output, in
the classical implementation). It is relatively easy to write a shell
script to massage this output into a table for lex, indicating which
tokens it can legally return in each parser state. (Generally, if
there is a token which will result in a shift, it is the one which
should be returned.)
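
A minimal sketch of what such a generated table might look like in C.
Everything here is an assumption made for illustration: the names
can_shift and token_ok, the three states, and the table contents are
invented, and a real build would take the token codes from y.tab.h and
the shift entries from y.output.

```c
/* Hypothetical token codes; yacc normally numbers named tokens from 257. */
enum { SOME_KEYWORD = 257, USER_SYMBOL, FILENAME, FIRST_TOKEN = SOME_KEYWORD };

#define N_STATES 3   /* invented: a real parser has many more states */
#define N_TOKENS 3

/* One row per parser state, one column per token. An entry is 1 when
   the dump lists a shift on that token in that state. The values are
   made up; the script would emit them from y.output. */
static const unsigned char can_shift[N_STATES][N_TOKENS] = {
    /* state 0 */ { 1, 1, 0 },
    /* state 1 */ { 0, 1, 0 },
    /* state 2 */ { 0, 0, 1 },
};

/* Return nonzero if the parser could shift `token` in `state`.
   Out-of-range states or tokens are treated as not acceptable. */
int token_ok(int state, int token)
{
    int col = token - FIRST_TOKEN;

    if (state < 0 || state >= N_STATES || col < 0 || col >= N_TOKENS)
        return 0;
    return can_shift[state][col];
}
```

A lex action would call token_ok with the parser's current state, which
assumes an implementation that exposes that state as a global variable.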

Once a potential token is recognized (say SOME_KEYWORD), the action
checks the table to see if it is acceptable. If not, it tries an
alternative (USER_SYMBOL). Potentially, the action may even read more
characters. (For example, if the only acceptable token is FILENAME,
something which looks like a keyword or a user symbol may be only part
of a legal filename.) Although I've never tried it, I imagine that unput
(or yyless) could also be used to return characters to the input stream.
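
The fallback order described above can be sketched as a plain C
function. This is an illustration under stated assumptions, not the
code from the post: classify, BAD_TOKEN, the keyword spelling
"some_keyword", and the three invented parser states are all
hypothetical, and token_ok is a stand-in for the generated table lookup.

```c
#include <string.h>

/* Hypothetical token codes; BAD_TOKEN is an invented error value. */
enum { SOME_KEYWORD = 257, USER_SYMBOL, FILENAME, BAD_TOKEN };

/* Stand-in for the table lookup. The states are invented: state 0
   accepts keywords and user symbols, state 1 only user symbols,
   state 2 only file names. */
static int token_ok(int state, int token)
{
    switch (state) {
    case 0:  return token == SOME_KEYWORD || token == USER_SYMBOL;
    case 1:  return token == USER_SYMBOL;
    case 2:  return token == FILENAME;
    default: return 0;
    }
}

/* Decide which token a keyword-shaped lexeme should become in the
   given parser state: the keyword first, then the alternatives. */
int classify(const char *text, int state)
{
    if (strcmp(text, "some_keyword") == 0 && token_ok(state, SOME_KEYWORD))
        return SOME_KEYWORD;
    if (token_ok(state, USER_SYMBOL))
        return USER_SYMBOL;
    if (token_ok(state, FILENAME))
        return FILENAME;  /* a real action would read on to the end of the name */
    return BAD_TOKEN;
}
```

In an actual lex action, text would be yytext and state would be the
parser's current state; the FILENAME branch is where the extra
characters would be read.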

James Kanze    Tel.: (+33) 88 14 49 00
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
