From: | "Kevin Szabo" <kszabo@nortelnetworks.com> |
Newsgroups: | comp.compilers |
Date: | 23 Jul 2000 17:04:15 -0400 |
Organization: | Nortel Networks (Ottawa, Ontario, Canada) |
References: | 00-07-034 |
Keywords: | parse, comment |
|I'm trying to parse SQL and I'd like to recognize UNION JOIN as one
|token in the lexer. So for example,
|
|if the lexer sees UNION and the next token (after any # of whitespaces,
|tabs and newlines) is JOIN it should return UNION_JOIN
My personal feeling is that you are trying to do too much work in the
lexer. Why not just specify this rule in your Yacc grammar? It will
save you aggravation if 'union join' is going to be part of some other
production.
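A minimal sketch of what that might look like, assuming the lexer returns UNION and JOIN as two ordinary tokens. The nonterminal names (joined_table, table_ref) and the NAME token are made up for illustration and are not taken from any particular SQL grammar; whether the rule stays conflict-free inside a full SQL grammar is something you would have to check with yacc itself.

```yacc
%token UNION JOIN NAME

%%

/* The parser, not the lexer, sees UNION followed by JOIN.
   Whitespace between the two words is already gone by this point. */
joined_table
    : table_ref UNION JOIN table_ref
    ;

/* Placeholder: a real grammar would allow far more here. */
table_ref
    : NAME
    ;
```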
John's suggestion of an intermediate pre-parser between the lexer and
the parser is a good one; I've exploited that for things that are
manually generated (like recursive descent parsers), but I usually like
to express all the rules in the grammar if at all possible.
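For what it's worth, such an intermediate filter could be sketched roughly as below. This is only an illustration under stated assumptions: raw_lex() is a stand-in for the lex-generated scanner (here it just replays an array), the token codes are hypothetical rather than taken from a real y.tab.h, and a real version would also have to save and restore yylval along with the pushed-back token.

```c
#include <stdio.h>

/* Hypothetical token codes; a real parser would get these from y.tab.h. */
enum { TOK_EOF = 0, UNION = 258, JOIN, SELECT, UNION_JOIN };

/* Stand-in for the lex-generated scanner: replays a fixed token array
   and keeps returning TOK_EOF once the end is reached. */
static const int *stream;

static int raw_lex(void)
{
    int t = *stream;
    if (t != TOK_EOF)
        stream++;
    return t;
}

/* One-token pushback buffer; -1 means "empty". */
static int pending = -1;

/* The parser would call this instead of the raw scanner.
   UNION immediately followed by JOIN is merged into UNION_JOIN;
   every other token passes through unchanged. */
int filtered_lex(void)
{
    int tok;

    if (pending != -1) {
        tok = pending;
        pending = -1;
    } else {
        tok = raw_lex();
    }

    if (tok != UNION)
        return tok;

    /* Saw UNION: look one token ahead. */
    int next = raw_lex();
    if (next == JOIN)
        return UNION_JOIN;

    pending = next;   /* not JOIN, so hand it out on the next call */
    return UNION;
}
```

Because the scanner has already discarded whitespace and newlines, the filter never has to deal with them; it only compares token codes.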
Have you looked at the O'Reilly lex & yacc book? They have an SQL
parser as an example (IIRC).
Kevin
[My SQL grammar in the book is for the old SQL 89. I dimly recall
there's some ambiguity in the grammar that makes it reasonable to
try to handle UNION JOIN as one token. -John]