Re: Help: PCLEX, PCYACC and reserved words (Fergus Henderson)
7 Dec 1997 22:03:57 -0500

          From comp.compilers


From: (Fergus Henderson)
Newsgroups: comp.compilers
Date: 7 Dec 1997 22:03:57 -0500
Organization: Comp Sci, University of Melbourne
References: 97-12-036
Keywords: parse, design

"Garry Whitworth" <> writes:

>[Yuck. For the partial name problem, I'd have the lexer recognize
>them all as symbols, then in the semantic code look up the symbol in a
>table of keywords using prefix matching. The keyword as symbol problem is
>a lot harder. Sometimes you can get away with an ugly grammar that says
>symbol | thiskeyword | thatkeyword | theotherkeyword
>all over the place, or maybe you can identify the contexts well enough in
>the parser to tell the lexer when it's expecting a keyword and when it
>isn't. -John]

If you are using a parsing technique such as recursive descent, or a
parser generator which allows semantic matching, I think it should be
straight-forward. As John said, have the lexer recognize all keywords
as symbols; then the parser just has to put the appropriate semantic
conditions on them.
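
A minimal sketch of that lexer side, in Python, assuming a hypothetical
`Token` type and three made-up token classes (`SYMBOL`, `NUMBER`, `PUNCT`);
none of these names come from PCLEX. The point is only that the lexer
carries no keyword table, so `perc` and `character` come out as plain
symbols:

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    type: str   # "SYMBOL", "NUMBER", or "PUNCT" (hypothetical classes)
    value: str


# No keyword alternatives here: every word is lexed as a SYMBOL.
TOKEN_RE = re.compile(
    r"(?P<SYMBOL>[A-Za-z_]\w*)|(?P<NUMBER>\d+)|(?P<PUNCT>\S)"
)


def tokenize(text: str) -> list[Token]:
    """Split text into tokens; whitespace between tokens is skipped."""
    return [Token(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(text)]


if __name__ == "__main__":
    for tok in tokenize("perc = 10"):
        print(tok.type, tok.value)
```

Deciding whether a given symbol is acting as a keyword is then left
entirely to the parser.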

For example, if you're using recursive descent, then the body of one
of your parsing procedures might look something like this:

    if nexttoken.type = SYMBOL then
        if matches(nexttoken.value, "percent", 3) then
            /* one of per, perc, perce, percen, percent */
        elseif matches(nexttoken.value, "character", 4) then
            /* one of char, chara, ..., character */
        else
            /* treat as an ordinary identifier */
        end if
    else
        /* handle some other kind of token */
    end if

At each point in the grammar, you check for only those keywords that
could be relevant at that point. Other keywords are assumed to be
ordinary identifiers.
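
To make the idea concrete, here is a small Python sketch. The `matches`
helper mirrors the pseudocode above (a word matches if it is a prefix of
the keyword at least `min_len` characters long); the two `classify_*`
functions and the context names are hypothetical, invented for
illustration:

```python
def matches(word: str, keyword: str, min_len: int) -> bool:
    """True if word is a prefix of keyword with at least min_len characters."""
    return len(word) >= min_len and keyword.startswith(word)


def classify_in_format_context(word: str) -> str:
    """A grammar point where only 'percent' and 'character' can be keywords."""
    if matches(word, "percent", 3):
        return "PERCENT"
    if matches(word, "character", 4):
        return "CHARACTER"
    return "IDENTIFIER"


def classify_in_expression_context(word: str) -> str:
    """A grammar point where no keyword is expected: every symbol is a name."""
    return "IDENTIFIER"


if __name__ == "__main__":
    for word in ("perce", "char", "total"):
        print(word,
              classify_in_format_context(word),
              classify_in_expression_context(word))
```

The same symbol `perce` is a keyword in the first context and an
ordinary identifier in the second, which is how keywords can double as
symbols without any help from the lexer.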

So it's really quite straight-forward, not "yuck" at all, unless
you're using one of those yucky bottom-up parser generators (see
subject line ;-).

Fergus Henderson <>
WWW: <>
PGP: finger fjh@
[Hey, things can be straight-forward and yucky at the same time. -John]

