Re: Lookahead vs. Scanner Feedback

jwoods@convex.com (Jeff Woods)
Thu, 9 Jan 1992 16:34:06 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: jwoods@convex.com (Jeff Woods)
Keywords: yacc, parse
Organization: CONVEX Computer Corporation, Richardson, Tx., USA
References: 92-01-012
Date: Thu, 9 Jan 1992 16:34:06 GMT

hjelm+@cs.cmu.edu (Mark Hjelm) writes:
>I have a parser, written using Yacc and Lex, for ANSI C. The grammar is
>taken pretty much verbatim from the standard. The scanner uses the symbol
>table to decide whether to return "identifier" or "typedef name" as the
>token type for an identifier. How do I KNOW that there are no situations
>which, due to parser lookahead, would cause the scanner to return an
>incorrect token type for an identifier (i.e. return "identifier", even
>though the identifier was just/will be made into a "typedef name")? Is
>there a general answer to this question for other parsing strategies
>(possibly with other amounts of lookahead) and other grammars (languages)?


>Just Curious,
>Mark


Mark,


I believe I can accurately address your question. You ask:


How do I KNOW that there are no situations which, due to
parser lookahead, would cause the scanner to return an
incorrect token type for an identifier?


You can KNOW the answer either by constructing the item sets by hand or,
better yet, by obtaining the sets yacc constructed during parser
generation (invoke yacc with the -v option, which writes them to
y.output). Since you are using the ANSI specification, you will see that
"tIdentifer" and "tTypedefName" both appear in the same item set. Since a
typedef name is lexically identical to an identifier, this places the
burden of resolving the ambiguity on the scanner. You now KNOW that such
a situation does exist (i.e., there is a parser configuration that places
the burden of resolving a lexical ambiguity on the scanner) and that the
scanner cannot return the *correct* token type there without some
external help.


Brian Bliss also points this out:


The ANSI C grammar in the back of K&RII is not ambiguous: it
assumes that the lexer resolves the ambiguity, not the parser.


It's not clear to me why you qualify your question with "..., due to
parser lookahead, ...". The parser looks ahead only when it cannot
otherwise make a parsing decision, such as shift, reduce or goto. Since
the ambiguity between typedef name and identifier is lexical rather than
syntactic, I don't understand your reason for tying the question to
parser lookahead.


Assuming I may continue, I would like to offer a strategy for handling
such a conflict.


Instantiate a flag, such as "typedefRecognition." Set it to "ON" prior to
parser invocation and after completed declarations, and set it to "OFF"
after recognition of a typedef name. The scanner consults the setting of
this flag after recognizing the lexeme as an "identifier" but before
returning the token ID. If "typedefRecognition" is "ON," the scanner
searches the name space for a symbol with an attribute matching
"typedefName."


Something like this might be in order:


parser code:


typedefRecognition = ON ;
status = yyparse() ;


scanner code:


... identifier scanning ...


tokenToReturn = tIdentifer ;


if (typedefRecognition == ON && isTypedefName(ActiveScope, lexeme)) {
    typedefRecognition = OFF ;
    tokenToReturn = tTypedefName ;
}


grammar file:


type-specifier:
      tVoid typedefRecOn
    | tChar typedefRecOn
      .
      .
      .
    | tTypedefName typedefRecOn
    ;


typedefRecOn:
    {
        typedefRecognition = ON ;
    }
    ;
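

For anyone who wants to experiment with the scanner side of this outside
of lex, the fragment above can be fleshed out roughly as follows. This is
only a sketch under stated assumptions: the token values, the flat
typedefNames table, and the helpers declareTypedef and classifyIdentifier
are hypothetical names invented for illustration, and isTypedefName here
ignores scope (the ActiveScope argument is dropped). A real build would
take the token codes from the header yacc generates.

```c
#include <string.h>

/* Token codes: assumed values; normally supplied by the yacc-generated header. */
enum { tIdentifer = 258, tTypedefName = 259 };

enum { OFF = 0, ON = 1 };

/* Flag shared between the parser actions and the scanner. */
int typedefRecognition = ON;

/* Hypothetical stand-in for the symbol table: one flat, unscoped list. */
#define MAX_TYPEDEFS 64
static const char *typedefNames[MAX_TYPEDEFS];
static int typedefCount = 0;

/* Record a name as a typedef. The pointer is stored, not copied, so the
   string must outlive the table. Returns 0 on success, -1 if full. */
int declareTypedef(const char *name)
{
    if (typedefCount >= MAX_TYPEDEFS)
        return -1;
    typedefNames[typedefCount++] = name;
    return 0;
}

/* Simplified lookup: a single scope, linear search. */
int isTypedefName(const char *lexeme)
{
    for (int i = 0; i < typedefCount; i++)
        if (strcmp(typedefNames[i], lexeme) == 0)
            return 1;
    return 0;
}

/* Called once the scanner has matched an identifier-shaped lexeme. */
int classifyIdentifier(const char *lexeme)
{
    int tokenToReturn = tIdentifer;

    if (typedefRecognition == ON && isTypedefName(lexeme)) {
        typedefRecognition = OFF;   /* the parser's typedefRecOn action re-enables it */
        tokenToReturn = tTypedefName;
    }
    return tokenToReturn;
}
```

With the flag OFF, classifyIdentifier returns tIdentifer even for a name
that is in the table; only after a grammar action such as typedefRecOn
sets the flag back to ON will the same lexeme come back as tTypedefName.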


Hope this is helpful,


Jeff




----------------------------------------------------------------------
Jeff Woods
Software Engineer (214) 497-4501
Software Development Tools Group (214) 497-4500 FAX
CONVEX Computer Corporation jwoods@convex.COM
--

