Re: Lex , Yacc Problem

blackmarlin@asean-mail.com (C)
23 Sep 2003 12:56:26 -0400

          From comp.compilers


From: blackmarlin@asean-mail.com (C)
Newsgroups: comp.compilers
Date: 23 Sep 2003 12:56:26 -0400
Organization: http://groups.google.com/
References: 03-09-051
Keywords: lex, parse
Posted-Date: 23 Sep 2003 12:56:26 EDT

dbansidhar@indiainfo.com (Bansidhar) wrote in message news:03-09-051...


[snip]


> Requirement : Once I encounter second #if I want to insert a string "
> \n " in the input so that lexer will see the second #if statement as
> #if 0 ((test). How can I achieve this functionality in lex. Can I
> insert some string in input file in the run time ? Or is there any
> other way to achieve the similar functionality..


I cannot think of a way to achieve this in Lex alone.
When I implemented a C-style preprocessor (as you are doing), I
wrote a custom filter which sits between the scanner and the
parser proper. It is a simple state machine with an additional
LIFO stack for state storage; it parses the preprocessor
constructs and decides whether or not to pass tokens to the
main parser (it also did a few other bits and bobs not relevant
to this discussion).


The states are as follows:
    NORMAL     ( pass tokens directly to main parser )
    NORMAL_IF  ( as NORMAL, but inside an IF statement )
    IGNORE_IF  ( ignore tokens, except #elif (true) or #else )
    IGNORE_ALL ( ignore all tokens )


#IF always pushes the current state onto the stack (even in
IGNORE_ALL), then evaluates whether the new state should be
NORMAL_IF, IGNORE_IF or IGNORE_ALL, depending on the conditional
and the pushed state.


#ENDIF always pops the state. (A stack underflow indicates an
error).


#ELSE switches NORMAL_IF to IGNORE_ALL and IGNORE_IF to
NORMAL_IF; IGNORE_ALL is unchanged. (Other states indicate an
error.)


#ELIF switches NORMAL_IF to IGNORE_ALL and IGNORE_IF to
NORMAL_IF (provided the conditional is true); IGNORE_ALL
is unchanged. (Other states indicate an error.)
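
The transitions described above can be sketched in C roughly as
follows. This is only an illustration of the rules as stated in this
post, not the original code: the names PPFilter, pp_if, pp_else,
pp_elif, pp_endif and pp_passing, and the fixed nesting limit
PP_MAX_DEPTH, are all assumptions made for the example. Each function
returns false where the text says "error".

```c
#include <stdbool.h>
#include <stddef.h>

/* The four states listed in the post. */
typedef enum { NORMAL, NORMAL_IF, IGNORE_IF, IGNORE_ALL } PPState;

#define PP_MAX_DEPTH 64            /* assumed nesting limit */

typedef struct {
    PPState state;                 /* current state */
    PPState stack[PP_MAX_DEPTH];   /* LIFO stack of saved states */
    size_t  depth;                 /* number of saved states */
} PPFilter;

void pp_init(PPFilter *f) { f->state = NORMAL; f->depth = 0; }

/* True when tokens should be passed on to the main parser. */
bool pp_passing(const PPFilter *f)
{
    return f->state == NORMAL || f->state == NORMAL_IF;
}

/* #IF: always push the current state, then pick the new one.
   cond is only meaningful when the enclosing state is passing. */
bool pp_if(PPFilter *f, bool cond)
{
    if (f->depth == PP_MAX_DEPTH)
        return false;              /* nesting too deep */
    bool outer_passing = pp_passing(f);
    f->stack[f->depth++] = f->state;
    if (!outer_passing)
        f->state = IGNORE_ALL;     /* enclosing block is ignored */
    else
        f->state = cond ? NORMAL_IF : IGNORE_IF;
    return true;
}

/* #ENDIF: always pop; underflow is an error. */
bool pp_endif(PPFilter *f)
{
    if (f->depth == 0)
        return false;
    f->state = f->stack[--f->depth];
    return true;
}

/* #ELSE: NORMAL_IF -> IGNORE_ALL, IGNORE_IF -> NORMAL_IF. */
bool pp_else(PPFilter *f)
{
    switch (f->state) {
    case NORMAL_IF:  f->state = IGNORE_ALL; return true;
    case IGNORE_IF:  f->state = NORMAL_IF;  return true;
    case IGNORE_ALL: return true;  /* unchanged */
    default:         return false; /* #ELSE outside any #IF */
    }
}

/* #ELIF: as #ELSE, but IGNORE_IF only changes if cond is true. */
bool pp_elif(PPFilter *f, bool cond)
{
    switch (f->state) {
    case NORMAL_IF:  f->state = IGNORE_ALL; return true;
    case IGNORE_IF:  if (cond) f->state = NORMAL_IF; return true;
    case IGNORE_ALL: return true;  /* unchanged */
    default:         return false; /* #ELIF outside any #IF */
    }
}
```

The filter would call pp_passing() for every ordinary token and
forward it to the parser only when that returns true. Note that a
taken branch moves to IGNORE_ALL rather than IGNORE_IF on #ELSE or
#ELIF, which is what prevents a later #ELIF from being taken too.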


With very little extra work #INCLUDE can be supported (though
this would be more difficult with Lex/Flex -- I always use a
custom-written scanner, which avoids the buffering problems).


#DEFINE and #UNDEF may also be implemented (#IFDEF / #IFNDEF are
virtually identical to #IF). Playback is fairly easy: add the
states PLAYBACK (for when a #DEFINEd identifier is found) and
PLAYSKIP (to skip to the end of a definition), then intercept
#DEFINEd identifiers before they reach the parser.


The problem with this approach is that complex expressions
for the #IF conditional are not supported. You could use
feedback from the parser to change the states on an ignored
#IF, though this may break constructs like the following (in C):


    if( test == FALSE )
        fprintf( stderr, "failure :-( "
#IFDEF DEBUG
            "[%s:%u]", __FILE__, __LINE__
#ENDIF
            );


(If, as you say, you are implementing an assembler, then
this should not be a problem, provided you take the path
chosen by most assemblers and use \n as a statement
terminator.)


Alternatively, you could implement the preprocessor as an
entirely separate programme -- though this would result in
an increase in compilation time which may be unacceptable.


(As my preprocessor did not require anything more complex
than integer comparison, I took the easy way out and did not
implement it.)
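
For illustration, a conditional restricted to a single integer
comparison could be evaluated without any help from the parser,
along the following lines. This is a sketch under assumptions, not
the code from my preprocessor: the function name pp_eval_compare,
the accepted operator set (==, !=, <, <=, >, >=) and the use of
strtol for the operands are all made up for the example.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Evaluate "<integer> <op> <integer>".
   Returns false if the text is malformed; otherwise stores the
   truth value of the comparison in *result and returns true. */
bool pp_eval_compare(const char *text, bool *result)
{
    char *end;

    /* Left operand; strtol skips leading white space itself. */
    long lhs = strtol(text, &end, 0);
    if (end == text)
        return false;              /* no digits found */

    while (*end == ' ' || *end == '\t')
        end++;

    /* Operator: one or two characters drawn from = ! < > */
    size_t n = strspn(end, "=!<>");
    if (n == 0 || n > 2)
        return false;
    char op[3] = { 0 };
    memcpy(op, end, n);

    /* Right operand. */
    const char *rhs_text = end + n;
    long rhs = strtol(rhs_text, &end, 0);
    if (end == rhs_text)
        return false;

    /* Nothing but white space may follow. */
    while (*end == ' ' || *end == '\t')
        end++;
    if (*end != '\0')
        return false;

    if      (strcmp(op, "==") == 0) *result = (lhs == rhs);
    else if (strcmp(op, "!=") == 0) *result = (lhs != rhs);
    else if (strcmp(op, "<")  == 0) *result = (lhs <  rhs);
    else if (strcmp(op, "<=") == 0) *result = (lhs <= rhs);
    else if (strcmp(op, ">")  == 0) *result = (lhs >  rhs);
    else if (strcmp(op, ">=") == 0) *result = (lhs >= rhs);
    else
        return false;              /* e.g. "=!" or "<>" */
    return true;
}
```

The truth value obtained this way is what the filter would use as
the conditional when handling #IF or #ELIF; anything beyond one
comparison is rejected as malformed.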


C
2003/9/15

