Re: yyless and pattern matching

rkrayhawk@aol.com (RKRayhawk)
1 Apr 2000 14:18:35 -0500

          From comp.compilers

Related articles
yyless and pattern matching mballen@NOSPAM_erols.com (Michael B. Allen) (2000-03-23)
Re: yyless and pattern matching cfc@world.std.com (Chris F Clark) (2000-03-25)
Re: yyless and pattern matching rkrayhawk@aol.com (2000-04-01)

From: rkrayhawk@aol.com (RKRayhawk)
Newsgroups: comp.compilers
Date: 1 Apr 2000 14:18:35 -0500
Organization: AOL http://www.aol.com
References: 00-03-104
Keywords: lex, parse

I wonder if there might be a workaround for you: push back two
characters (a minus 2 adjustment to yyleng) in the exit logic of the
exclusive state. As in ...


<X>[^#] {
        yyless( yyleng - 2 );
        printf( "no_longer_comment{%s}", yytext );
        BEGIN INITIAL;
      }


With the logic you have displayed, this will float a linefeed back
into the INITIAL state rule set. I reckon you have posted a
simplified version of your scanner to isolate the problem for us.


So I can't predict what devastating effect this trick might have on
your outer rules, but if in INITIAL you can tolerate strictly blank
lines, then maybe it will work. (If you are doing line counting out in
INITIAL, you may need an intermediate buffer 'state' to absorb the
spurious linefeed.) As in ...




%x BUFFER_NL


<X>[^#] {
        yyless( yyleng - 2 );
        printf( "no_longer_comment{%s}", yytext );
        BEGIN BUFFER_NL;
      }




<BUFFER_NL>\n {
        /* don't count the spurious linefeed */
        BEGIN INITIAL;
      }


\n { /*count lines in INITIAL state */
  my_line_count++;}


That is not pretty, nor have I tested it. But it seems like it could
get the at-beginning-of-line logic to toggle the way you want.




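The expression
    yyless( yyleng - 2 )
only works within a context where all other rules in the <X> state
have previously just consumed the linefeed (as you illustrated with
your possibly reduced posted scanner). Note that when the rule matches
a single character, yyleng is 1 and the argument comes out as -1.
yyless is only documented for arguments from 0 to yyleng, so the trick
leans on the previous linefeed still sitting in the scanner's buffer
just ahead of the current match.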
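
For comparison, within its documented range yyless(n) keeps the first
n characters of the match and hands the rest back to the input. A
minimal sketch, assuming flex; it follows the usual illustration from
the flex manual and is not part of the scanner under discussion. The
%option line and main() are only there so that it builds on its own.

```lex
%option noyywrap
%%
foobar    { ECHO; yyless(3); /* echo "foobar", then push "bar" back */ }
[a-z]+    { ECHO;            /* rescans the pushed-back "bar" */ }
%%
int main(void) { yylex(); return 0; }
```

Given the input foobar, this writes foobarbar: the second rule sees
the three characters that yyless(3) returned.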


Alternatively, again as a workaround, you might consider letting the
scanner do the lookahead for you, perhaps by starting with a rule that
matches '#' plus anything that is not a newline. Then look for the two
possible continuations that make sense:


    \n# is more comments
    \n(whitespace)# maybe is more (my guess)


Anything else is not a next line with a comment, so push it all back
onto the input (that gets the linefeed back, just as speculated
above), and then trigger BEGIN INITIAL.
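
The lookahead idea above might be sketched roughly as follows. This is
untested and assumes flex. The state name X and my_line_count are
borrowed from the earlier fragments; the printf format, main() and the
decision to accept leading blanks or tabs before a continuation '#'
are my own guesses, not taken from your scanner.

```lex
%option noyywrap
%x X
%{
#include <stdio.h>
static int my_line_count = 0;   /* hypothetical line counter */
%}
%%
^#[^\n]*            { /* first comment line, up to but not including \n */
                      printf("comment{%s}", yytext);
                      BEGIN(X); }
<X>\n[ \t]*#[^\n]*  { /* next line is a comment too, so this linefeed is real */
                      my_line_count++;
                      printf("comment{%s}", yytext + 1); }
<X>\n               { /* anything else: give the linefeed back to INITIAL */
                      yyless(0);
                      BEGIN(INITIAL); }
\n                  { my_line_count++; ECHO; }
.                   { ECHO; }
%%
int main(void)
{
    yylex();
    printf("lines: %d\n", my_line_count);
    return 0;
}
```

Because each comment rule stops short of the linefeed, the only thing
that ever has to be pushed back is that one linefeed, and yyless(0)
stays inside its documented range. The linefeed is then counted once,
in INITIAL, so no extra state is needed to absorb a spurious one.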


Best Wishes,


Robert Rayhawk
RKRayhawk@aol.com

