Related articles |
---|
2 word token as one in lex makotosu@my-deja.com (2000-07-18) |
Re: 2 word token as one in lex troy@bell-labs.com (Troy Cauble) (2000-07-23) |
Re: 2 word token as one in lex james.d.carlson@sun.com (James Carlson) (2000-07-23) |
Re: 2 word token as one in lex kszabo@nortelnetworks.com (Kevin Szabo) (2000-07-23) |
From: | James Carlson <james.d.carlson@sun.com> |
Newsgroups: | comp.compilers |
Date: | 23 Jul 2000 16:57:11 -0400 |
Organization: | Sun Microsystems Inc. - BDC |
References: | 00-07-034 |
Keywords: | parse |
makotosu@my-deja.com writes:
> if the lexer sees UNION and the next token (after any # of whitespaces,
> tabs and newlines) is JOIN it should return UNION_JOIN
Why? The parser should be able to handle this, shouldn't it?
> I want to do this in the lexical analyzer, not the parser. Is this
> possible? I was thinking of using exclusive states but I could not get
> it working, I did something like
Here's something that's equivalent to what you wrote and does what you
ask (note that it doesn't handle word-ends at all correctly; but
neither did the previous example).
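%{
/* Link with -ll (or -lfl for flex) to get the default yywrap(). */
#include <stdio.h>
%}
%%
"UNION"                 { puts("union-alone"); }
"UNION"[ \t\n]+"JOIN"   { puts("union-join"); /* longest match beats "UNION" */ }
"JOIN"                  { puts("join-alone"); }
%%
int
main(int argc, char **argv)
{
    yyin = stdin;   /* read from standard input */
    yylex();        /* unmatched input is echoed by the default rule */
    return 0;
}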
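
For comparison, here is a hand-written sketch in plain C of the same lookahead, with the word-end handling that the lex rules above skip. It is only an illustration under stated assumptions, not something from the original thread: the names (enum token, is_word_char, match_word, next_token) are hypothetical, a "word" is assumed to be letters, digits, and underscores, and comments between UNION and JOIN are still not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical token codes for this sketch; not from the original post. */
enum token { TOK_EOF, TOK_UNION, TOK_JOIN, TOK_UNION_JOIN, TOK_WORD };

/* Assumption: a word is made of letters, digits, and underscores. */
static int is_word_char(int c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* If the whole word kw starts at s, return its length; otherwise 0. */
static size_t match_word(const char *s, const char *kw)
{
    size_t n = strlen(kw);
    if (strncmp(s, kw, n) != 0 || is_word_char(s[n]))
        return 0;   /* different text, or kw is only a prefix (UNIONS) */
    return n;
}

/* Scan one token starting at *pp and advance *pp past it. */
enum token next_token(const char **pp)
{
    assert(pp != NULL && *pp != NULL);
    const char *p = *pp;
    enum token tok;
    size_t n;

    /* Skip anything that cannot start a word (whitespace, punctuation). */
    while (*p != '\0' && !is_word_char(*p))
        p++;

    if (*p == '\0') {
        tok = TOK_EOF;
    } else if ((n = match_word(p, "UNION")) != 0) {
        const char *q;
        size_t m;
        p += n;
        q = p;
        tok = TOK_UNION;
        /* Look ahead: one or more blanks, tabs, or newlines, then JOIN. */
        while (*q == ' ' || *q == '\t' || *q == '\n')
            q++;
        if (q != p && (m = match_word(q, "JOIN")) != 0) {
            p = q + m;              /* consume the whitespace and JOIN too */
            tok = TOK_UNION_JOIN;
        }
    } else if ((n = match_word(p, "JOIN")) != 0) {
        p += n;
        tok = TOK_JOIN;
    } else {
        while (is_word_char(*p))    /* any other word, e.g. UNIONS or JOINT */
            p++;
        tok = TOK_WORD;
    }
    *pp = p;
    return tok;
}
```

Called repeatedly on a string, next_token() advances the caller's pointer one token at a time. When the lookahead fails, the pointer is left just after UNION, so the following word is scanned again on the next call rather than being lost.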
--
James Carlson, Internet Engineering <james.d.carlson@east.sun.com>
SUN Microsystems / 1 Network Drive 71.234W Vox +1 781 442 2084
MS UBUR02-212 / Burlington MA 01803-2757 42.497N Fax +1 781 442 1677
[Still doesn't handle comments. -John]