Related articles |
---|
Using multiple lexers/scanners to parse one language. usenet@kikobu.com (Morten) (2002-05-12) |
Re: Using multiple lexers/scanners to parse one language. kshiva@synopsys.com (Kintali Shiva Prasad) (2002-05-17) |
From: | Morten <usenet@kikobu.com> |
Newsgroups: | comp.compilers |
Date: | 12 May 2002 00:10:48 -0400 |
Organization: | TDC Internet |
Keywords: | lex, |
Posted-Date: | 12 May 2002 00:10:48 EDT |
Hi. I've devised a small query language for hierarchic structures (XML).
Example: FOR /home/ WHERE this()/name() = "morten" RETURN this();
For the first part of that query, the following tokens get returned:
FOR SLASH HOME SLASH WHERE
I need to tokenize on slash because I need to identify each step in
the path when building the parse tree. The following, however, gives
me problems, because path steps named "where" come back as the WHERE
keyword:
FOR /where/where/ WHERE ...
FOR SLASH WHERE SLASH WHERE SLASH WHERE
My question is this: is it conceptually wrong to handle my language
as two languages, a query language and a path language? I can modify
my lexer to return FOR PATH WHERE, without tokenizing on slash, then
feed the PATH token to a second lexer/parser pair, and have that
tokenize on slash and build the parse tree for the PATH token.
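
The two-lexer idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not the poster's code: the names lex_query and lex_path, the STEP token kind, and the case-insensitive keyword matching are all assumptions made for the example, and only the FOR ... WHERE fragment of the query is covered.

```python
import re

# Outer (query) lexer. A path is returned as one opaque PATH token, so a
# step named "where" can never be mistaken for the WHERE keyword.
# Keyword matching is case-insensitive here, which is an assumption.
OUTER = re.compile(
    r"""
      (?P<WS>\s+)
    | (?P<PATH>/\S*)
    | (?P<KEYWORD>(?:FOR|WHERE|RETURN)\b)
    """,
    re.VERBOSE | re.IGNORECASE,
)


def lex_query(text):
    """Return (kind, lexeme) pairs for the outer query language."""
    tokens, pos = [], 0
    while pos < len(text):
        m = OUTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected {text[pos]!r} at offset {pos}")
        if m.lastgroup == "PATH":
            tokens.append(("PATH", m.group()))
        elif m.lastgroup == "KEYWORD":
            tokens.append((m.group().upper(), m.group()))
        pos = m.end()  # whitespace is skipped
    return tokens


def lex_path(path):
    """Inner (path) lexer: split the lexeme of a PATH token on slashes."""
    tokens = []
    for part in re.split(r"(/)", path):
        if part == "/":
            tokens.append(("SLASH", "/"))
        elif part:
            tokens.append(("STEP", part))
    return tokens


if __name__ == "__main__":
    for kind, lexeme in lex_query("FOR /where/where/ WHERE"):
        print(kind, lex_path(lexeme) if kind == "PATH" else lexeme)
```

The second lexer only ever sees the text of a PATH token, so it needs no knowledge of the query keywords at all.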
What is the common approach to this type of problem?
Thanks
Morten
[Use start states in the lexer, which is morally equivalent to two
lexers but easier to implement. -John]
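
As a rough sketch of the start-state suggestion, the Python below imitates what a lex/flex start condition does: a single lexer keeps a current state, and each state has its own rule table. In flex this would normally be written with a %x declaration and BEGIN actions; the state names INITIAL and PATH, the NAME token kind, the rule-table layout, and the choice to leave the PATH state on whitespace are all assumptions made for the illustration, and only the FOR ... WHERE fragment is handled.

```python
import re

# One rule table per start state. Each rule is (pattern, kind, next_state).
# kind None means "skip the match"; next_state None means "stay put".
RULES = {
    "INITIAL": [
        (re.compile(r"\s+"), None, None),
        (re.compile(r"(?:FOR|WHERE|RETURN)\b", re.IGNORECASE), "KEYWORD", None),
        (re.compile(r"/"), "SLASH", "PATH"),       # a slash starts a path
    ],
    "PATH": [
        (re.compile(r"/"), "SLASH", None),
        (re.compile(r"[A-Za-z_]\w*"), "NAME", None),  # keywords are plain names here
        (re.compile(r"\s+"), None, "INITIAL"),     # whitespace ends the path
    ],
}


def tokenize(text):
    """Return (kind, lexeme) pairs, switching rule tables by start state."""
    state, pos, tokens = "INITIAL", 0, []
    while pos < len(text):
        for pattern, kind, next_state in RULES[state]:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            raise SyntaxError(
                f"unexpected {text[pos]!r} at offset {pos} in state {state}"
            )
        if kind == "KEYWORD":
            tokens.append((m.group().upper(), m.group()))
        elif kind is not None:
            tokens.append((kind, m.group()))
        if next_state is not None:
            state = next_state
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("FOR /where/where/ WHERE"):
        print(token)
```

Because the keyword rule exists only in the INITIAL table, "where" inside a path is returned as an ordinary NAME, while the parser still sees every SLASH as its own token.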