|Using multiple lexers/scanners to parse one language. firstname.lastname@example.org (Morten) (2002-05-12)|
|Re: Using multiple lexers/scanners to parse one language. email@example.com (Kintali Shiva Prasad) (2002-05-17)|
|Date:||12 May 2002 00:10:48 -0400|
|Posted-Date:||12 May 2002 00:10:48 EDT|
Hi. I've devised a small query language for hierarchic structures (XML).
Example: FOR /home/ WHERE this()/name() = "morten" RETURN this();
For the first part of that query, the following tokens get returned:
FOR SLASH HOME SLASH WHERE
I need to tokenize on slash, as I need to be able to identify each
step in the path when building the parse tree. The following, however,
gives me problems, because a path step named "where" comes back as the
keyword token WHERE:
FOR /where/where/ WHERE ...
FOR SLASH WHERE SLASH WHERE SLASH WHERE
My question is this: is it conceptually wrong to handle my language
as two languages, a query language and a path language? I could modify
my lexer to return FOR PATH WHERE, without tokenizing on slash, then
feed the PATH token to a second lexer/parser pair that tokenizes on
slash and builds the parse tree for the path.
What is the common approach to this type of problem?
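To make the two-lexer idea concrete, here is a minimal sketch in Python.
It is only an illustration under assumptions that are not in the
language description above: a path is taken to be a slash followed by
any run of non-whitespace characters, keywords are matched
case-insensitively, and the names lex_query, lex_path, NAME and STEP
are made up for the example.

```python
import re

KEYWORDS = {"FOR", "WHERE", "RETURN"}

# Outer lexer: a whole path is ONE token (assumed: '/' plus non-space chars).
OUTER = re.compile(r"\s*(?:(?P<PATH>/\S*)|(?P<WORD>[A-Za-z_]\w*))")


def lex_query(text):
    """First lexer: returns keywords, names and unsplit PATH tokens."""
    text = text.rstrip()
    pos, tokens = 0, []
    while pos < len(text):
        m = OUTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.lastgroup == "PATH":
            tokens.append(("PATH", m.group("PATH")))
        else:
            word = m.group("WORD")
            kind = word.upper() if word.upper() in KEYWORDS else "NAME"
            tokens.append((kind, word))
        pos = m.end()
    return tokens


def lex_path(path):
    """Second lexer: splits the text of one PATH token on slashes."""
    tokens = []
    for part in re.split(r"(/)", path):
        if part == "/":
            tokens.append(("SLASH", "/"))
        elif part:  # skip the empty strings re.split yields at the ends
            tokens.append(("STEP", part))
    return tokens


outer = lex_query("FOR /where/where/ WHERE")
inner = lex_path(outer[1][1])
```

Here the outer lexer never looks inside the path, so a step called
"where" cannot be mistaken for the keyword; only lex_path sees it.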
[Use start states in the lexer, which is morally equivalent to two
lexers but easier to implement. -John]
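A rough sketch of the start-state idea, written as a hand-rolled Python
lexer rather than a lex specification. In flex the same thing would
typically be done with an exclusive start condition (%x) and BEGIN.
The assumptions here are illustrative only: a slash enters the PATH
state, whitespace returns to the INITIAL state, keywords are matched
case-insensitively, and the NAME token and the tokenize function are
hypothetical names.

```python
import re

KEYWORDS = {"FOR", "WHERE", "RETURN"}
WORD = re.compile(r"[A-Za-z_]\w*")


def tokenize(text):
    """Single lexer with two states: INITIAL (query) and PATH."""
    state = "INITIAL"
    pos, tokens = 0, []
    while pos < len(text):
        ch = text[pos]
        if ch.isspace():
            state = "INITIAL"  # assumed: whitespace ends a path
            pos += 1
        elif ch == "/":
            state = "PATH"  # a slash switches to the path rules
            tokens.append(("SLASH", "/"))
            pos += 1
        else:
            m = WORD.match(text, pos)
            if m is None:
                raise SyntaxError(f"unexpected character at offset {pos}")
            word = m.group()
            if state == "INITIAL" and word.upper() in KEYWORDS:
                tokens.append((word.upper(), word))
            else:
                # In PATH state keywords are not recognized at all.
                tokens.append(("NAME", word))
            pos = m.end()
    return tokens


kinds = [kind for kind, _ in tokenize("FOR /where/where/ WHERE")]
```

With these rules the troublesome query comes out as
FOR SLASH NAME SLASH NAME SLASH WHERE, so the parser still sees every
step of the path but only the final WHERE is a keyword.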