Using multiple lexers/scanners to parse one language.

Morten <usenet@kikobu.com>
12 May 2002 00:10:48 -0400

          From comp.compilers

Related articles
Using multiple lexers/scanners to parse one language. usenet@kikobu.com (Morten) (2002-05-12)
Re: Using multiple lexers/scanners to parse one language. kshiva@synopsys.com (Kintali Shiva Prasad) (2002-05-17)
From: Morten <usenet@kikobu.com>
Newsgroups: comp.compilers
Date: 12 May 2002 00:10:48 -0400
Organization: TDC Internet
Keywords: lex,
Posted-Date: 12 May 2002 00:10:48 EDT

Hi. I've devised a small query language for hierarchic structures (XML).
Example: FOR /home/ WHERE this()/name() = "morten" RETURN this();


For the first part of that query, the following tokens get returned:


FOR SLASH HOME SLASH WHERE


I need to tokenize on slash so that I can identify each step in the
path when building the parse tree. The following query, however, gives
me problems, because the path steps named "where" come back as the
WHERE keyword:


FOR /where/where/ WHERE ...


FOR SLASH WHERE SLASH WHERE SLASH WHERE


My question is this: is it conceptually wrong to handle my language
as two languages, a query language and a path language? I could modify
my lexer to return FOR PATH WHERE by not tokenizing on slash, then
feed the PATH token to a second lexer/parser pair that tokenizes on
slash and builds the parse tree for the path.
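
The two-lexer idea described above might look roughly like the sketch below. It is only an illustration in Python, not the poster's implementation: the names lex_query and lex_path, the NAME and STEP token kinds, the case-insensitive keyword set, and the assumption that a path always starts and ends with a slash are all hypothetical. It covers only the FOR ... WHERE fragment of the query.

```python
import re

# Assumption: keywords are matched case-insensitively.
KEYWORDS = {"for", "where", "return"}

# Outer lexer rule: a whole path such as /where/where/ is one PATH lexeme,
# so its steps are never compared against the keyword list.
OUTER = re.compile(r"\s*(?:(?P<path>/(?:\w+/)*)|(?P<word>\w+))")

def lex_query(text):
    """Yield (kind, lexeme) pairs; paths are left unsplit."""
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = OUTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        if m.group("path") is not None:
            yield ("PATH", m.group("path"))
        else:
            word = m.group("word")
            kind = word.upper() if word.lower() in KEYWORDS else "NAME"
            yield (kind, word)

def lex_path(path):
    """Inner lexer: split one PATH lexeme into SLASH and STEP tokens."""
    tokens = []
    for piece in re.findall(r"/|[^/]+", path):
        tokens.append(("SLASH", "/") if piece == "/" else ("STEP", piece))
    return tokens

outer_tokens = list(lex_query("FOR /where/where/ WHERE"))
inner_tokens = lex_path(outer_tokens[1][1])
```

Here the outer pass sees only FOR PATH WHERE, and the inner pass turns the PATH lexeme into SLASH STEP SLASH STEP SLASH, so a step named "where" never collides with the keyword.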


What is the common approach to this type of problem?


Thanks


Morten
[Use start states in the lexer, which is morally equivalent to two
lexers but easier to implement. -John]
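
For readers unfamiliar with start states, here is a rough Python emulation of the idea, offered as a sketch rather than as lex input. One lexer keeps a current state, and the state selects which rule table applies. The state names, the STEP and NAME token kinds, the case-insensitive keyword set, and the rule that whitespace ends a path are all assumptions made for illustration; the sketch handles only the FOR ... WHERE fragment.

```python
import re

# Assumption: keywords are matched case-insensitively.
KEYWORDS = {"for", "where", "return"}

# One rule table per start state; each rule is (regex, token kind).
RULES = {
    "INITIAL": [(re.compile(r"\w+"), "WORD"), (re.compile(r"/"), "SLASH")],
    "PATH":    [(re.compile(r"\w+"), "STEP"), (re.compile(r"/"), "SLASH")],
}
WHITESPACE = re.compile(r"\s+")

def tokenize(text):
    """Return (kind, lexeme) pairs, switching rule tables by state."""
    state, pos, out = "INITIAL", 0, []
    while pos < len(text):
        ws = WHITESPACE.match(text, pos)
        if ws:
            pos = ws.end()
            state = "INITIAL"      # assumption: whitespace ends a path
            continue
        for pattern, kind in RULES[state]:
            m = pattern.match(text, pos)
            if m:
                break
        else:
            raise SyntaxError(f"unexpected character at offset {pos}")
        lexeme = m.group()
        pos = m.end()
        if kind == "SLASH":
            state = "PATH"         # a slash enters (or stays in) path state
            out.append(("SLASH", lexeme))
        elif kind == "STEP":
            out.append(("STEP", lexeme))  # never checked against keywords
        else:
            is_keyword = lexeme.lower() in KEYWORDS
            out.append((lexeme.upper() if is_keyword else "NAME", lexeme))
    return out

tokens = tokenize("FOR /where/where/ WHERE")
```

The parser still receives a SLASH token for every path step, but words seen in the PATH state come back as STEP, so only the final WHERE is reported as the keyword.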


