Related articles
ASP-style grammar o_dwyer_john@hotmail.com (2003-03-30)
Re: ASP-style grammar 6667@wp.pl (kat-Zygfryd) (2003-03-30)

From: o_dwyer_john@hotmail.com (John O'Dwyer)
Newsgroups: comp.compilers
Date: 30 Mar 2003 00:44:49 -0500
Organization: http://groups.google.com/
Keywords: parse, question, comment
Posted-Date: 30 Mar 2003 00:44:49 EST

I'm having difficulties building an ASP-style grammar to parse
something like this:
some boilerplate markup some boilerplate markup
some boilerplate markup some boilerplate markup
<% script code goes here %>
some boilerplate markup some boilerplate markup
some boilerplate markup some boilerplate markup
I want to use production rules:
file -> contents
contents -> content
contents -> contents, content
content -> boilerplate_markup
content -> script_block
script_block -> open_tag, script_expressions, close_tag
And regular expressions:
'(.|\n)*' = boilerplate_markup // any char, including newline
'<%' = open_tag
'%>' = close_tag
Problem is, the lexer classifies the whole input as boilerplate_markup
rather than reducing at the open_tag (which then pushes a new lexer
for the script expressions).
I can get it to work if I reclassify the boilerplate as a single char:
'(.|\n)' = boilerplate_markup
But this creates an unmanageably large parse tree.
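
The two behaviours described above can be reproduced with a minimal sketch. It assumes a longest-match ("maximal munch") lexer, emulated here with Python's re module. The tokenize helper, the SAMPLE string and the rule lists are hypothetical names made up for illustration; they are not part of any lexer generator, and no separate script lexer is modelled.

```python
import re

def tokenize(rules, text):
    """Maximal-munch scan: at each offset the longest match wins,
    and the earliest rule wins when two matches have equal length."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    tokens, pos = [], 0
    while pos < len(text):
        name, length = None, 0
        for rule_name, regex in compiled:
            m = regex.match(text, pos)
            if m and m.end() - pos > length:
                name, length = rule_name, m.end() - pos
        if name is None:
            raise ValueError(f"no rule matches at offset {pos}")
        tokens.append((name, text[pos:pos + length]))
        pos += length
    return tokens

# Hypothetical sample input, modelled on the example in the post.
SAMPLE = ("some boilerplate markup\n"
          "<% script code goes here %>\n"
          "some boilerplate markup\n")

TAGS = [("open_tag", r"<%"), ("close_tag", r"%>")]

# Original pattern: one boilerplate token swallows the whole input.
greedy = tokenize([("boilerplate_markup", r"(.|\n)*")] + TAGS, SAMPLE)

# Single-character workaround: the tags are found, but every other
# character becomes its own token.
single = tokenize([("boilerplate_markup", r"(.|\n)")] + TAGS, SAMPLE)

print(len(greedy), "token(s) with the greedy pattern")
print(len(single), "tokens with the single-character pattern")
```

With the greedy pattern the open_tag rule never gets a chance, because a longer boilerplate match is always available at offset 0. With the single-character pattern the two-character tags beat the one-character boilerplate rule, but the token count grows with the size of the input.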
I guess I need to rephrase the regular expressions, but I've tried
everything I can think of without success.
Any help would be gratefully received!
Many thanks in advance,
John.
[Most lexers read the largest chunk they can, so your original pattern
slurps right across the <% marker. Try something like this to force it
to stop and look at the <:
'(([^<]|\n)*|<)' = boilerplate_markup // a run of chars other than <, or a lone <
-John]
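
A rough sketch of how the suggested pattern behaves under longest-match rules, again emulated with Python's re module. Several things here are assumptions rather than part of the thread: the tokenize function, the script_text token name, and the SAMPLE string are hypothetical, and the "pushed" script lexer is replaced by a plain search for the closing %>. Because Python's re takes the first alternative that matches rather than the longest, the two alternatives of the pattern are listed as separate rules with the same token name.

```python
import re

# The two alternatives of '(([^<]|\n)*|<)' as separate rules, plus open_tag.
MARKUP_RULES = [
    ("boilerplate_markup", re.compile(r"([^<]|\n)*")),
    ("boilerplate_markup", re.compile(r"<")),
    ("open_tag", re.compile(r"<%")),
]

def tokenize(text):
    """Longest match wins; the earliest rule wins on equal length."""
    tokens, pos = [], 0
    while pos < len(text):
        name, length = None, 0
        for rule_name, regex in MARKUP_RULES:
            m = regex.match(text, pos)
            # An empty match has length 0 and is never selected.
            if m and m.end() - pos > length:
                name, length = rule_name, m.end() - pos
        # Every character is either "<" or not "<", so some rule
        # always matches at least one character here.
        tokens.append((name, text[pos:pos + length]))
        pos += length
        if name == "open_tag":
            # Stand-in for the separate script lexer (an assumption):
            # take everything up to the next "%>" as one script_text token.
            end = text.find("%>", pos)
            if end == -1:
                raise ValueError("unterminated script block")
            tokens.append(("script_text", text[pos:end]))
            tokens.append(("close_tag", "%>"))
            pos = end + 2
    return tokens

# Hypothetical input containing a lone "<" that is not an open tag.
SAMPLE = "a < b markup\n<% script code goes here %>\nmore markup\n"

tokens = tokenize(SAMPLE)
for name, text in tokens:
    print(name, repr(text))
```

The boilerplate rule now stops at every <, so at that point the two-character open_tag match can beat the one-character lone-< match. A < that does not start a tag comes out as its own boilerplate_markup token, so the parser still sees a few extra tokens, but one per chunk rather than one per character.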