Related articles |
---|
Q1. reducing grammar redundancy spam@abelectron.com (valentin tihomirov) (2005-02-20) |
Re: Q1. reducing grammar redundancy wyrmwif@tsoft.org (SM Ryan) (2005-02-28) |
Re: Q1. reducing grammar redundancy vidar@hokstad.name (Vidar Hokstad) (2005-02-28) |
Re: Q1. reducing grammar redundancy branco.medeiros@gmail.com (2005-02-28) |
Re: Q1. reducing grammar redundancy spam@abelectron.com (valentin tihomirov) (2005-02-28) |
From: | "Vidar Hokstad" <vidar@hokstad.name> |
Newsgroups: | comp.compilers |
Date: | 28 Feb 2005 00:47:42 -0500 |
Organization: | http://groups.google.com |
References: | 05-02-086 |
Keywords: | parse |
Posted-Date: | 28 Feb 2005 00:47:42 EST |
valentin tihomirov wrote:
> Call me an optimization paranoid but...I have a question regarding
> the issue of grammar generalization to be used in language tools.
[snipped lisp example]
> [2] The issue conserning SGML redundancies. Please consider a basic XML
> element:
> element -> OPEN_PAREN "element" CLOSE_PAREN // opening tag
> element // body
> OPEN_PAREN SLASH "element" CLOSE_PAREN; // closing tag
> As you see, the closing tag duplicates the appearance of "element"
> keyword. I may mistake easily here, looks like a context-sensetive
> rule is needed here in order to avoid the repetition. How can it be
> written in more compact way? How can it be written in a more optimal
> form acceptable by parser?
A simple, general approach to removing this kind of redundancy where
it makes sense is to push some checks out of the parser itself. For
XML for instance, it's highly unusual to write specific parsers for
specific XML vocabularies. Instead you'd normally use a generic parser
to handle the overal syntax of the tags and other markup, and enforce
the pairing and nesting of elements without expressing them directly
in the formal grammar.
This is what allows for the flexibility of defining languages for
dynamically specifying the grammar while still allowing a wide variety
of tools to parse vocabularies they don't understand the meaning of.
But the big question is why you want to write this in a more compact
way? If you're going to hardwire an XML parser, writing someting like:
element: "<element>" element_content "</element>"
seems to me at least to be far preferrable to making the grammar
unreadable just because you want to remove a few characters here and
there.
Premature optimization and all that... Is there a specific problem that
make you want to do this?
Vidar
Return to the
comp.compilers page.
Search the
comp.compilers archives again.