Re: Q2. Why do you split a monolitic grammar into the lexing and parsing rules?

"Vidar Hokstad" <vidar@hokstad.name>
28 Feb 2005 00:48:14 -0500

          From comp.compilers


From: "Vidar Hokstad" <vidar@hokstad.name>
Newsgroups: comp.compilers
Date: 28 Feb 2005 00:48:14 -0500
Organization: http://groups.google.com
References: 05-02-087
Keywords: parse

valentin tihomirov wrote:
> So, I do not understand why do we need the artificial obstacle, the
> 2nd level?


There's nothing inherent in parsing that prevents you from writing
parsers without any distinction between lexical symbols and compound
rules. In fact, I've written several parsers that don't enforce any
difference at all.
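
To make that concrete, here is a minimal sketch of a parser with no
separate lexer. The grammar (sums of integers) and the function names
are made up for illustration; they are not taken from any of the
parsers mentioned above. Note how the whitespace skipping has to be
repeated inside the grammar rules themselves.

```python
# Scannerless sketch: the grammar rules read characters directly,
# so whitespace handling is interleaved with the rules.
# Hypothetical grammar: sum := number ('+' number)*

DIGITS = "0123456789"

def skip_ws(text, pos):
    """Advance pos past any whitespace and return the new offset."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos

def parse_number(text, pos):
    """Parse one integer starting at pos; return (value, new offset)."""
    pos = skip_ws(text, pos)
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if start == pos:
        raise SyntaxError(f"expected digit at offset {pos}")
    return int(text[start:pos]), pos

def parse_sum(text, pos=0):
    """Parse and evaluate the whole input as a sum of integers."""
    value, pos = parse_number(text, pos)
    pos = skip_ws(text, pos)
    while pos < len(text) and text[pos] == "+":
        rhs, pos = parse_number(text, pos + 1)
        value += rhs
        pos = skip_ws(text, pos)
    if pos != len(text):
        raise SyntaxError(f"unexpected character at offset {pos}")
    return value

print(parse_sum(" 1 + 20+3 "))  # prints 24
```

This works fine for a grammar this small, but every rule has to know
where whitespace may appear.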


Some points to keep in mind:
- It's often simpler to split the two. You can write a lexer and verify
it/debug it separately, and then nicely layer the parser on top.
- It's often nicer for error reporting to separate the two - you might
want to report on the full token expected, and point the user to a
token boundary.
- When hand-writing a parser, it is often much easier to understand if
the higher levels deal with a stream of tokens instead of having to
take into account issues such as whitespace, comments and
disambiguation of token boundaries, all of which can be filtered out
in the lexer.
- The split often has little practical effect, but presents a
conceptual separation that sometimes simplifies understanding of the
grammar.
- It's a split that, at least in my opinion, encourages good design:
it makes you think consciously about how tokens are recognized and how
to reduce ambiguity in the parser, and it promotes context-free
grammars.
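
As a rough illustration of the points above, here is a small sketch of
the split. The token set, the grammar and all names (TOKEN_RE,
tokenize, Parser, evaluate) are assumptions made up for this example,
not a recommended design. The lexer drops whitespace and '#' comments
and can be tested on its own; the parser only ever sees tokens and
reports errors in terms of the token it expected.

```python
import re

# Hypothetical token set for a tiny expression language.
TOKEN_RE = re.compile(r"""
    (?P<SKIP>\s+|\#[^\n]*)   # whitespace and '#' comments, dropped by the lexer
  | (?P<NUM>[0-9]+)
  | (?P<OP>[+*()])
""", re.VERBOSE)

def tokenize(text):
    """Return a list of (kind, value, offset) tuples ending with an EOF token."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group(), pos))
        pos = m.end()
    tokens.append(("EOF", "", pos))
    return tokens

class Parser:
    # Grammar: expr := term ('+' term)*
    #          term := atom ('*' atom)*
    #          atom := NUM | '(' expr ')'
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i]

    def expect(self, kind, value=None):
        """Consume the next token if it matches, else report at its boundary."""
        tok = self.peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            want = value or kind
            found = tok[1] or tok[0]
            raise SyntaxError(
                f"expected {want} but found {found!r} at offset {tok[2]}")
        self.i += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek()[:2] == ("OP", "+"):
            self.i += 1
            value += self.term()
        return value

    def term(self):
        value = self.atom()
        while self.peek()[:2] == ("OP", "*"):
            self.i += 1
            value *= self.atom()
        return value

    def atom(self):
        if self.peek()[:2] == ("OP", "("):
            self.i += 1
            value = self.expr()
            self.expect("OP", ")")
            return value
        return int(self.expect("NUM")[1])

def evaluate(text):
    """Lex, parse and evaluate text; the whole input must be consumed."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    parser.expect("EOF")
    return value

print(evaluate("2 * (3 + 4)  # comment"))  # prints 14
```

With this layering, an input such as "2 + * 3" fails with a message
that names the expected token kind and the offset where the offending
token starts, and none of the grammar rules mention whitespace or
comments.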


Vidar

