Related articles:
  Writing a parser/lexical analyzer builder. alex@alexandermorou.com (Alexander Morou) (2007-10-26)
  Re: Writing a parser/lexical analyzer builder. DrDiettrich1@aol.com (Hans-Peter Diettrich) (2007-10-27)
From: "Alexander Morou" <alex@alexandermorou.com>
Newsgroups: comp.compilers
Date: Fri, 26 Oct 2007 01:26:05 -0500
Organization: Compilers Central
Keywords: parse, tools, question
Posted-Date: 26 Oct 2007 09:39:12 EDT
Greetings,
I'm attempting to build a parser/lexical analyzer builder in C# on the
.NET Framework 2.0. Right now I'm in the conceptualization stage, and
I wanted to get a bit of insight before I get too far into building
code that might just blow up in my face.
I could easily write a simple parser that understands EBNF or BNF;
however, I want to ease the writing of grammar description files for
complex languages (say, C#: its expression system alone has eleven
different precedences; not just operators, -precedences-). To do so
I'm introducing my own variant of templates into the system.
I have defined a sample grammar in the following file:
http://lhq.rpgsource.net/text/csExpressions.oilexer
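For readers without the file handy, the usual way such a precedence
tower is expressed in BNF is as a chain of rules, each level deferring
to the next tighter one. A two-level fragment (illustrative only, not
taken from the linked grammar):

```
AddExp    ::= AddExp ('+' | '-') MulDivExp | MulDivExp ;
MulDivExp ::= MulDivExp ('*' | '/') Primary | Primary ;
Primary   ::= Number | '(' AddExp ')' ;
```

With eleven such levels, any notation that cuts down the repetition
(templates, token categories) starts to look attractive.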
The project intends to use recursive descent to handle parsing. I
realize that a left-recursive rule such as

    AddExp ::= AddExp AddOperators MulDivExp | MulDivExp

will, in naive recursive descent, recurse infinitely without
precautionary measures. To solve this I decided to add a
'_Continuous' parse case to self-referencing First-targets, and as a
pseudo mockup I decided upon:
http://lhq.rpgsource.net/text/aeTest.txt
Here AddExp references another production rule, and MulDiv references
a token. You'll note the difference in their proposed
implementations: one strictly uses lookahead from the local tokenizer
stock, while the other uses a parse method to determine the same.
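A minimal sketch of what the '_Continuous' case amounts to, in Python
rather than the generated C# (names and structure are my own, not from
the mockup file): the parser consumes the non-left-recursive
alternative once, then loops on the operator tail, which yields
left-associative trees without infinite recursion.

```python
import re

def tokenize(text):
    # Toy tokenizer: integers and the + - * / operators.
    return re.findall(r"\d+|[+\-*/]", text)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_add_exp(self):
        # AddExp ::= MulDivExp (('+' | '-') MulDivExp)*  -- iterative form
        left = self.parse_mul_div_exp()
        while self.peek() in ('+', '-'):        # the "_Continuous" loop
            op = self.advance()
            right = self.parse_mul_div_exp()
            left = (op, left, right)            # fold left-associatively
        return left

    def parse_mul_div_exp(self):
        left = self.parse_primary()
        while self.peek() in ('*', '/'):
            op = self.advance()
            right = self.parse_primary()
            left = (op, left, right)
        return left

    def parse_primary(self):
        return int(self.advance())

tree = Parser(tokenize("1+2*3-4")).parse_add_exp()
print(tree)  # ('-', ('+', 1, ('*', 2, 3)), 4)
```

Note that the loop preserves the left associativity the original
left-recursive rule implies: "2-3-4" parses as (2-3)-4.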
Now remember, the above code is just something I threw together to
presumably solve the issue; I have not verified it, because I want to
make sure I'm taking the project in the right direction before I code
en masse.
The reason I'm posting is that an online associate of mine said my
extensions to an already established norm (EBNF) are superfluous. Is
there any use in adding templates, allowing tokens to be categorized
to make rules easier to write, and other such changes? I don't want
to waste my time creating an obfuscated, unusable form of grammar
description. If the idea is useful, can it be cleaned up, or is what
I have even viable at all?
Thanks in advance,
-Alexander Morou