From: Michiel <mhelvens@gmail.com>
Newsgroups: comp.compilers
Date: Sun, 6 Sep 2009 13:54:56 -0700 (PDT)
Organization: Compilers Central
References: 09-09-038
Keywords: parse, design
Posted-Date: 06 Sep 2009 19:36:53 EDT
Ralph Boland wrote:
> I am designing my own parser generator tool (please don't advise me on
> the wisdom of creating yet another parser generator tool)
I believe there's a lot to be improved upon in this field. I have
plans to write a parser generator too. But it's far down my list. ;-)
I've always wanted these features in a parser generator:
* An EBNF dialect, which automatically generates lists and optional
values in the output language. Possibly the type of
list<T>/optional<T> and the method for generating them may be
specified separately from the grammar.
* Automatic location tracking. Again, with the code to do this
specified separately from the grammar. In my Flex/Bison code, I'm
finding the same couple of lines in virtually every rule.
* Parameterized grammar rules, to specify the same decoration for
several constructs. For example:
    commalist(A): [ A { ',' A } ] ;
    formalparameters: commalist( type identifier ) ;
    tuple: '(' commalist( expression ) ')' ;
would be equivalent to:
    formalparameters: [ type identifier { ',' type identifier } ] ;
    tuple: '(' [ expression { ',' expression } ] ')' ;
I believe that with static analysis, such a grammar could always be
automatically transformed to a grammar without parameterized rules.
* A way to output a human-readable grammar with documentation for
grammar rules automatically extracted from the source-code.
Feel free to use or ignore these as you wish.
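To make the parameterized-rule idea a bit more concrete, here is a minimal Python sketch of the expansion step. Everything in it is an assumption made for illustration: the RULES dictionary, the restriction to a single parameter per rule, and the requirement that arguments contain no parentheses. It simply inlines calls textually until none remain, so a parameterized rule that calls itself with a growing argument would never terminate.

```python
import re

# Hypothetical in-memory form of the grammar: name -> (parameter or None, body).
# Assumptions: at most one parameter per rule, no parentheses inside arguments.
RULES = {
    "commalist": ("A", "[ A { ',' A } ]"),
    "formalparameters": (None, "commalist( type identifier )"),
    "tuple": (None, "'(' commalist( expression ) ')'"),
}

# A call looks like  name( argument )  with the name directly before the '('.
CALL = re.compile(r"(\w+)\(\s*([^()]*?)\s*\)")


def expand(body, rules):
    """Inline every call to a parameterized rule, repeating until none is left."""
    def inline(match):
        name, arg = match.group(1), match.group(2)
        entry = rules.get(name)
        if entry is None or entry[0] is None:
            return match.group(0)  # not a parameterized rule; leave untouched
        param, template = entry
        # Replace whole-word occurrences of the formal parameter by the argument.
        return re.sub(rf"\b{re.escape(param)}\b", lambda _: arg, template)

    previous = None
    while previous != body:
        previous, body = body, CALL.sub(inline, body)
    return body


def monomorphize(rules):
    """Return an equivalent grammar containing only the unparameterized rules."""
    return {name: expand(body, rules)
            for name, (param, body) in rules.items() if param is None}


for name, body in monomorphize(RULES).items():
    print(f"{name}: {body} ;")
```

Run on the three rules from the example, this prints the two expanded rules shown above.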
> ... The tokens correspond to the nonterminals of the
> grammar.
'terminals', I assume you mean.
> From here things are less clear to me.
>
> 1) Is there a name for the definition of the set of tokens; preferably
> a short name useful for naming identifiers? (I do not like regular
> definition since it implies a set of rules I do not follow.)
Not that I know of. What about 'vocabulary'?
> 2) Most parser generator tools actually use attribute grammars, which
> have attributes or semantic actions, and build abstract syntax trees.
> In my system the grammar is not attributed and instead a separate
> table is used to define the attributes and semantic actions associated
> with the grammar rules. I need a name for this table but don't know
> what to call it. Possibilities are attribute table, transformation
> table, and abstract syntax table. I don't like these though because
> they lead to long identifier names. I am considering using tree morph
> table, or just morph table. Individual entries in the morph table
> would be called morphs. This works well for naming identifiers but
> will be cryptic for anybody else but me. Can anybody make a better
> suggestion than those I mentioned?
I've never heard the word 'morph' associated with semantic actions in
a parser. I guess I'd stick to 'semantic actions', 'semantics' or
'actions'.
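For what it's worth, here is a small Python sketch of what keeping the actions in a separate table might look like. The rule names, the GRAMMAR and ACTIONS dictionaries and the apply_action function are all made up for illustration; they are not taken from Ralph's tool.

```python
# Hypothetical grammar: rule name -> sequence of symbols, with no embedded actions.
GRAMMAR = {
    "add": ("expr", "'+'", "term"),
    "mul": ("term", "'*'", "factor"),
    "num": ("NUMBER",),
}

# The separate table: rule name -> action applied to the values of the children.
ACTIONS = {
    "add": lambda left, _op, right: left + right,
    "mul": lambda left, _op, right: left * right,
    "num": lambda text: int(text),
}


def apply_action(rule, children):
    """What a generated parser might call when it has recognized `rule`."""
    assert len(children) == len(GRAMMAR[rule]), "arity must match the rule"
    return ACTIONS[rule](*children)


# Simulate by hand the reductions a parser would perform for "2 + 3 * 4".
two, three, four = (apply_action("num", [t]) for t in ("2", "3", "4"))
product = apply_action("mul", [three, "*", four])
total = apply_action("add", [two, "+", product])
print(total)
```

The grammar stays free of host-language code, and a different ACTIONS table (say, one that builds tree nodes) can be swapped in without touching it.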
> 3) Most parser generator tools build parsers that call the scanner to
> get tokens. In my system the parser calls a token fetcher that uses
> one or more scanners to get tokens and may then process its input
> before returning its versions of tokens back to the parser. I do not
> have a good name for the token fetcher. One possibility is to call
> the scanner the lexer and the token fetcher the scanner. Can anybody
> suggest a name for my token fetcher?
'Tokenizer'?
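A rough Python sketch of such a layer, under my own assumptions about what it does: the Token type, the regex-based scanner and the keyword set below are hypothetical, and the only point is that the parser talks to the Tokenizer, which in turn drains one or more scanners and post-processes what they produce.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


def scanner(source: str) -> Iterator[Token]:
    """A plain regex scanner: yields raw tokens, including whitespace."""
    pattern = re.compile(
        r"(?P<NAME>[A-Za-z_]\w*)|(?P<NUMBER>\d+)|(?P<SPACE>\s+)|(?P<PUNCT>.)"
    )
    for match in pattern.finditer(source):
        yield Token(match.lastgroup, match.group())


class Tokenizer:
    """Sits between the parser and the scanners and post-processes tokens."""

    KEYWORDS = {"if", "else", "while"}  # illustrative only

    def __init__(self, scanners: Iterable[Iterator[Token]]):
        self._scanners = list(scanners)

    def __iter__(self) -> Iterator[Token]:
        for scan in self._scanners:        # drain each scanner in turn
            for token in scan:
                if token.kind == "SPACE":  # drop what the parser never sees
                    continue
                if token.kind == "NAME" and token.text in self.KEYWORDS:
                    token = Token("KEYWORD", token.text)  # reclassify
                yield token
        yield Token("EOF", "")             # single end marker for the parser


tokens = list(Tokenizer([scanner("if x1 "), scanner("else 42;")]))
for token in tokens:
    print(token)
```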
> 4) Some systems have specifications for defining methods for walking
> over abstract syntax trees and making further transformations or
> constructions. I am not planning to do this but am curious to know a
> good name for such a specification preferably suitable for identifier
> naming.
That would be 'visitor'. The name comes from the Visitor design
pattern [Design Patterns, GoF]. For a description of this technique,
read:
http://en.wikipedia.org/wiki/Visitor_pattern
Or read the book. I can recommend it. Concrete visitors are usually
called SomethingVisitor. For you, possibly NamingVisitor.
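For readers who don't know the pattern, a minimal Python sketch over a toy syntax tree follows. The node classes and the two visitors are invented for this example; a real tool would generate or declare them from the grammar.

```python
from dataclasses import dataclass


@dataclass
class Number:
    value: int

    def accept(self, visitor):
        return visitor.visit_number(self)


@dataclass
class Add:
    left: object
    right: object

    def accept(self, visitor):
        return visitor.visit_add(self)


class EvaluatingVisitor:
    """Computes the value of the expression tree."""

    def visit_number(self, node):
        return node.value

    def visit_add(self, node):
        return node.left.accept(self) + node.right.accept(self)


class PrintingVisitor:
    """Renders the same tree as fully parenthesized text."""

    def visit_number(self, node):
        return str(node.value)

    def visit_add(self, node):
        return f"({node.left.accept(self)} + {node.right.accept(self)})"


tree = Add(Number(1), Add(Number(2), Number(3)))
print(tree.accept(PrintingVisitor()))
print(tree.accept(EvaluatingVisitor()))
```

The tree classes only know how to dispatch to a visitor; each new traversal is a new visitor class, so the tree definition never has to change.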
> All suggestions much appreciated.
Good luck with your project.
--
Michiel Helvensteijn