Re: parser generator terminology

Chris F Clark <cfc@shell01.TheWorld.com>
Sun, 06 Sep 2009 20:16:46 -0400

          From comp.compilers


From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: Sun, 06 Sep 2009 20:16:46 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 09-09-038
Keywords: parse
Posted-Date: 07 Sep 2009 06:48:49 EDT

Ralph Boland <rpboland@gmail.com> writes:


> I am designing my own parser generator tool (please don't advise me on
> the wisdom of creating yet another parser generator tool) and am
> trying to sort out the terminology so that I can name my classes,
> methods and variables (identifiers).


First, although I'm one of those who regularly questions the wisdom of
yet-another-parser-generator, I will refrain in your case because you
have clearly earned your "stripes" by doing this several times before.
If you feel there is something you want to solve, then solve it and
don't be defensive about it. I won't even make the token offer of
trying to collaborate on a new Yacc++ version, because I know your
approach is sufficiently different that it would be more of a
hindrance than a help.


> 1) Is there a name for the definition of the set of tokens; preferably
> a short name useful for naming identifiers? (I do not like regular
> definition since it implies a set of rules I do not follow.)


The set of all tokens should be called the "vocabulary" or a synonym
of it (perhaps vocab for short). The set of all characters is the
alphabet (or sigma). However, I think that's not what you are asking.


Do you mean a set of tokens that have different spellings but the same
token type, as opposed to tokens that have only one spelling? You
mention identifiers and they fit in this category. I'm not certain
there is a name that describes tokens that stand for sets of spellings.


> In my system the grammar is not attributed and instead a separate
> table is used to define the attributes and semantic actions associated
> with the grammar rules. I need a name for this table but don't know
> what to call it. Possibilities are attribute table, transformation
> table, and abstract syntax table. I don't like these though because
> they lead to long identifier names.


The abbreviation AST for abstract syntax tree would be short and is
well-known.


> I am considering using tree morph
> table, or just morph table. Individual entries in the morph table
> would be called morphs. This works well for naming identifiers but
> will be cryptic for anybody else but me. Can anybody make a better
> suggestion than those I mentioned?


Morph seems fine, since it is a nice word that means to change the
shape of something, and that's what your table does.
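

To make that concrete, here is a minimal Python sketch of what I
imagine such a table could look like. It is only my guess at the
idea, not your design: the names MORPHS and apply_morph, the rule
strings, and the tuple-shaped trees are all invented for illustration.

```python
# Hypothetical morph table: grammar rule -> function that reshapes the
# children matched by that rule into a tree node.
MORPHS = {
    "expr -> expr '+' term": lambda kids: ("add", kids[0], kids[2]),
    "term -> NUMBER":        lambda kids: ("num", int(kids[0])),
    "paren -> '(' expr ')'": lambda kids: kids[1],  # drop the parentheses
}

def apply_morph(rule, kids):
    """Apply the morph for a reduced rule; with no entry, keep the children."""
    morph = MORPHS.get(rule)
    return morph(kids) if morph else (rule, *kids)

one = apply_morph("term -> NUMBER", ["1"])
two = apply_morph("term -> NUMBER", ["2"])
print(apply_morph("expr -> expr '+' term", [one, "+", two]))
```

The grammar itself stays free of actions; the parser only needs to
call apply_morph with the rule it has just reduced.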


> 3) Most parser generator tools build parsers that call the scanner to
> get tokens. In my system the parser calls a token fetcher that uses
> one or more scanners to get tokens and may then process its input
> before returning its versions of tokens back to the parser. I do not
> have a good name for the token fetcher. One possibility is to call
> the scanner the lexer and the token fetcher the scanner. Can anybody
> suggest a name for my token fetcher?


If you put things like converting identifiers into keywords in your
2nd phase, you could use Frank DeRemer's names: the scanner (regex
processing, forming tokens from chars) and the screener (taking the
tokenized text and separating it into meaningful classes).
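

As a rough illustration of that split, here is a small Python sketch.
It is my own toy example, not DeRemer's code or the output of any
tool; the names scan, screen, KEYWORDS, and TOKEN_RE are made up, and
the keyword set is for an imaginary language.

```python
import re

# Hypothetical keyword set for a toy language.
KEYWORDS = {"if", "then", "else"}

# Scanner pattern: skip whitespace, then match one identifier, number,
# or single punctuation character.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<ident>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<punct>\S))"
)

def scan(text):
    """Scanner: form (kind, spelling) tokens from characters.

    Every word comes out as 'ident'; the scanner knows no keywords.
    """
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        yield m.lastgroup, m.group(m.lastgroup)
        pos = m.end()

def screen(tokens):
    """Screener: reclassify identifiers whose spelling is a keyword."""
    for kind, spelling in tokens:
        if kind == "ident" and spelling in KEYWORDS:
            yield "keyword", spelling
        else:
            yield kind, spelling

print(list(screen(scan("if x1 then 42"))))
```

The parser would pull tokens from screen(scan(...)) and never see
that "if" started life as an ordinary identifier.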


Another name you might consider for the after-scanner part is the
"filter", which would be appropriate if its job is to remove certain
tokens that are unnecessary (e.g. whitespace and comments).
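

A filter in that sense can be very small. The Python sketch below
assumes tokens are (kind, spelling) pairs and that the scanner labels
the unwanted ones "whitespace" and "comment"; those labels and the
names DROPPED and filter_tokens are invented for the example.

```python
# Token kinds the parser never needs to see (hypothetical names).
DROPPED = {"whitespace", "comment"}

def filter_tokens(tokens):
    """Pass through every (kind, spelling) pair except the dropped kinds."""
    return (tok for tok in tokens if tok[0] not in DROPPED)

raw = [
    ("ident", "x"),
    ("whitespace", " "),
    ("comment", "# note"),
    ("number", "1"),
]
print(list(filter_tokens(raw)))
```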


> 4) Some systems have specifications for defining methods for walking
> over abstract syntax trees and making further transformations or
> constructions. I am not planning to do this but am curious to know a
> good name for such a specification preferably suitable for identifier
> naming.


Typically, I've heard such things called "rewriters".
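

For readers who have not met the term, here is a minimal Python
sketch of a rewriter: a set of rules applied over a tree, bottom-up,
until none of them fires. The tuple-based tree shape and the names
rewrite and fold_add are assumptions made for this example and are
not taken from any particular tool.

```python
def rewrite(node, rules):
    """Rewrite the children first, then apply rules until none fires."""
    if isinstance(node, tuple):
        node = (node[0],) + tuple(rewrite(c, rules) for c in node[1:])
    for rule in rules:
        new = rule(node)
        if new is not None:
            return rewrite(new, rules)
    return node

def fold_add(node):
    """Rule: ("add", ("num", a), ("num", b)) becomes ("num", a + b)."""
    if (isinstance(node, tuple) and node[0] == "add"
            and node[1][0] == "num" and node[2][0] == "num"):
        return ("num", node[1][1] + node[2][1])
    return None  # rule does not apply

tree = ("add", ("var", "x"), ("add", ("num", 2), ("num", 3)))
print(rewrite(tree, [fold_add]))
```

Each rule returns either a replacement node or None, so adding
another transformation means adding another function to the list.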


> All suggestions much appreciated.
>
> Ralph Boland


Hope this helps,
-Chris


******************************************************************************
Chris Clark Internet: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. or: compres@world.std.com
23 Bailey Rd Web Site: http://world.std.com/~compres
Berlin, MA 01503 voice: (508) 435-5016
USA fax: (978) 838-0263 (24 hours)

