Guide for the creation of translators by RD parser generators

Detlef Meyer-Eltz <Meyer-Eltz@t-online.de>
7 Oct 2005 21:47:09 -0400

          From comp.compilers

From: Detlef Meyer-Eltz <Meyer-Eltz@t-online.de>
Newsgroups: comp.compilers
Date: 7 Oct 2005 21:47:09 -0400
Organization: Compilers Central
Keywords: parse, design, LL(1)
Posted-Date: 07 Oct 2005 21:47:09 EDT

I just wrote this guide as part of the TextTransformer help, and I
think at least the third point could be a matter of discussion. This
guide evolved from my personal experiences, and I'm curious to know
whether you can accept it or whether you have other proposals,
annotations or additions.


It is presumed that no grammar description exists.






Guide for the creation of translators by RD parser generators
-------------------------------------------------------------


1. Set the required project options!


For example, it is very important right at the beginning of a new
project to select the characters that have no meaning for parsing the
texts. By default, the line feed and the line break characters are
among them. This setting must be changed if line breaks have to be
recognized.
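
The effect of this option can be illustrated with a small hand-written
scanner. This is only a sketch in Python, not TextTransformer code: the
function name tokenize and the parameter ignorable are assumptions made
for the illustration. Characters listed in ignorable never reach the
parser; once the line feed is taken out of that set, it shows up as an
EOL token.

```python
def tokenize(text, ignorable=" \t\r\n"):
    """Split text into word tokens, dropping all ignorable characters.

    Whitespace that is NOT ignorable becomes a token of its own,
    with the line feed reported as "EOL".
    """
    tokens = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch in ignorable:
            pos += 1                      # silently skipped
        elif ch.isspace():
            tokens.append("EOL" if ch == "\n" else ch)
            pos += 1
        else:
            start = pos
            while pos < len(text) and not text[pos].isspace():
                pos += 1
            tokens.append(text[start:pos])
    return tokens


# Default setting: line breaks are invisible to the parser.
print(tokenize("a b\r\nc"))
# Line feed removed from the ignorable set: line breaks are recognized.
print(tokenize("a b\r\nc", ignorable=" \t\r"))
```

With the default setting a grammar cannot refer to line ends at all,
which is why the choice has to be made before the productions are
written.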




2. At first design the parser without semantic actions!


While constructing the parser it will often be necessary or
appropriate to rearrange productions and to simplify complex
productions by defining sub-productions. If the parser already
contained semantic code, that code would have to be adapted again after
each of these changes.




3. Develop top down!


Start with the most general production, the start rule that shall
recognize the complete text, and then break the start rule down into
sub-productions that recognize the principal parts of the text. The
sub-productions can then be refined further according to the same
principle. If, for example, a book is to be parsed, the start rule
would be:


Book ::= SKIP // recognizes the whole text


After the first improvement:


Book ::= SKIP Chapter+ SKIP


Chapter ::= TITLE SKIP


"TITLE" here stands for a regular expression that unambiguously
distinguishes a chapter heading from other text components.
Remark: Such an expression certainly doesn't exist for all books. The
book is used as an example of a text structure that everybody knows.
The book parser works only for syntactically ideal books.


The Chapter production can now be refined further:


Chapter ::= TITLE EOL* Paragraph+


Paragraph ::= EOL SKIP


EOL ::= \r?\n // end of line


The advantage of this top-down procedure is that at every stage of
development the current parser can be tested on all "books".
Possible faults can thus be discovered at an early stage of
development.
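
The first two stages of the refinement can be imitated by hand to see
why each stage is testable. The following Python sketch is not
generated by TextTransformer. The TITLE pattern ("Chapter" followed by
a number at the start of a line) and the function names are
assumptions, and SKIP is approximated by a regular-expression search
for the next TITLE.

```python
import re

# Hypothetical TITLE pattern (not from the guide): a line like "Chapter 1 ...".
TITLE = re.compile(r"^Chapter \d+.*$", re.MULTILINE)


def book_v1(text):
    """Book ::= SKIP  (accepts every text)"""
    return []


def chapter(text, title_match):
    """Chapter ::= TITLE SKIP  (SKIP runs to the next TITLE or the end)"""
    following = TITLE.search(text, title_match.end())
    end = following.start() if following else len(text)
    return title_match.group(0), end


def book_v2(text):
    """Book ::= SKIP Chapter+ SKIP  (fails if no chapter is found)"""
    match = TITLE.search(text)            # leading SKIP
    if match is None:
        raise SyntaxError("expected at least one chapter title")
    titles = []                           # kept only to inspect test runs
    while match is not None:
        title, pos = chapter(text, match)
        titles.append(title)
        match = TITLE.search(text, pos)   # next Chapter, if any
    return titles


sample = "Preface\nChapter 1 Intro\ntext\n\nChapter 2 More\ntext\n"
print(book_v1(sample))
print(book_v2(sample))
```

Both versions can be run over the same collection of sample texts. A
text that book_v1 accepts but book_v2 rejects points directly at the
refinement that introduced the problem.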


Note: With the transformation manager many examples can be tested as a
batch. If such a test fails, the corresponding text can be opened with
a click in the IDE.




4. Choose the kind of transformation


In principle there are three ways in which the parser can be extended
into a transformation program. They differ in what is done with the
recognized text sections.


a) text sections are immediately processed and written into the
output.


b) text sections are written into variables, and these are returned or
passed as (reference) parameters to other productions, where they can
be evaluated or combined into new values.


c) a parse tree is produced, and the text sections are processed after
the complete text has been parsed.


The last method is the most flexible, since in principle all text
sections remain accessible and since different outputs can be produced
from the same parse tree, depending on the function table used. If a
translator is to be developed that converts one format into several
output formats, then the use of a parse tree is nearly indispensable.
The development of such a translator is, however, much more difficult
than the direct processing of the source text with one of the two
other methods.


If the order in which the processed text sections are to be output is
approximately identical to the order in which they were recognized,
the first method of direct output is recommended. If recognized text
parts must be rearranged, or the processing of a part depends on text
that is found later, the second method is recommended.
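
The three methods can be compared side by side on the book example.
The Python sketch below is only an illustration under assumptions: the
TITLE pattern, the output formats and all function names are invented
for this purpose, and a simple regular-expression split stands in for
the generated parser.

```python
import io
import re

# Hypothetical TITLE pattern (not from the guide).
TITLE = re.compile(r"^Chapter \d+.*$", re.MULTILINE)


def chapters(text):
    """Recognizer shared by all variants: yields (title, body) pairs."""
    matches = list(TITLE.finditer(text))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        yield m.group(0), text[m.end():end].strip()


def translate_direct(text, out):
    """a) Each section is written to the output as soon as it is recognized."""
    for title, body in chapters(text):
        out.write(f"<h1>{title}</h1>\n<p>{body}</p>\n")


def translate_with_values(text):
    """b) Sections are collected in variables and combined afterwards.
    The table of contents precedes the bodies, so output order differs
    from recognition order."""
    parts = list(chapters(text))
    toc = "".join(f"- {title}\n" for title, _ in parts)
    bodies = "".join(f"{title}\n{body}\n" for title, body in parts)
    return toc + bodies


def build_tree(text):
    """c) Only a tree is built while parsing."""
    return ("Book", [("Chapter", t, b) for t, b in chapters(text)])


# One function table per output format, applied after parsing.
HTML = {"Chapter": lambda t, b: f"<h1>{t}</h1>\n<p>{b}</p>\n"}
PLAIN = {"Chapter": lambda t, b: f"{t.upper()}\n{b}\n"}


def render(tree, table):
    _, nodes = tree
    return "".join(table[kind](t, b) for kind, t, b in nodes)


sample = "Chapter 1 A\nfoo\nChapter 2 B\nbar\n"
out = io.StringIO()
translate_direct(sample, out)
print(out.getvalue())
print(translate_with_values(sample))
print(render(build_tree(sample), PLAIN))
```

In variant c) a further output format only needs another function
table, while the parser and the tree stay unchanged.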


Once you have decided on the kind of transformation, different
wizards can help you to insert parameters, variable declarations or
tree nodes into the productions.




5. Make a copy program before writing the final transformation
      code!


This rule applies only to projects in which the source text is to be
modified in a few significant places. If a program is made first that
simply copies the source text, then comparing the source text with the
target text easily shows whether the output is complete.
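
A copy program in this sense can be sketched as follows. This is again
a hand-written Python illustration with an assumed TITLE pattern and
invented function names, not TextTransformer output. Every recognized
piece, including the skipped text, is appended to the output unchanged,
and a small helper reports where source and target first differ.

```python
import re

# Hypothetical TITLE pattern (not from the guide).
TITLE = re.compile(r"^Chapter \d+.*$", re.MULTILINE)


def copy_book(text):
    """Book ::= SKIP Chapter+ SKIP, with every piece copied verbatim."""
    out = []
    pos = 0
    for m in TITLE.finditer(text):
        out.append(text[pos:m.start()])   # text consumed by SKIP
        out.append(m.group(0))            # text consumed by TITLE
        pos = m.end()
    out.append(text[pos:])                # trailing SKIP
    return "".join(out)


def first_difference(source, target):
    """Index of the first differing character, or None if both are equal."""
    for i, (a, b) in enumerate(zip(source, target)):
        if a != b:
            return i
    if len(source) != len(target):
        return min(len(source), len(target))
    return None


sample = "Preface\nChapter 1 Intro\ntext\n\nChapter 2 More\ntext\n"
print(first_difference(sample, copy_book(sample)))
```

As long as the comparison reports no difference, no part of the source
is being lost, and the copying action at a single place can then be
replaced by the real transformation.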


--
mailto:Meyer-Eltz@t-online.de


url: http://www.texttransformer.de
url: http://www.texttransformer.com
url: http://www.text-konverter.homepage.t-online.de

