From: Hans-Peter Diettrich <DrDiettrich1@aol.com>
Newsgroups: comp.compilers
Date: 1 Apr 2007 08:55:56 -0400
Organization: Compilers Central
References: 07-03-106 07-03-111 07-03-121
Keywords: parse, practice
Posted-Date: 01 Apr 2007 08:55:56 EDT

John Sasso wrote:
> Thank-you for responding to my post. Well, one thing I neglected to
> note (for the sake of simplicity of the post, but on retrospect I should
> have) is that a new version of the language may have components of the
> prior version of the language removed from the grammar. That is, L_k
> may have certain keywords or statement constructs removed (made
> obsolete) which were present in L_k-1.
This will make a big difference. I already suspected that your problem
would not be as easy to solve as I outlined ;-)
>>If the purpose is the rejection of something like non-ANSI extension to
>>a language, I'd assign version number attributes to the added rules.
>>Then one can determine, during the application of a rule, whether it is
>>part of the given language version. Or one can determine, after an
>>parse, what's the minimal required language version, from the maximum of
>> the processed rule version numbers.
>>
>
>
> OK, that's interesting, I was thinking of something similar (i.e. adding
> version # attributes to the production rules), but it seemed a bit
> complex.
My picture of a parser is dominated by LL and handwritten recursive
descent parsers, where it's not a technical problem to add restrictions
to the code. A table-driven approach (automaton) will make such
additions more complicated, of course.
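
A minimal sketch of such a restriction in a handwritten recursive
descent parser. The toy language, the rule names and the version numbers
are invented for illustration. Because rules can also be removed, each
rule carries a version range instead of a single number, and the
versions that accept an input are the intersection of the ranges of all
rules applied (a generalization of the "maximum of the rule versions"
idea quoted above).

```python
# Hypothetical toy language, invented for this sketch:
#   program   ::= statement*
#   statement ::= ("print" | "goto" | "let") NAME
# Each rule carries the first and last language version it belongs to.
RULE_VERSIONS = {
    "print": (1, None),  # present since version 1, never removed
    "goto": (1, 2),      # obsolete from version 3 on
    "let": (2, None),    # added in version 2
}


class VersionError(Exception):
    """A rule was applied that is not part of the selected version."""


class Parser:
    def __init__(self, tokens, version=None):
        self.tokens = tokens
        self.pos = 0
        self.version = version          # None: accept all, only record the range
        self.low, self.high = 1, None   # range of versions accepting the input

    def check_rule(self, rule):
        since, until = RULE_VERSIONS[rule]
        v = self.version
        if v is not None and (v < since or (until is not None and v > until)):
            raise VersionError(f"{rule!r} is not part of version {v}")
        # Intersect the acceptable range with the range of this rule.
        self.low = max(self.low, since)
        if until is not None:
            self.high = until if self.high is None else min(self.high, until)

    def statement(self):
        if self.pos + 1 >= len(self.tokens):
            raise SyntaxError("statement needs a keyword and a name")
        keyword = self.tokens[self.pos]
        if keyword not in RULE_VERSIONS:
            raise SyntaxError(f"unknown statement {keyword!r}")
        self.check_rule(keyword)        # the version restriction, added locally
        name = self.tokens[self.pos + 1]
        self.pos += 2
        return (keyword, name)

    def program(self):
        statements = []
        while self.pos < len(self.tokens):
            statements.append(self.statement())
        return statements
```

Parsing with an explicit version rejects out-of-version rules as they
are applied. Parsing with version=None accepts everything and leaves
the acceptable range in low and high; if low ends up greater than high,
no single version accepts the input.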
> However, if it is doable, can you point me to any literature
> that describes how to do such?
No special literature. BNF doesn't allow such predicates or attributes
to be added to the syntactic description of a formal grammar; instead
such things must go into the semantic part. Semantic actions are only
introduced in concrete implementations of parser generators (yacc...),
and are valid only with regard to that specific grammar extension
model. Adding version number based decisions to a yacc grammar would be
equivalent to the manual modification of a recursive descent parser,
but with the added complexity that one cannot immediately determine the
impact of such (very local) code on the generated automaton. Yacc allows
for some attributes, like operator precedence or grouping, but I'm not
sure whether this information is evaluated during the creation of the
automaton, or at the runtime of the generated parser. Maybe
implementation specific...
So anything beyond traditional LL and LR parser generators, i.e. tools
that allow attributes or predicates to be added to the formal grammar,
will be more helpful with your problem.
> Lets just assume I have full access to the "meta-language" for new
> versions of the language. Not the parser or grammar w/ production
> rules, mind you, but with the meta-language (i.e. syntax rules and the
> like) I can construct the production rules.
I.e. you specify the meta language, and provide the corresponding parser
generator or interpreter, right?
Then the procedure depends on how your parser generator is used. Will
it be acceptable that your program reads the grammar of a single
specific language, as provided by the actual user, or is there a
requirement that the grammar must contain multiple versions, which are
selectable after construction of the parser?
If the parser generation is fast enough, I'd read the given grammar at
runtime of the application, so that a parser for a single language can
be created on the fly. A preprocessor could extract the actual grammar,
should the source ever contain multiple language versions. For the
parser I'd use an object-oriented interpreter approach, which doesn't
require transformations into NDFA and DFA tables, and which also doesn't
require separate compilation of the generated parser. There exist
examples of such fast interpreters; I've already been playing with MetaS
and TextTransformer myself, in addition to writing my own EBNF interpreter.
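
A rough sketch of that combination, under several assumptions: the
grammar notation (one rule per line with a [since-until] version range)
is invented here and is not the format of MetaS, TextTransformer or any
other tool; the input is already tokenized; and the interpreter uses
simple ordered choice, so it cannot handle left-recursive rules. The
preprocessing step keeps only the lines valid for the requested version,
and the resulting objects interpret the grammar directly, with no tables
and no separate compilation.

```python
import re

# Hypothetical grammar notation, invented for this sketch: every line is
#   name [since-until] = alternative | alternative
# Quoted symbols are terminals (token kinds), bare names refer to rules.
GRAMMAR_SOURCE = """
program [1-]  = stmt program | stmt
stmt    [1-]  = 'print' 'NAME'
stmt    [1-2] = 'goto' 'NAME'
stmt    [2-]  = 'let' 'NAME'
"""


class Terminal:
    def __init__(self, text):
        self.text = text

    def match(self, tokens, pos, rules):
        ok = pos < len(tokens) and tokens[pos] == self.text
        return pos + 1 if ok else None


class RuleRef:
    def __init__(self, name):
        self.name = name

    def match(self, tokens, pos, rules):
        return rules[self.name].match(tokens, pos, rules)


class Sequence:
    def __init__(self, items):
        self.items = items

    def match(self, tokens, pos, rules):
        for item in self.items:
            pos = item.match(tokens, pos, rules)
            if pos is None:
                return None
        return pos


class Choice:
    def __init__(self, alternatives):
        self.alternatives = alternatives

    def match(self, tokens, pos, rules):
        for alternative in self.alternatives:  # ordered: first match wins
            end = alternative.match(tokens, pos, rules)
            if end is not None:
                return end
        return None


LINE = re.compile(r"(\w+)\s*\[(\d+)-(\d*)\]\s*=\s*(.+)")


def build_parser(source, version):
    """Keep only the lines valid for `version`, then build parser objects."""
    alternatives = {}
    for line in source.strip().splitlines():
        m = LINE.fullmatch(line.strip())
        if m is None:
            raise ValueError(f"malformed grammar line: {line!r}")
        name, since, until, body = m.groups()
        if version < int(since) or (until and version > int(until)):
            continue  # this line is not part of the requested version
        for alt in body.split("|"):
            items = [Terminal(s[1:-1]) if s.startswith("'") else RuleRef(s)
                     for s in alt.split()]
            alternatives.setdefault(name, []).append(Sequence(items))
    return {name: Choice(alts) for name, alts in alternatives.items()}


def accepts(rules, tokens, start="program"):
    """True if the whole token list is matched by the start rule."""
    return rules[start].match(tokens, 0, rules) == len(tokens)
```

Calling build_parser(GRAMMAR_SOURCE, 3) yields a parser in which the
'goto' alternative simply does not exist, so removed constructs need no
special handling at parse time.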
You may also have a look at GNU getopt, if your language is related to
parsing commands. I already played around with an extended model that
is entirely based on tables, with no user-supplied code in the parser.
There the key point is determining the variables (or objects or
list entries) to which the parsed values shall be assigned.
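
To illustrate the table-only idea, here is a small sketch. It is not
GNU getopt's interface and not the extended model mentioned above, whose
details are not given here; the table layout, option names and target
names are all assumptions. Each table entry names the target variable
that receives the parsed value, so the parsing loop itself contains no
option-specific code.

```python
# Hypothetical option table, invented for this sketch.
# option: (target variable, converter, or None for a plain flag)
OPTION_TABLE = {
    "-v": ("verbose", None),
    "-o": ("output", str),
    "-n": ("count", int),
}


def parse_command(argv, table=OPTION_TABLE):
    """Return (settings, positional arguments), driven purely by the table."""
    # Flags default to False, valued options to None.
    settings = {target: (False if convert is None else None)
                for target, convert in table.values()}
    positional = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in table:
            target, convert = table[arg]
            if convert is None:
                settings[target] = True
            else:
                if i + 1 >= len(argv):
                    raise ValueError(f"option {arg} needs a value")
                i += 1
                settings[target] = convert(argv[i])
        elif arg.startswith("-"):
            raise ValueError(f"unknown option {arg}")
        else:
            positional.append(arg)
        i += 1
    return settings, positional
```

Supporting another language version then means swapping in a different
table, with the loop unchanged.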
> Is this w.r.t the language tree? I apologize that I am not familiar
> with such (I know syntax trees, but I gather you are not referring to
> the same thing).
According to your description I thought that, from a given L_k, user A
can produce a language L_k_A, and user B can create a language L_k_B,
which exist in parallel, as branches of the root language L_k. All these
languages can be represented in the form of a tree, evolving from the
original language (root node). It's just like the evolution of source
code, with multiple versions and branches stored all together in a
version control system - not related to language or grammar theory at all.
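
Such a tree can be sketched as follows, assuming each node records only
the rules it adds or removes relative to its parent. The class, its
fields and the rule names are hypothetical; the version names L_k,
L_k_A and L_k_B are the ones used above.

```python
class LanguageVersion:
    """One node in the tree of language versions (hypothetical model)."""

    def __init__(self, name, parent=None, added=(), removed=()):
        self.name = name
        self.parent = parent          # None for the root language
        self.added = set(added)
        self.removed = set(removed)

    def rules(self):
        """Rules in effect here: inherit from the parent, then apply changes."""
        inherited = self.parent.rules() if self.parent is not None else set()
        return (inherited - self.removed) | self.added

    def lineage(self):
        """Names from the root language down to this version."""
        path = []
        node = self
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path[::-1]


# Two branches evolving in parallel from the same root language.
root = LanguageVersion("L_k", added={"print", "goto"})
branch_a = LanguageVersion("L_k_A", parent=root, added={"let"})
branch_b = LanguageVersion("L_k_B", parent=root,
                           added={"call"}, removed={"goto"})
```

Here branch_a.rules() and branch_b.rules() are computed independently
from the shared root, much as two branches in a version control system
share a common ancestor.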
DoDi