Parsing C++ with Semantic-Free Syntax Analysis

"Lowell Thomas" <lowell@coasttocoastresearch.com>
3 Jun 2006 18:59:11 -0400

          From comp.compilers

From: "Lowell Thomas" <lowell@coasttocoastresearch.com>
Newsgroups: comp.compilers
Date: 3 Jun 2006 18:59:11 -0400
Organization: Compilers Central
Keywords: C++, parse, available
Posted-Date: 03 Jun 2006 18:59:11 EDT

I'd like to first announce the release of APG - an ABNF Parser Generator -
Version 4.0. It has several new features, most notably:


  * a new operation, "repeat-until", is added to the seven ABNF rule-forming
    operations
  * a "first-success" disambiguation rule is used to efficiently and
    predictably keep the grammars unambiguous
  * parse tree export and import facilities have been added, allowing for
    separate and multiple (retargetable) translations from a single parse tree
  * any rule may be used as the start rule, not just the first

This release also comes with sample applications which develop a
reasonably complete C++ preprocessor and parsers for two
languages. The languages are simple abstractions of small C++ subsets
chosen 1) for proof of concept, 2) for transparency for review and
study and 3) to reproduce some of the most difficult of the
non-context-free language requirements of C++ - distinguishing type
names from variable names, parsing class member function definitions
within class declarations, and resolving other declaration vs. expression
ambiguities.


The new features have been developed in response to a couple of
challenges. First is the friendly challenge to my original version of
APG (article 05-06-130) that it wasn't up to the task of generating
C++ parsers. Having taken up that challenge, though, I quickly came
face to face with a second and more interesting challenge, "Are
context-free grammars up to the task of generating C++ parsers?" To
the extent that the test languages faithfully reproduce the most
serious of the non-context-free requirements of C++, I believe I have
shown in the sample applications that they can be.


This is my first attempt to build a language compiler, and while I
don't claim to be well-read on this topic, I have read the long,
50-post thread "Why context-free?" (article 05-10-053) with great
interest, where many aspects of this challenge have been discussed. There
I got a taste of many good soups from many good chefs, few of whom
agree on technique. So I know that the readers of this group have many
years and types of experience with this topic and will not be
surprised to see that my solution uses a series of three grammars to
handle the non-context-free requirements. What may be unique (I am
very interested in your comments) is the development of what I have
called, for lack of a better term, Semantic-Free Syntax Analysis
(SFSA) parsing. That is, syntax analysis is done with only the
algorithm-generated code - no syntactic predicates, semantic
predicates, mid-rule actions or lexer feedback. No semantic actions of
any kind are done in the syntax analysis stage. All semantic analysis
is done in a separate traversal of a single, syntax-generated parse
tree. SFSA parsing was developed, not because of any inherent
strengths it may or may not have, but because it is the far opposite
extreme from a purely handwritten parser, maximizing the utility of
parser generators.
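
To make the separation concrete, here is a small sketch in Python of the general idea, using the classic "T * x;" case. It is not the three-grammar design used in the sample applications, and every name in it is hypothetical. The syntax pass builds a neutral tree without consulting any symbol table; a separate pass over that tree decides whether the statement is a declaration or an expression.

```python
import re

# Identifiers, '*' and ';' are the only tokens in this toy language.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|[*;])")

def tokenize(text):
    """Split the input into tokens, skipping whitespace."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_statement(tokens):
    """Syntax only: recognize  name '*' name ';'  with no symbol lookup."""
    if (len(tokens) == 4 and tokens[0].isidentifier() and tokens[1] == "*"
            and tokens[2].isidentifier() and tokens[3] == ";"):
        # The node is deliberately neutral: neither declaration nor expression.
        return ("star_statement", tokens[0], tokens[2])
    raise SyntaxError("expected: name * name ;")

def classify(tree, type_names):
    """Semantics only: a separate traversal of the finished parse tree."""
    _kind, left, right = tree
    if left in type_names:
        return ("declaration", {"type": left + "*", "name": right})
    return ("expression", {"op": "*", "lhs": left, "rhs": right})

tree = parse_statement(tokenize("T * x;"))
as_declaration = classify(tree, {"T"})   # T is a known type name
as_expression = classify(tree, set())    # T is just a variable
```

The same tree serves both interpretations, so the parser never needs feedback from the symbol table; only the later traversal does.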


The complete description, parser generator and sample demonstrations
can be found at my web site.


Lowell Thomas
www.coasttocoastresearch.com

