Re: Help - partial parsing of C

e-sink@uiuc.edu (Eric W. Sink)
Wed, 22 Jan 1992 20:03:12 GMT

          From comp.compilers

Related articles
Help - partial parsing of C dta@dcs.exeter.ac.uk (Daniel Tallis) (1992-01-20)
Re: Help - partial parsing of C e-sink@uiuc.edu (1992-01-22)
Re: Help - partial parsing of C al@nmt.edu (1992-01-24)
Newsgroups: comp.compilers
From: e-sink@uiuc.edu (Eric W. Sink)
Keywords: C, pars
Organization: University of Illinois at Urbana-Champaign
References: 92-01-075
Date: Wed, 22 Jan 1992 20:03:12 GMT

In 92-01-075 Daniel Tallis <dta@dcs.exeter.ac.uk> writes:
>As part of my third year project I need to do partial parsing of C
>programs....
>The reason for wanting to do this is that the project is a 'smart editor'
>which 'understands' the program to a certain extent. ...
>[I suspect that the easiest thing to do is to pick up one of the freely
>available yacc parsers listed in the comp.compilers FAQ and use that. You
>can find statement boundaries lexically pretty easily by looking for
>semicolons, and matching parens and braces. If you want to identify all
>the statement types distinguish between def and ref mentions of a name
>you're going to have to parse 90% of the language anyway. Yacc is no
>speed demon but many people find it to be fast enough. -John]


I recently designed a smart editor for another language, and faced a number
of the same issues.


I agree with John that you are best off just going for a full parser from
the very beginning. We debated this issue at great length, and decided that
really very little could be accomplished without a full parser, a tremendous
amount was possible *with* the full parser, and the full parser is probably
easier than a partial one, since free grammars are available. I would
recommend using the GMD tools instead of yacc. I used them on my project
and was very pleased with the results.


We decided to keep two representations of the source file in memory at all
times: both the text file itself, and the corresponding parse tree data
structure. We spent much time trying to reason a way to keep only one or
the other, and were eventually convinced of the necessity of both. The
problem then becomes one of synchronization between the tree and the file.


What happens when the user edits a statement? How do we update the tree to
keep things in sync?
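
One way to picture the synchronization problem is a tree whose nodes carry character offsets into the text, so that every text edit has to be mirrored by an offset update. What follows is a minimal sketch and not the design used in the project described here: the `Node` class, the `apply_edit` and `shift` functions, and the half-open offset convention are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str      # hypothetical label, e.g. "file" or "statement"
    start: int     # offset of the first character covered by the node
    end: int       # offset one past the last character covered
    children: list = field(default_factory=list)


def shift(node, delta):
    """Move a node and all of its descendants by delta characters."""
    node.start += delta
    node.end += delta
    for child in node.children:
        shift(child, delta)


def apply_edit(node, pos, removed, inserted):
    """Mirror a text edit in the tree.

    `removed` characters starting at `pos` were replaced by `inserted`
    characters. The caller guarantees that `node` covers the edit.
    Returns the innermost node that fully contains the edit, which is
    the smallest subtree a reparse would have to replace.
    """
    delta = inserted - removed
    edit_end = pos + removed
    node.end += delta
    innermost = node
    for child in node.children:
        if child.start >= edit_end:
            shift(child, delta)          # wholly after the edit
        elif child.end <= pos:
            continue                     # wholly before the edit
        elif child.start <= pos and edit_end <= child.end:
            innermost = apply_edit(child, pos, removed, inserted)
        # Otherwise the edit straddles the child's boundary: its span is
        # now stale, and `innermost` stays at the parent to be reparsed.
    return innermost


# Text before the edit: "int x; int y;"
first = Node("statement", 0, 6)
second = Node("statement", 7, 13)
root = Node("file", 0, 13, [first, second])

# The user types "yz" after the "x", giving "int xyz; int y;"
dirty = apply_edit(root, pos=5, removed=0, inserted=2)
print(dirty is first, (first.start, first.end), (second.start, second.end))
```

The offsets stay consistent, but the node returned as `dirty` still has to be reparsed before the tree can be trusted again, which is where the second question comes in.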


What happens when the user deletes a single semicolon, or makes some other
change which renders the C source file semantically or syntactically
invalid?


We eventually decided on two main modes of editing. One mode simply does
not allow modifications to the file which render it invalid. The tree is
kept in sync by direct modifications to the tree. The other mode allows any
changes, and allows full editor functionality, but the user may not save or
edit until the file is reparsed.
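
The two modes can be sketched roughly as follows. This is an illustration under stated assumptions and not the editor described here: the `Editor` class and its method names are hypothetical, the strict mode re-checks the whole candidate text instead of modifying a tree directly, and `balanced` is a toy stand-in for a real parser that only checks bracket nesting.

```python
def balanced(text):
    """Toy validity check (an assumption): brackets must nest properly."""
    pairs = {")": "(", "}": "{", "]": "["}
    stack = []
    for ch in text:
        if ch in "({[":
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack


class Editor:
    def __init__(self, text, is_valid):
        self.text = text
        self.is_valid = is_valid     # stand-in for a full parse
        self.strict = True           # True: invalid edits are refused
        self.needs_reparse = False   # set by edits made in free mode

    def replace(self, start, end, new):
        """Replace text[start:end] with new; return False if refused."""
        candidate = self.text[:start] + new + self.text[end:]
        if self.strict:
            if not self.is_valid(candidate):
                return False
            self.text = candidate
            return True
        self.text = candidate
        self.needs_reparse = True
        return True

    def reparse(self):
        """Re-check the whole text; clear the flag only on success."""
        ok = self.is_valid(self.text)
        if ok:
            self.needs_reparse = False
        return ok

    def save(self):
        if self.needs_reparse:
            raise RuntimeError("reparse required before saving")
        return self.text


ed = Editor("f() { x = 1; }", balanced)
refused = not ed.replace(13, 14, "")   # strict mode: deleting "}" is refused
ed.strict = False
ed.replace(13, 14, "")                 # free mode: the same edit is accepted
ed.replace(13, 13, "}")                # the user repairs the text
print(refused, ed.needs_reparse, ed.reparse(), ed.save())
```

Between the free-mode edit and the successful `reparse()`, a call to `save()` raises, which is the restriction the second mode imposes.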


Of course, a criterion of our design was to disallow the creation of invalid
source files when using the editor. This may not be important to you.
These are just ideas.


By using a full parser, you will be able to check more things. In addition,
the accuracy of your parse will be better. Attempting to simply find the
ends of statements, without using a full parser, will likely yield some
incorrect statements. Distinguishing between a declaration and a statement,
and the other things you mentioned, are even harder.
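
To make the failure mode concrete, here is a minimal sketch comparing two lexical approaches to finding statement ends in C text. Both functions are hypothetical illustrations, not part of any tool mentioned here. The second one tracks parentheses, string and character literals, and `/* */` comments, and it still knows nothing about declarations, the preprocessor, or semicolons inside a `struct` body.

```python
def naive_statement_ends(src):
    """Return the offset of every ';' in src, ignoring all context."""
    return [i for i, ch in enumerate(src) if ch == ";"]


def lexical_statement_ends(src):
    """Return offsets of ';' found outside parentheses, string and
    character literals, and /* */ comments."""
    ends = []
    depth = 0                      # nesting depth of ( )
    i, n = 0, len(src)
    while i < n:
        ch = src[i]
        if src.startswith("/*", i):
            close = src.find("*/", i + 2)
            i = n if close == -1 else close + 2
            continue
        if ch in "\"'":
            # Skip the literal, stepping over backslash escapes.
            quote = ch
            i += 1
            while i < n and src[i] != quote:
                i += 2 if src[i] == "\\" else 1
            i += 1
            continue
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(0, depth - 1)
        elif ch == ";" and depth == 0:
            ends.append(i)
        i += 1
    return ends


sample = 'for (i = 0; i < n; i++) puts("a;b"); /* x; */ n = 1;'
print(len(naive_statement_ends(sample)))    # 6: every semicolon
print(len(lexical_statement_ends(sample)))  # 2: the two statement ends
```

Even the more careful scanner reports both semicolons inside `struct s { int a; int b; };` as statement ends, and it cannot say whether `x * y;` is an expression or a pointer declaration; those answers need the grammar and the typedef names a full parser keeps track of.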


Just ideas...
--
Eric W. Sink, Spatial Analysis and Systems Team
USACERL, P.O. Box 9005, Champaign, IL 61826-9005
1-800-USA-CERL x449, e-sink@uiuc.edu

