Re: An interesting Parser problem

Hans-Peter Diettrich <>
Fri, 30 Nov 2007 09:32:38 +0100

          From comp.compilers


From: Hans-Peter Diettrich <>
Newsgroups: comp.compilers
Date: Fri, 30 Nov 2007 09:32:38 +0100
Organization: Compilers Central
References: 07-11-081
Keywords: parse, editor
Posted-Date: 30 Nov 2007 20:33:51 EST

wrote:

> We are planning to make an intelligent text editor (Why is not
> important, we have to do it.)
> The text editor would show declarations, definitions, function calls
> etc in a logical grouping and would also support intelligent text
> editing for eg. It would change the name of a variable "foo" in a
> higher scope but other variables with same name "foo" in other scopes
> present will not be affected. eg.

This reminds me of old BASIC on home computers, and of refactoring nowadays.

You'll have to decide when a name counts as changed, so that you know
when it's time to update the text on the screen for all occurrences of
the changed identifier.
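
A minimal sketch of such a scope-aware rename, in Python. Everything in it
is an assumption made for illustration: the toy language (`int NAME`
declares a name in the enclosing `{ ... }` block, an inner declaration
shadows an outer one), and the helper names `bound_occurrences` and
`rename`. Comments and string literals are not handled.

```python
import re

# Toy token pattern (assumption): identifiers/keywords and braces only.
TOKEN = re.compile(r"[A-Za-z_]\w*|[{}]")

def bound_occurrences(src, decl_offset):
    """Offsets of all identifiers bound to the declaration at decl_offset."""
    scopes = [{}]      # stack of {name: offset of its declaration}
    bound = []
    prev = None
    for m in TOKEN.finditer(src):
        tok, off = m.group(), m.start()
        if tok == "{":
            scopes.append({})
        elif tok == "}":
            if len(scopes) > 1:
                scopes.pop()
        elif tok != "int":
            if prev == "int":
                # Declaration: enter the name into the innermost scope.
                scopes[-1][tok] = off
                target = off
            else:
                # Use: resolve against the innermost scope that declares it.
                target = next((s[tok] for s in reversed(scopes) if tok in s), None)
            if target == decl_offset:
                bound.append(off)
        prev = tok
    return bound

def rename(src, decl_offset, new_name):
    """Rename the identifier declared at decl_offset, leaving shadows alone."""
    old_len = TOKEN.match(src, decl_offset).end() - decl_offset
    # Patch back to front so that earlier offsets stay valid.
    for off in reversed(bound_occurrences(src, decl_offset)):
        src = src[:off] + new_name + src[off + old_len:]
    return src

src = "int foo; { int foo; foo = 1; } foo = 2;"
print(rename(src, src.index("foo"), "bar"))
```

Only the outer "foo" and its use after the block are renamed; the inner
declaration and its use keep their name.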

> So we need to construct a parse tree of the input. But after editing,
> we want to conserve the comments and white space as much as possible
> so that the user does not see drastic changes to original code.

You also have to decide when a structural change is finished, so that
the text can be parsed again, and how to deal with syntactic and
semantic errors.

> One approach is to store to file name and offset information of each
> token in the parse tree. During decompiling, text between two offsets
> is copied as such from original file if its not edited. If edited,
> then the new text is used.

There exist techniques for unlimited undo capabilities, managing all
changes to a text. But it isn't as easy to keep the text and the parse
tree in sync after changes. You may distinguish between valid parts of
the input, invalid parts (e.g. while being edited), and changed parts.
Your parser should be clever enough to recover from syntax errors as
soon as possible. Nonlocal changes (name substitution...) should be
allowed only for entirely valid source text.
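
The offset approach quoted above could look roughly like the following
Python sketch. It is not a lex/yacc solution, only an illustration: the
token kinds, the `Token` class and the functions `tokenize` and `unparse`
are hypothetical names. Whitespace and comments become tokens of their
own, so the tokens tile the whole input and unedited text is copied back
verbatim.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Every character belongs to exactly one token, so nothing is lost.
LEXER = re.compile(r"""
    (?P<comment>/\*.*?\*/|//[^\n]*)
  | (?P<space>\s+)
  | (?P<ident>[A-Za-z_]\w*)
  | (?P<other>.)
""", re.VERBOSE | re.DOTALL)

@dataclass
class Token:
    kind: str
    start: int                      # offset of first character in the source
    end: int                        # offset just past the last character
    new_text: Optional[str] = None  # set only when the token was edited

def tokenize(src):
    return [Token(m.lastgroup, m.start(), m.end()) for m in LEXER.finditer(src)]

def unparse(src, tokens):
    """Copy unedited spans from the original text, use new text otherwise."""
    return "".join(
        t.new_text if t.new_text is not None else src[t.start:t.end]
        for t in tokens
    )

src = "int foo; /* keep  me */\n  foo = 1;"
tokens = tokenize(src)
for t in tokens:
    if t.kind == "ident" and src[t.start:t.end] == "foo":
        t.new_text = "bar"
print(unparse(src, tokens))
```

Without any edit, `unparse` returns the input unchanged, which is a cheap
sanity check for the tokenizer.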

> So the first issue is to find the offset of each token in lex (yacc ?)
> This should also handle if some include file construct like "include
>" is encountered. In such a case, filename and offset from new
> file should be returned.

How do you intend to represent included files (inline?), and do you want
to update these files as well? Otherwise it might be possible to ignore
included files, depending on your overall intentions.
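
If included files are expanded inline, each token can still carry the
file it came from. A rough Python sketch, assuming a made-up directive of
the form `include "name"`, files held in a dictionary instead of on disk,
and a hypothetical function `lex`:

```python
import re

# Hypothetical include directive, for illustration only.
INCLUDE = re.compile(r'include\s+"([^"]+)"')
WORD = re.compile(r"\S+")

def lex(files, name, out=None, active=()):
    """Return (filename, offset, text) triples, following includes inline."""
    if name in active:
        raise ValueError("recursive include of " + name)
    out = [] if out is None else out
    src = files[name]
    pos = 0
    while pos < len(src):
        m = INCLUDE.match(src, pos)
        if m:
            # Switch to the included file, then resume after the directive.
            lex(files, m.group(1), out, active + (name,))
            pos = m.end()
            continue
        m = WORD.match(src, pos)
        if m:
            out.append((name, m.start(), m.group()))
            pos = m.end()
        else:
            pos += 1    # skip whitespace
    return out

files = {"main": 'a include "inc" b', "inc": "x y"}
for tok in lex(files, "main"):
    print(tok)
```

Offsets are always relative to the file named in the same triple, so an
edit can be written back to the right file.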

Did you also consider macro capabilities (#define...)?

> Is there any other tricky case possible ?

As many as you (not ;-) like.

IMO you should decide carefully whether you want a text editor with
added checks and transformation capabilities, or whether you treat the
input as (valid) structures of a certain language and only allow
structurally valid changes. The latter is the old and easy BASIC way,
restricting the language and the user in many ways.

