|partial parsing Sonali@SetuIndia.com (Sonali Chitnis) (2001-04-26)|
|Re: partial parsing email@example.com (2001-04-30)|
|Re: partial parsing firstname.lastname@example.org (2001-05-03)|
|Date:||3 May 2001 13:41:49 -0400|
|Posted-Date:||03 May 2001 13:41:49 EDT|
"Sonali Chitnis" Sonali@SetuIndia.com
Date: 26 Apr 2001 21:11:57 -0400
I am working on an editor whose files adhere to a specific grammar.
My requirement is the Editor should also be able to parse the file
when the file is yet being created and not only after the file is
written completely. It should be able to parse it on a block basis. I
think this is similar to the partial parsing.
I am not at all strong in this area but let me pick on your words in a
good faith effort to help. If you must "... be able to parse the file
when the file is yet being created ...", then for all intents and
purposes you can let go of the notion that the "... files adhere to a
specific grammar ...".
What you can do with them is limited.
Now it is certainly possible for the editor component of a full-blown
IDE (Integrated Development Environment) engine to be one of the front
adaptors of an incremental compiler, but you are not saying that.
I would drop back a bit and reconsider. If you are dealing with a set
of requirements that are restricted, then you may not actually need to
'parse'. And grammar is not your concern. I mean this with the best
intentions and full support for your success. If you do not have to
psych out the meaning (the semantics) then you do not need to parse.
You may really be only interested in robust token recognition, and
some near considerations of bounding lexical scope aspects of the
topology. That is not necessarily parsing, it is rather lexical analysis.
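
To make "robust token recognition" concrete, here is a minimal sketch of a
lexer that never rejects its input. The token classes and the `begin`/`end`
keywords are assumptions made up for illustration; they are not from the
original poster's grammar.

```python
import re

# Hypothetical token classes for an imaginary block-structured language.
TOKEN_SPEC = [
    ("BEGIN", r"\bbegin\b"),
    ("END", r"\bend\b"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("NUMBER", r"\d+"),
    ("SPACE", r"\s+"),
    ("UNKNOWN", r"."),  # catch-all so odd input becomes a token, not an error
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (kind, lexeme, offset) triples; never raises on malformed text."""
    tokens = []
    for match in MASTER.finditer(text):
        if match.lastgroup != "SPACE":  # whitespace is recognized, then dropped
            tokens.append((match.lastgroup, match.group(), match.start()))
    return tokens
```

Because anything unrecognized falls into `UNKNOWN`, a half-typed file still
yields a token stream the editor can work with.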
In this regard the reliance on start states that you note in your
early attempts is important. You must solve the problems of the
partial file first for this thing to work, and the start state
technologies go a long way towards violating that fundamental
requirement conceptually, since a partial file (not partial parsing,
mind you, but a really incomplete file) has no state at any number of
points in the text!
So, I know this will sound funny, but what you might do is list the
portion of your requirements that you can accomplish lexically, and
that portion that you can not.
Of the portion that is not obviously lexical try to see how to wrap
the lexer or put sub functions within it to deal with the issue as
Editors are keyboard bound; this means that you can actually spend a
lot more CPU and disk I/O time in wasteful things like read-ahead and
not really offend folks. Generally read your whole file (and its
includes or whatever). Try to handle scope and block considerations by
pairing start and end boundaries.
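
One way to pair start and end boundaries over a whole file is a simple
stack. This is only a sketch: the `BEGIN`/`END` token kinds and the
`pair_blocks` name are hypothetical, and the input is assumed to be
(kind, offset) pairs from whatever lexer you already have.

```python
def pair_blocks(tokens, open_kind="BEGIN", close_kind="END"):
    """Pair block boundaries with a stack.

    tokens: iterable of (kind, offset) pairs.
    Returns (pairs, unmatched_opens, unmatched_closes), all as offsets.
    """
    stack, pairs, stray_closes = [], [], []
    for kind, offset in tokens:
        if kind == open_kind:
            stack.append(offset)
        elif kind == close_kind:
            if stack:
                pairs.append((stack.pop(), offset))  # innermost open wins
            else:
                stray_closes.append(offset)  # an end with no start before it
    # Whatever is left on the stack is a start that never got its end.
    return pairs, stack, stray_closes
```

The unmatched offsets are exactly the spots where an incomplete file needs
special handling; the matched pairs can be treated as settled blocks.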
Think of your file as a tagged file (even if its 'language' has
nothing to do with markup technologies). Impose the starting
assumption that the file is complete chaos, and in need of temporary
start and end tags; and that the file is the contents betwixt those tags.
Then process dynamically permitting any combination of well tagged and
poorly tagged subsections to occur within the assumed temporary start
and end tags.
Any time a poorly formed section gets corrected, drop the bounding
temporary start/end tags. Any time a well formed section gets
de-corrected, impose a temporary start/end tag boundary pair. (all
Deal with your file as though it is markup of an imaginary, purely
abstract type, and that all of the content is just attributes between
tags. Move away from the idea that your source file is text and of
primary concern. Instead deploy a structure from the start, focus on
that structure, and allow the text to drift into attribute slots
within the structure.
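
A rough sketch of the structure-first idea, with temporary tags imposed and
dropped as sections are edited. Everything here is an assumption for
illustration: the `Section` and `Document` names, the `temp_wrapped` flag
standing in for the temporary start/end tag pair, and the toy
`is_well_formed` check based on balanced `begin`/`end` words.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    text: str                   # the source text lives in an attribute slot
    temp_wrapped: bool = False  # True while temporary start/end tags apply


def is_well_formed(text, open_word="begin", close_word="end"):
    """Toy check: block words balance and never close before they open."""
    depth = 0
    for word in text.split():
        if word == open_word:
            depth += 1
        elif word == close_word:
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


@dataclass
class Document:
    sections: list = field(default_factory=list)

    def add(self, text):
        # A poorly formed section starts out inside temporary tags.
        self.sections.append(Section(text, not is_well_formed(text)))

    def edit(self, index, new_text):
        section = self.sections[index]
        section.text = new_text
        # Corrected: drop the temporary tags. De-corrected: impose them.
        section.temp_wrapped = not is_well_formed(new_text)
```

The editor then works on the list of sections and their flags, and the text
is just what happens to be sitting in each slot at the moment.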
Hope that helps get ya' thinkin'.