Approaches to code formatters pioter@terramail.CUTTHIS.pl (Piotr Zgorecki) (2002-12-01)
Re: Approaches to code formatters firstname.lastname@example.org (Alex K. Angelopoulos) (2002-12-03)
Re: Approaches to code formatters email@example.com (Nils M Holm) (2002-12-03)
Re: Approaches to code formatters firstname.lastname@example.org (Ira Baxter) (2002-12-03)
Re: Approaches to code formatters email@example.com (2002-12-07)

From: "Alex K. Angelopoulos" <firstname.lastname@example.org>
Date: 3 Dec 2002 00:39:05 -0500
Posted-Date: 03 Dec 2002 00:39:05 EST
Oh, goody... a question I might have an answer to... <g>
Here's an alternate approach you might want to consider, one that
completely ignores standard tokenizing and lexing techniques. I had
to use it recently for prettifying VBScript, which suffers from a host
of hereditary syntactic diseases - and there are no decent general
lexers out there for it anyway.
What I did was simply to read in code fragments, then normalize them
all: by text massaging, you can kill all of the leading and trailing
whitespace in lines, and strip out all internal blank lines.
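The normalization step might look something like the following sketch
(Python is my assumption here; the original tooling language isn't
stated in the post):

```python
def normalize(source: str) -> list[str]:
    """Strip per-line leading/trailing whitespace and drop blank lines,
    reducing arbitrarily formatted source to a flat "block-o-code"."""
    lines = (line.strip() for line in source.splitlines())
    return [line for line in lines if line]

code = """
    Sub Greet(name)

        WScript.Echo "Hello, " & name
    End Sub
"""
print(normalize(code))
# ['Sub Greet(name)', 'WScript.Echo "Hello, " & name', 'End Sub']
```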
The initial result of this is a block-o-code, but the nice thing about
it is that it allows you to start from a known state. You can then
very easily use regular expressions to break out functions and
internal structures, and format each chunk according to your
standards, without worrying about how the original programmer laid it
out.
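As a rough illustration of that second pass, here is a hedged sketch of
regex-driven re-indentation over the normalized block. The block/end
keywords are a simplified, assumed subset of VBScript (real VBScript has
more constructs, including single-line `If ... Then` statements that this
toy deliberately ignores):

```python
import re

def reindent(lines: list[str], indent: str = "    ") -> str:
    """Re-indent a normalized (flat, blank-line-free) list of lines
    according to a fixed house style, using keyword regexes to detect
    block openers and closers."""
    opener = re.compile(r"^(Sub|Function|For|Do|While)\b", re.I)
    closer = re.compile(r"^(End Sub|End Function|Next|Loop|Wend)\b", re.I)
    depth = 0
    out = []
    for line in lines:
        if closer.match(line):
            depth = max(depth - 1, 0)
        out.append(indent * depth + line)
        if opener.match(line):
            depth += 1
    return "\n".join(out)

block = ["Sub Greet(name)", 'WScript.Echo "Hello, " & name', "End Sub"]
print(reindent(block))
# Sub Greet(name)
#     WScript.Echo "Hello, " & name
# End Sub
```

Because the input is already in a known normalized state, the regexes
never have to account for the original author's indentation or blank
lines, which is the whole point of the approach.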
"Piotr Zgorecki" <pioter@terramail.CUTTHIS.pl> wrote:
> Did anybody write a code formatter for C-like languages ? I'm writing
> one at the moment, and the farther I go, the more I dislike the
> approach I have taken. I decided not to have any parser, instead I
> merely take the tokens that lexer feeds me, and try to format "by
> hand", which means lots of scans through a token buffer. ...