Approaches to code formatters

"Piotr Zgorecki" <>
1 Dec 2002 22:48:39 -0500

          From comp.compilers

Related articles
Approaches to code formatters (Piotr Zgorecki) (2002-12-01)
Re: Approaches to code formatters (Alex K. Angelopoulos) (2002-12-03)
Re: Approaches to code formatters (Nils M Holm) (2002-12-03)
Re: Approaches to code formatters (Ira Baxter) (2002-12-03)
Re: Approaches to code formatters (2002-12-07)

From: "Piotr Zgorecki" <>
Newsgroups: comp.compilers
Date: 1 Dec 2002 22:48:39 -0500
Organization: Internet Cable Provider News Server
Keywords: tools
Posted-Date: 01 Dec 2002 22:48:39 EST


Has anybody written a code formatter for C-like languages? I'm writing
one at the moment, and the farther I go, the more I dislike the
approach I have taken. I decided not to have a parser; instead I
merely take the tokens that the lexer feeds me and try to format "by
hand", which means lots of scans through a token buffer. This is
fine for simple paren or bracket indentation, but gets ugly with
anything less trivial.
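
To make the token-buffer approach concrete, here is a minimal sketch in
Python. It is an illustration only, not the formatter under discussion:
the token pattern and the names tokenize, join and format_source are
assumptions made up for this example. It indents by brace depth, breaks
lines at ";" outside parentheses, and tolerates unbalanced braces rather
than stopping with an error.

```python
import re

# Hypothetical token pattern for a C-like language (an assumption for
# this sketch). Characters that match nothing, i.e. whitespace, are
# skipped by finditer.
TOKEN_RE = re.compile(r"""
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<string>"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*')
  | (?P<word>[A-Za-z_]\w*|\d[\w.]*)
  | (?P<other>[^\s\w])
""", re.VERBOSE | re.DOTALL)


def tokenize(src):
    """Return the non-whitespace tokens of src as (kind, text) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(src)]


def join(texts):
    """Naive spacing: one space between tokens, none before ; , ) or after (."""
    out = ""
    for text in texts:
        if out and text not in {";", ",", ")"} and not out.endswith("("):
            out += " "
        out += text
    return out


def format_source(src, indent="    "):
    lines, current = [], []
    depth = parens = 0

    def flush():
        # Emit the pending tokens as one line at the current brace depth.
        if current:
            lines.append(indent * depth + join(current))
            current.clear()

    for kind, text in tokenize(src):
        if kind == "comment":
            current.append(text)
            flush()
        elif kind == "other" and text == "{":
            current.append(text)
            flush()
            depth, parens = depth + 1, 0
        elif kind == "other" and text == "}":
            flush()
            depth, parens = max(depth - 1, 0), 0  # tolerate a stray "}"
            current.append(text)
            flush()
        elif kind == "other" and text == ";" and parens == 0:
            current.append(text)
            flush()
        else:
            if kind == "other" and text == "(":
                parens += 1
            elif kind == "other" and text == ")":
                parens = max(parens - 1, 0)
            current.append(text)
    flush()
    return "\n".join(lines)


print(format_source("int main(){if(x){y=1;}return 0;}"))
```

Even this much needs special cases for "for (;;)" headers, comments and
stray braces, and the spacing rule is crude (it prints "int main () {").
Anything that depends on context, such as unary versus binary operators
or declarations versus expressions, needs further scans over the buffer.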

So, has anybody found a better approach? I'm wondering about a
generalized grammar that will accept C/C++/Java. But such a grammar is
vulnerable to errors in the code being formatted (I want my formatter
to format erroneous code without stopping with nasty parse errors, and
I don't feel like writing a compiler front-end today :). Besides,
whitespace matters to a formatter and has to be properly recognized,
so the grammar would have to allow whitespace tokens between every two
other tokens. This gets ugly again.
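
One way to keep whitespace without mentioning it in every grammar rule,
offered here as an assumption rather than as something this post
proposes, is to have the lexer attach the preceding whitespace and
comments to each token. The Python sketch below illustrates the idea;
Token, leading and lex are hypothetical names. The lexer never rejects
its input, and concatenating the tokens reproduces the source exactly.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    leading: str  # whitespace and comments that preceded the token
    text: str     # the token itself ("" for the end-of-input marker)


# Whitespace, // comments and /* */ comments, repeated; may match nothing.
TRIVIA_RE = re.compile(r"(?:\s|//[^\n]*|/\*.*?\*/)*", re.DOTALL)
# Words, numbers, double-quoted strings, else any single character, so
# the lexer always makes progress, even on malformed input.
TOKEN_RE = re.compile(
    r'[A-Za-z_]\w*|\d[\w.]*|"(?:\\.|[^"\\\n])*"|.', re.DOTALL)


def lex(src):
    tokens, pos = [], 0
    while True:
        trivia = TRIVIA_RE.match(src, pos)
        pos = trivia.end()
        if pos == len(src):
            # The end marker carries any trailing whitespace or comments.
            tokens.append(Token(trivia.group(), ""))
            return tokens
        tok = TOKEN_RE.match(src, pos)
        tokens.append(Token(trivia.group(), tok.group()))
        pos = tok.end()


src = "int  x ;// c\n"
toks = lex(src)
print([t.text for t in toks])
print("".join(t.leading + t.text for t in toks) == src)
```

A grammar or a pattern matcher can then work on the text fields alone,
while the formatter decides for each token whether to keep or replace
its leading field.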

So, if someone has any ideas about or experience with code formatters,
I would appreciate your help.

Piotr Zgorecki
[There are lots of C pretty-printers. I think they all use pattern
matching heuristics, for just the reasons you gave -- whitespace, and
real C programs, particularly with #if, aren't easily parsable. -John]
