Re: Writing a C/C++ compiler in C++

vbdis@aol.com (VBDis)
1 Feb 2004 12:50:12 -0500

          From comp.compilers


From: vbdis@aol.com (VBDis)
Newsgroups: comp.compilers
Date: 1 Feb 2004 12:50:12 -0500
Organization: AOL Bertelsmann Online GmbH & Co. KG http://www.germany.aol.com
References: 04-01-146
Keywords: C++
Posted-Date: 01 Feb 2004 12:50:12 EST

dezakin@usa.net (Dez Akin) writes:
>As far as I understand it you write a lexer that tokenizes all the
>symbols, then write a parser that parses all of the tokens


Don't forget the preprocessor! While K&R C could work with an external
preprocessor, newer C specs require that the preprocessor be part of
the lexer. The combined lexer/preprocessor then has to recognize
several "languages": C/C++ itself, preprocessor directives, and
(optionally) assembly code. If you decide to use an external
preprocessor for simplicity, you may run into problems with the
line and column numbers in error messages.
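
To illustrate the line number problem: an external preprocessor usually
leaves markers such as `# 42 "foo.h"` (or `#line 42 "foo.h"`) in its
output, and the compiler has to follow them to report positions in the
original files. Below is a minimal sketch of that bookkeeping. The names
SourcePos, parse_line_marker, map_positions and original_position are
made up for this example, and escaped quotes in file names are ignored.
Column numbers cannot be recovered this way once macros have been
expanded.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Position in the original (unpreprocessed) source.
struct SourcePos {
    std::string file;
    int line;
};

// Recognize a marker of the form  # 42 "foo.h"  or  #line 42 "foo.h".
// On success, update pos to the position of the line FOLLOWING the marker.
bool parse_line_marker(const std::string& text, SourcePos& pos) {
    std::istringstream in(text);
    char hash = 0;
    if (!(in >> hash) || hash != '#') return false;
    in >> std::ws;
    if (in.peek() == 'l') {  // optional "line" keyword
        std::string keyword;
        in >> keyword;
        if (keyword != "line") return false;
    }
    int line = 0;
    if (!(in >> line)) return false;  // e.g. "#pragma" or "#define"
    pos.line = line;
    in >> std::ws;
    if (in.peek() == '"') {  // the file name is optional
        in.get();
        std::string file;
        std::getline(in, file, '"');
        pos.file = file;
    }
    return true;
}

// Build one entry per line of preprocessed output (markers included),
// so that output line k maps to element k-1.
std::vector<SourcePos> map_positions(const std::string& preprocessed,
                                     const std::string& initial_file) {
    std::vector<SourcePos> result;
    SourcePos cur{initial_file, 1};
    std::istringstream in(preprocessed);
    std::string text;
    while (std::getline(in, text)) {
        if (parse_line_marker(text, cur)) {
            result.push_back(cur);  // marker line: no source line consumed
            continue;
        }
        result.push_back(cur);
        ++cur.line;
    }
    return result;
}

// Translate a 1-based line number of the preprocessed output.
const SourcePos& original_position(const std::vector<SourcePos>& map,
                                   std::size_t output_line) {
    assert(output_line >= 1 && output_line <= map.size());
    return map[output_line - 1];
}
```

An error on output line 4 of a file that included foo.h near the top
can then be reported against foo.h instead of against the preprocessed
text.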


Since I started implementing a halfway-conforming C
lexer/preprocessor, I have understood why a C compiler is so much
slower than, e.g., the Delphi compiler, which was the topic of a
former thread.




>So it seemed straightforward except that C++ isn't a LALR(1) grammar
>that Bison or Byacc accepts. Does this pose problems in writing the
>lexer?


AFAIK the lexer is not affected by the C/C++ grammar; only the set of
keywords and operators differs between these languages.


>Basically I'd like to use a lot of generic programming components
>such as STL and boost templates for doing much of the pattern
>matching and tree manipulation, and I'm looking for a place to
>start. Can I start with a simple lexer or lexer generator, or does
>that not make sense for C++ (or even just C)?


You may want to read the gcc documentation, where the cpp part
(cpp.info, cppinternals.info) discusses some lexer-related topics. The
gcc developers use a handcrafted lexer, and I'd suggest that you do
the same. That gives you better control over the various special cases
(comments, continuation lines...) which are not easy to express in a
(lex/yacc) grammar.
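
As an illustration of why these cases are easier by hand, here is a
minimal sketch of two of the early passes: removing backslash-newline
pairs (continuation lines) and replacing each comment with a single
space, while leaving string and character literals alone. The function
names splice_lines and strip_comments are invented for this example,
and this is not how gcc does it. A real lexer would also track
positions, trigraphs and so on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Delete every backslash that is immediately followed by a newline,
// joining the continued line with the next one.
std::string splice_lines(const std::string& src) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;  // skip the newline as well
            continue;
        }
        out += src[i];
    }
    return out;
}

// Replace each comment with one space. Text inside "..." and '...'
// is copied verbatim, so "/* x */" inside a string is not a comment.
std::string strip_comments(const std::string& src) {
    std::string out;
    const std::size_t n = src.size();
    std::size_t i = 0;
    while (i < n) {
        const char c = src[i];
        if (c == '"' || c == '\'') {
            const char quote = c;
            out += src[i++];
            while (i < n && src[i] != quote && src[i] != '\n') {
                if (src[i] == '\\' && i + 1 < n) out += src[i++];  // escape
                out += src[i++];
            }
            if (i < n && src[i] == quote) out += src[i++];
        } else if (c == '/' && i + 1 < n && src[i + 1] == '*') {
            i += 2;
            while (i + 1 < n && !(src[i] == '*' && src[i + 1] == '/')) ++i;
            i = (i + 1 < n) ? i + 2 : n;  // unterminated: runs to the end
            out += ' ';
        } else if (c == '/' && i + 1 < n && src[i + 1] == '/') {
            while (i < n && src[i] != '\n') ++i;  // the newline is kept
            out += ' ';
        } else {
            out += src[i++];
        }
    }
    assert(i == n);  // every input character was consumed
    return out;
}
```

Running splice_lines first matters: a `//` comment that ends in a
backslash continues onto the next line, which is exactly the kind of
interaction that is awkward to state in a lex specification.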


DoDi
[The lexer is affected by the grammar to the extent that it has to
know what tokens are type names and which ones are variable names.
-John]
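
A small sketch of the feedback the moderator describes: in `T * x;` the
statement is a declaration if T names a type and a multiplication
otherwise, so the lexer (or a layer just above it) asks a symbol table
before deciding which token to hand to the parser. TokenKind,
NameTable and classify are hypothetical names for this example. The
parser is assumed to call declare() and the scope functions as it
processes declarations and blocks.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

enum class TokenKind { Identifier, TypeName };

// Scoped table recording, for each declared name, whether it is a type.
class NameTable {
public:
    NameTable() { scopes_.emplace_back(); }  // global scope

    void enter_scope() { scopes_.emplace_back(); }
    void leave_scope() {
        assert(scopes_.size() > 1);  // never pop the global scope
        scopes_.pop_back();
    }

    // Called by the parser for typedefs (is_type = true) and for
    // variables or functions (is_type = false).
    void declare(const std::string& name, bool is_type) {
        scopes_.back()[name] = is_type;
    }

    // The innermost declaration wins, so a local variable can hide
    // an outer typedef of the same name.
    bool is_type(const std::string& name) const {
        for (auto scope = scopes_.rbegin(); scope != scopes_.rend(); ++scope) {
            auto found = scope->find(name);
            if (found != scope->end()) return found->second;
        }
        return false;  // undeclared names are treated as plain identifiers
    }

private:
    std::vector<std::unordered_map<std::string, bool>> scopes_;
};

// What the lexer would call for every identifier it scans.
TokenKind classify(const std::string& ident, const NameTable& table) {
    return table.is_type(ident) ? TokenKind::TypeName : TokenKind::Identifier;
}
```

The price is that the lexer can no longer run independently of the
parser, since the table changes while parsing is in progress.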


