Re: Bison deterministic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support

Hans-Peter Diettrich
Sat, 18 Aug 2012 10:13:46 +0100

          From comp.compilers


Newsgroups: comp.compilers
Organization: Compilers Central
References: 12-08-005
Keywords: bison, design, comment
Posted-Date: 19 Aug 2012 17:00:35 EDT

The original poster wrote:
> I need to write a parser for a programming language which is as
> complex as C++/Java, and to complicate the matter even further, there
> are constructs in this language that don't allow me to use the
> type/identifier disambiguating lexer hack.

Why don't you fix your language, and remove such ambiguities? Look at
Pascal or other Wirthian languages...

> In other words, I will
> have to return just one lexical token (say IDENTIFIER) from the lexer
> for both type references as well as non-type variable references.

This shouldn't be a big problem, as long as the parser does not rely
on such a distinction. Once a symbol has been defined, it can contain
some indication about its nature.
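One way to read this is sketched below in Python. It is only an illustration under stated assumptions, not a description of any particular compiler: the names SymbolKind, SymbolTable and classify_star_statement are hypothetical, and the C-style statement `a * b;` is used as the example. The lexer hands back plain identifiers, the parser builds one neutral node, and the symbol table is consulted only afterwards to decide what the statement means.

```python
from dataclasses import dataclass
from enum import Enum, auto


class SymbolKind(Enum):
    """What a defined name denotes (hypothetical, for illustration)."""
    TYPE = auto()
    VARIABLE = auto()


@dataclass
class Symbol:
    name: str
    kind: SymbolKind


class SymbolTable:
    """Maps names to symbols; each symbol records its own nature."""

    def __init__(self):
        self._symbols = {}

    def define(self, name, kind):
        self._symbols[name] = Symbol(name, kind)

    def lookup(self, name):
        return self._symbols.get(name)


def classify_star_statement(left, right, table):
    """Interpret the statement `left * right;` after it has been parsed.

    The lexer returned IDENTIFIER for both names, so the parser could not
    tell a declaration from an expression. The decision is made here,
    from the kind stored in the symbol for `left`.
    """
    symbol = table.lookup(left)
    if symbol is None:
        raise NameError(f"undefined identifier: {left}")
    if symbol.kind is SymbolKind.TYPE:
        return ("pointer_declaration", left, right)
    return ("multiplication", left, right)
```

With `T` defined as a type and `x` as a variable, `T * p;` is classified as a pointer declaration and `x * y;` as a multiplication, without the lexer ever distinguishing the two kinds of identifier.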

> Given these restrictions, I was wondering if it would be a good idea
> to pick yacc/bison for my parser...? Or, should I consider a hand
> written recursive descent parser.

I don't see how this decision is related to the above problem.

> Regards.
> [Get it working in bison, then in the unlikely event that's not fast
> enough, profile your compiler to see where it's spending its time and
> fix what needs to be fixed. Although in theory GLR can be very slow,
> in practice the ambiguities are generally resolved within a few tokens
> and the performance is fine. Compilers always spend way more time in
> the lexer than the parser anyway. Writing RD parsers by hand can be
> fun, but you never know what language it actually parses. -John]

There exist parser generators for several parsing models. I also doubt
that, except in misdesigned C-ish languages, a compiler spends
significant time in the lexer. This may be true for dummy parsers, which
do nothing but syntax checks, but not for compilers with code
generation, optimization and more.

[Compilers spend a lot of time in the lexer, because that's the only
phase that has to look at the input one character at a time. -John]
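To make the "one character at a time" remark concrete, here is a minimal hand-written scanner in Python. It is a sketch under assumptions, not how Bison or flex work internally: the function name tokenize and the three token kinds are hypothetical. It shows only that the lexer visits every input character, while later phases see the much shorter token list; it does not settle how the total compile time is distributed.

```python
def tokenize(source):
    """Split source into (kind, text) tokens, scanning one character at a time."""
    tokens = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            # Whitespace is inspected and skipped, producing no token.
            i += 1
        elif ch.isalpha() or ch == "_":
            # Every name yields the same token kind, type or not.
            start = i
            while i < n and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(("IDENTIFIER", source[start:i]))
        elif ch.isdigit():
            start = i
            while i < n and source[i].isdigit():
                i += 1
            tokens.append(("NUMBER", source[start:i]))
        else:
            # Anything else is treated as single-character punctuation.
            tokens.append(("PUNCT", ch))
            i += 1
    return tokens
```

For the input `count = count + 42;` the scanner examines 19 characters and hands 6 tokens to the parser.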
