Re: LR-parser-based lexical analysis - does it work?

"Vladimir N. Makarov" <vmakarov@redhat.com>
18 Oct 2002 23:06:52 -0400

          From comp.compilers

Related articles
LR-parser-based lexical analysis - does it work? soenke.kannapinn@wincor-nixdorf.com (=?iso-8859-1?Q?S=F6nke_Kannapinn?=) (2002-10-13)
Re: LR-parser-based lexical analysis - does it work? cfc@shell01.TheWorld.com (Chris F Clark) (2002-10-18)
Re: LR-parser-based lexical analysis - does it work? vmakarov@redhat.com (Vladimir N. Makarov) (2002-10-18)
Re: LR-parser-based lexical analysis - does it work? vbdis@aol.com (VBDis) (2002-10-18)
Re: LR-parser-based lexical analysis - does it work? brian-l-smith@uiowa.edu (Brian Smith) (2002-10-18)
Re: LR-parser-based lexical analysis - does it work? grosch@cocolab.de (Josef Grosch) (2002-10-18)
Re: LR-parser-based lexical analysis - does it work? zackw@panix.com (Zack Weinberg) (2002-10-20)
| List of all articles for this month |

From: "Vladimir N. Makarov" <vmakarov@redhat.com>
Newsgroups: comp.compilers
Date: 18 Oct 2002 23:06:52 -0400
Organization: Compilers Central
References: 02-10-030
Keywords: LR(1), lex
Posted-Date: 18 Oct 2002 23:06:51 EDT

> In a compiler generator project we are thinking about building
> scanners using LR parser techniques.
>
> (Background: We are interested in describing compiler syntax
> and semantics *including* the lexical level using attributed
> CFGs. The high-speed performance of DFA-based scanners is
> nice, but we don't need ultimate speed.)


    One of the project goal of compiler-compiler MSTA was to use it for
scanner implementation too. MSTA is an extension of POSIX standard of
YACC using LR(k)/LALR(k) grammars. When the project started, the most
commonly used computers were based on Intel 386/486 processors. Pure
usage of LR(k) parsers would have resulted in too slow scanners.
Therefore additional special optimizations like extracting regular
parts of grammars and implementing parsing them by adequate methods
were implemented in MSTA. It permit to use MSTA for generation of
effective lexical analyzers.


    Usage of LR(k) grammars for scanner implementation has also some
adantages. You can describe easily scanners which can not be
described by regular expressions (i.e. nested comments).


    So speed is not a problem of usage of LR-grammars for scanner
implementations. I found only that CFG (even with MSTA extension of
YACC to make the description of scanners easier) is less convinient
description than regular expressions for the most scanners. The most
annoying such problem is dealing with conflict resolution.


    You can find MSTA on http://cocom.sf.net. There are examples of
scanner descriptions in the documentation and test examples.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.