Re: lex/flex for perl

Christian Wetzel <cnwetzel@linguistik.uni-erlangen.de>
30 Jul 1998 23:10:54 -0400

          From comp.compilers


From: Christian Wetzel <cnwetzel@linguistik.uni-erlangen.de>
Newsgroups: comp.compilers
Date: 30 Jul 1998 23:10:54 -0400
Organization: CLUE -- Computerlinguistik Uni Erlangen
References: 98-07-216
Keywords: lex, perl, parse

David Maclean wrote:


> Can anyone tell me if there is a lex/flex for perl. There is byacc for
> perl but have not found lex for perl.


If speed is not a concern, just take the Parse::RecDescent module from
the CPAN[1]. Since it is a top-down parser, it allows you to use perl
regular expressions *in the grammar* to define terminals, with all
their capabilities: lookahead (and even look*behind*, according to
what I heard about the new perl 5.005[2]), non-greedy versions of the
* and + operators, etc. You also get sophisticated ways to glue the
parsing to other perl code, for example to build object-oriented parse
trees or to change the grammar dynamically during the parse.
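
As a rough sketch of what regex terminals inside a grammar look like: the
toy grammar, the rule names (sum_expr, plus_number, number) and the
parse_sum helper below are made up for illustration. Only the general
pattern (build a parser with Parse::RecDescent->new, then call the start
rule as a method) comes from the module itself, which has to be installed
from the CPAN.

```perl
use strict;
use warnings;
use Parse::RecDescent;    # CPAN module, not part of core perl

# Hypothetical toy grammar: a sum of integers such as "1 + 22 + 3".
# The terminals are ordinary perl regexes written inline in the grammar;
# whitespace before each terminal is skipped by default.
my $grammar = q{
    sum_expr:    number plus_number(s?) /\z/
                 {
                     my $total = $item[1];
                     $total += $_ for @{ $item[2] };
                     $return = $total;
                 }

    plus_number: '+' number
                 { $return = $item[2] }

    number:      /\d+/
};

my $parser = Parse::RecDescent->new($grammar)
    or die "could not build parser from grammar\n";

# Hypothetical helper: returns the sum, or undef if the text does not parse.
sub parse_sum {
    my ($text) = @_;
    return $parser->sum_expr($text);
}

print parse_sum("1 + 22 + 3"), "\n";
```

The actions in braces are plain perl, which is where the glue to other
code would go (building objects instead of adding numbers, for instance).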


I wrote a conversion GUI as my diploma thesis (a converter-generator
for context-free languages). It generates perl code that uses
Parse::RecDescent, and it works fine. The only disadvantage is that it
is slow.


Take a look at the demos that come with the module, especially the
uncommenting C demo, to see how it deals with lexing.


If you want to learn more about perl's scanning capabilities in
general, you should take a look at Jeffrey Friedl's book "Mastering
Regular Expressions"[3].


An example from perlfaq6 [4] for a tiny tokenizer:


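    while (<>) {
        chomp;
        PARSER: {
            # \G anchors each match where the previous one ended, and /c
            # keeps that position when a match fails, so the next pattern
            # is tried at the same spot. redo restarts the block after
            # every token; the block is left when nothing matches.
            m/ \G( \d+\b )/gcx && do { print "number: $1\n"; redo; };
            m/ \G( \w+ )/gcx && do { print "word: $1\n"; redo; };
            m/ \G( \s+ )/gcx && do { print "space: $1\n"; redo; };
            m/ \G( [^\w\d]+ )/gcx && do { print "other: $1\n"; redo; };
        }
    }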
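
If you would rather collect the tokens than print them, the same \G and
/gc technique can be wrapped in a function. This is only a sketch: the
tokenize name and the [type, text] pair layout are my own choices, not
something from perlfaq6. It also uses [^\w\s]+ for the "other" class
instead of the FAQ's [^\w\d]+, so that punctuation tokens do not swallow
the whitespace that follows them.

```perl
use strict;
use warnings;

# Hypothetical helper: split one line into [type, text] pairs.
sub tokenize {
    my ($line) = @_;
    my @tokens;

    pos($line) = 0;    # start scanning at the beginning of the string
    while (pos($line) < length($line)) {
        # Each pattern is anchored with \G at the current position;
        # /c keeps that position if the pattern does not match.
        if    ($line =~ /\G(\d+\b)/gc)    { push @tokens, [ number => $1 ] }
        elsif ($line =~ /\G(\w+)/gc)      { push @tokens, [ word   => $1 ] }
        elsif ($line =~ /\G(\s+)/gc)      { push @tokens, [ space  => $1 ] }
        elsif ($line =~ /\G([^\w\s]+)/gc) { push @tokens, [ other  => $1 ] }
        else                              { last }    # safety net
    }
    return @tokens;
}

# Usage: print one token per line, as in the FAQ example.
print "$_->[0]: $_->[1]\n" for tokenize("foo 42, bar");
```

The order of the branches matters: numbers are tried before words, since
\w also matches digits.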


Hope that helps,
    Christian


[1]: <URL:http://www.perl.com/CPAN/modules/00modlist.long.html>
[2]: <URL:http://www.perl.com/CPAN/src/perl5.005_01.tar.gz>
[3]: J. Friedl, "Mastering Regular Expressions: Powerful Techniques
          for Perl and Other Tools", O'Reilly 1997, ISBN 1-56592-257-3
[4]: type "perldoc perlfaq6" if you have perl > 5.004_04 installed.
          Note that the CPAN version of this document is older.
--

