Re: lex question

Nathan Moore <nathan.moore@cox.net>
8 Dec 2005 02:33:05 -0500

          From comp.compilers

From: Nathan Moore <nathan.moore@cox.net>
Newsgroups: comp.compilers
Date: 8 Dec 2005 02:33:05 -0500
Organization: Cox Communications
References: 05-12-008
Keywords: lex
Posted-Date: 08 Dec 2005 02:33:05 EST

Mina Doroudi wrote:
> I am writing a parser with lex. I have some problems:
> In the definition section I define a whole bunch of stuff and I also
> used them to define other things.
> So I have
> X [something]
> and I want Y to be anything but X so when I define it like:
> Y [^{X}] it only exclude the characters '{' , '}' ,and X
> I can't find a way to exclude the definitions and use them in
> other definitions.


I don't think that there is any automatic way to do what you want unless
you can use


{X} ACTION_FOR_X
.+ ACTION_FOR_EVERYTHING_ELSE


which is not likely to work unless you are trying to match the entire
input as either X or Y, because lex prefers the longest match and .+
will swallow the rest of the line.
I had to do something similar for C-style comments and ended up
drawing out several DFAs for them and then converting those to
regular expressions. It was a lot of work, and I had to go back and
fix a lot of mistakes, but it did work.
"Match anything but" is a lot more complicated to compute than you
would think when the thing being excluded is not simply a character or
a character class. Lex can't always tell what you might want it to
match.
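
To illustrate the DFA approach for C-style comments, here is a minimal
sketch in C of one such hand-drawn automaton. The function name
match_c_comment and the state names are hypothetical, chosen only for
this example; this is not the DFA the poster actually drew. A regular
expression commonly derived from this kind of automaton is
"/*"([^*]|"*"+[^*/])*"*"+"/", which may need adjusting for a particular
lex implementation.

```c
#include <stddef.h>

/* States of a hand-drawn DFA for a single C-style comment (hypothetical names). */
enum state { START, SLASH, BODY, STAR };

/*
 * Returns the length of the C-style comment that begins at s[0],
 * or 0 if s does not begin with a complete comment.
 */
size_t match_c_comment(const char *s)
{
    enum state st = START;
    size_t i;

    for (i = 0; s[i] != '\0'; i++) {
        char c = s[i];
        switch (st) {
        case START:                 /* expect the opening '/' */
            if (c != '/') return 0;
            st = SLASH;
            break;
        case SLASH:                 /* expect the '*' that follows it */
            if (c != '*') return 0;
            st = BODY;
            break;
        case BODY:                  /* inside the comment text */
            if (c == '*') st = STAR;
            break;
        case STAR:                  /* seen one or more '*' in a row */
            if (c == '/') return i + 1;
            if (c != '*') st = BODY;
            break;
        }
    }
    return 0;                       /* ran out of input before the closing star-slash */
}
```

The STAR state is the part that makes "anything but the closing
delimiter" awkward to write as a regular expression: a run of stars
only ends the comment if the next character is a slash.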
>
> Also I'm trying to set rules for Oct, but lex doesn't let me logical ORs them
> together. and I can't do ranging either ([\001-\006])
> any Idea how to parse text with Oct?
Just to be clear: you are trying to match the characters whose values
are between 1 and 6, right? Well, I don't know how to do that. Very
interesting. It would be trivial to actually do, but I just don't know
the syntax, or even whether there is a syntax for it, because none of
those characters has a short escape like \n or \a.
You could do:
((\001)|(\002)|(\003)|(\004)|(\005)|(\006))
It's ugly, especially since I used a bunch of extra parentheses so
that you could use it in the middle of something else if you wanted to.
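
For comparison, the membership test that the alternation above spells
out is a one-line range check in C. The sketch below uses a
hypothetical function name, in_octal_range. As a hedged aside, the flex
manual documents octal escapes of the form \123 and says that escapes
keep their meaning inside a character class, so a range such as
[\001-\006] may be accepted by flex; whether a particular lex
implementation accepts it is not confirmed here.

```c
/*
 * Returns 1 if c is one of the characters with values 1 through 6
 * (octal \001 through \006), and 0 otherwise.
 * The name in_octal_range is hypothetical, for illustration only.
 */
int in_octal_range(int c)
{
    return c >= 001 && c <= 006;
}
```

A check like this could also be applied inside a lex action to a
single character that a broader rule has already matched.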
> -Mina Doroudi (dormina@cc.gatech.edu)
I wish I was at GA Tech, but I'm down the road in Macon.


Hope I was some help,
Nathan

