From: Nathan Moore <nathan.moore@cox.net>
Newsgroups: comp.compilers
Date: 8 Dec 2005 02:33:05 -0500
Organization: Cox Communications
References: 05-12-008
Keywords: lex
Posted-Date: 08 Dec 2005 02:33:05 EST
Mina Doroudi wrote:
> I am writing a parser with lex. I have some problems:
> In the definition section I define a whole bunch of stuff and I also
> used them to define other things.
> So I have
> X [something]
> and I want Y to be anything but X so when I define it like:
> Y [^{X}] it only exclude the characters '{' , '}' ,and X
> I can't find a way to exclude the definitions and use them in
> other definitions.
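I don't think there is any automatic way to do what you want, unless
you can use
{X} ACTION_FOR_X
.|\n ACTION_FOR_EVERYTHING_ELSE
(one character at a time for everything else, because lex takes the
longest match and a catch-all like .+ would run straight over any X
later on the line). That is not likely to work unless you are trying
to match the entire input as either X or Y.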
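
A minimal sketch of that two-rule approach as a complete scanner, assuming flex (`%option noyywrap` is flex-specific) and a hypothetical definition of X as a run of digits. The catch-all matches exactly one character, so under the longest-match rule it can never outrun an X, and on a tie the earlier rule ({X}) wins.

```lex
%{
#include <stdio.h>
%}
%option noyywrap
X   [0-9]+
%%
{X}     { printf("X: %s\n", yytext); /* hypothetical X: a run of digits */ }
.|\n    { /* any single character that is not part of an X */ }
%%
int main(void) { return yylex(); }
```

Note also that if X is itself just a character class, say [abc], then Y can simply be written out by hand as [^abc]; the problem only arises because a definition is substituted as text and is not expanded inside [ ].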
I had to do something similar for C-style comments, and ended up
drawing out several DFAs for them and then converting those to
regular expressions. It was a lot of work, and I had to go back
several times to fix mistakes I had made, but it did work.
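
For reference, here is one widely used regular expression for C comments, shown as a small flex scanner. It is a sketch, not necessarily the expression derived from the DFAs above, and `%option noyywrap` again assumes flex. The body is a sequence of either a non-star character or a run of stars followed by something that is neither a star nor a slash; the comment ends at a run of stars followed by a slash.

```lex
%{
#include <stdio.h>
%}
%option noyywrap
%%
"/*"([^*]|\*+[^*/])*\*+"/"   { printf("comment, %d bytes\n", (int)yyleng); }
.|\n                          { /* text outside comments, one character at a time */ }
%%
int main(void) { return yylex(); }
```

An unterminated "/*" does not match the first rule and falls through to the catch-all one character at a time.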
Matching "anything but" something that is not simply a character or a
character class is a lot more complicated to compute than you would
think. The real problem is that lex can't always tell what you want
it to match.
>
> Also I'm trying to set rules for Oct, but lex doesn't let me logical ORs them
> together. and I can't do ranging either ([\001-\006])
> any Idea how to parse text with Oct?
Just to be clear -- you are trying to match the characters whose values
are between 1 and 6, right? Well, I don't know how to do that. Very
interesting. It would be trivial to implement, but I don't know the
syntax, or even whether there is one, because none of those characters
has a short escape like \n or \a.
You could do:
((\001)|(\002)|(\003)|(\004)|(\005)|(\006))
It's ugly (esp. since I used a bunch of extra () so that you could use it
in the middle of something else if you wanted to).
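
For what it's worth, the flex manual documents octal escapes (\123) and hex escapes (\x2a), and says the backslash escape keeps its meaning inside a character class, so under flex the range form should be accepted as written. The sketch below assumes flex; other lex implementations may behave differently, and CTRL is a hypothetical name.

```lex
%{
#include <stdio.h>
%}
%option noyywrap
CTRL    [\001-\006]
%%
{CTRL}+   { printf("%d control byte(s)\n", (int)yyleng); }
.|\n      { /* skip everything else */ }
%%
int main(void) { return yylex(); }
```

If the range form is rejected by a particular lex, the alternation above remains a portable fallback.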
> -Mina Doroudi (dormina@cc.gatech.edu)
I wish I was at GA Tech, but I'm down the road in Macon.
Hope I was some help,
Nathan