Re: flex not matching minus sign more than once

rkrayhawk@aol.com (RKRayhawk)
10 Aug 2003 22:49:36 -0400

          From comp.compilers

Related articles
flex not matching minus sign more than once erinf@jobsoft.com (2003-08-10)
Re: flex not matching minus sign more than once rkrayhawk@aol.com (2003-08-10)
From: rkrayhawk@aol.com (RKRayhawk)
Newsgroups: comp.compilers
Date: 10 Aug 2003 22:49:36 -0400
Organization: AOL http://www.aol.com
References: 03-08-035
Keywords: lex
Posted-Date: 10 Aug 2003 22:49:36 EDT

erinf@jobsoft.com (Erin) on 8/10/03 10:00 AM EST
posted these flex code snippets:
<<


i'm using flex v. 2.5.4 and i've run into issues with matching a literal minus
sign.




code snippet:


SP [^A-Za-z0-9 ]
MINUS ("-")
AN [A-Za-z0-9 ]


(({AN}|{MINUS}){2,15}{SP}{1}){1} works correctly
(({AN}|{MINUS}){2,15}{SP}{1}){2} does not - flex gets hung up


i've tried:
MINUS [-]
MINUS (-)
MINUS -


i've also tried:
({AN}|{MINUS}){2,15}{SP}({AN}|{MINUS}){2,15}{SP} does not work
({AN}|{MINUS}){2,15}{SP}({AN}){2,15}{SP} works


does anyone have a clue as to what i'm doing wrong?! thanks so much!


erin
[I've never had a lot of luck with named patterns. I suspect there are some
long-standing bugs that haven't been shaken out. -John]


>>


Your {SP} named pattern can match a minus sign, so it is at least
competitive with the {MINUS} named pattern. You thus have the honor
of having created a somewhat ambiguous lexer. Flex and its kin do not
diagnose this problem (because one can intend that!), but you
immediately get non-intuitive results when you venture into it.
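
The overlap can be checked mechanically. The sketch below uses Python's
re module as a stand-in for the flex character classes (an assumption:
Python is not flex, and only the single-character classes are being
compared here, not flex's matching strategy). The overlap() helper and
the printable-ASCII test alphabet are made up for illustration.

```python
import re

# Character classes copied from the quoted definitions.
SP = r"[^A-Za-z0-9 ]"
MINUS = r"-"
AN = r"[A-Za-z0-9 ]"

def overlap(pattern_a, pattern_b, alphabet):
    """Return the characters that both single-character patterns accept."""
    return [c for c in alphabet
            if re.fullmatch(pattern_a, c) and re.fullmatch(pattern_b, c)]

# Assumed test alphabet: printable ASCII.
alphabet = [chr(i) for i in range(32, 127)]

sp_minus_overlap = overlap(SP, MINUS, alphabet)
sp_an_overlap = overlap(SP, AN, alphabet)

print(sp_minus_overlap)  # ['-']  the minus sign belongs to both SP and MINUS
print(sp_an_overlap)     # []     SP and AN never compete
```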


That much is suggested by your comment that


({AN}|{MINUS}){2,15}{SP}({AN}|{MINUS}){2,15}{SP} does not work
({AN}|{MINUS}){2,15}{SP}({AN}){2,15}{SP} works


although one would have to know exactly what input is bringing about the
results you describe. It would seem that
{MINUS} is getting greedy and leaving nothing for {SP}.


So


({AN}|{MINUS}){2,15}{SP}({AN}|{MINUS}){2,15}{SP}?


or


({AN}|{MINUS}){2,15}{SP}({AN}|{MINUS}){2,15}{SP}*


_might_ work. But I would eliminate the ambiguity.


Maybe you want


SP [^-A-Za-z0-9 ]


And flex may like that better as


SP [^\-A-Za-z0-9 ]
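
As a rough check of what the revised class changes, here is a sketch
that again borrows Python's re module as a stand-in (an assumption; it
says nothing about how flex 2.5.4 itself parses the class). It compares
the original SP with the two revised spellings over printable ASCII.
The matched() helper is hypothetical.

```python
import re

SP_OLD = r"[^A-Za-z0-9 ]"       # original definition
SP_PLAIN = r"[^-A-Za-z0-9 ]"    # minus listed first, unescaped
SP_ESCAPED = r"[^\-A-Za-z0-9 ]" # minus escaped

# Assumed test alphabet: printable ASCII.
alphabet = [chr(i) for i in range(32, 127)]

def matched(pattern):
    """Set of single characters the pattern accepts."""
    return {c for c in alphabet if re.fullmatch(pattern, c)}

# Characters the old SP accepted that the revised SP rejects.
old_only = matched(SP_OLD) - matched(SP_PLAIN)
# Whether the two revised spellings accept the same characters.
same = matched(SP_PLAIN) == matched(SP_ESCAPED)

print(sorted(old_only))  # ['-']
print(same)              # True
```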


And if you do not really need two separate rules
for MINUS and AN, you can combine them with


ANM ([\-A-Za-z0-9 ])
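
With ANM and a minus-free SP, every character belongs to exactly one
class, so a two-field pattern can split the input in only one way. A
minimal sketch, again with Python's re module standing in for flex; the
sample input is invented, since the original post does not show the
actual data.

```python
import re

ANM = r"[\-A-Za-z0-9 ]"   # combined alphanumeric-or-minus class
SP = r"[^\-A-Za-z0-9 ]"   # separator class that excludes the minus sign

# Two fields, each 2 to 15 ANM characters followed by one separator.
TWO_FIELDS = re.compile(f"({ANM}{{2,15}})({SP})({ANM}{{2,15}})({SP})")

sample = "ab-cd,ef-gh;"   # hypothetical input
m = TWO_FIELDS.fullmatch(sample)
fields = m.groups() if m else None

print(fields)  # ('ab-cd', ',', 'ef-gh', ';')
```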


If you do not want to go that far, then at least try parens around AN:


AN ([A-Za-z0-9 ])


Concerning your comment that




(({AN}|{MINUS}){2,15}{SP}{1}){1} works correctly
(({AN}|{MINUS}){2,15}{SP}{1}){2} does not - flex gets hung up


This problem is harder to see, but since you are getting into multiple
occurrences, one might guess that you are hitting end of line or end of
file directly after a MINUS, and we cannot see from your post whether you
have a \n rule that could be interfering.


Presumably you have some experience, so the following is suggested just
in case you haven't considered it: you may want explicit rules for
spaces [ \t], a newline rule \n, and possibly a dot rule for everything
else, rather than having the SP rule enumerate 'other' with a negated
character class [^ ].
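
The shape of such a rule set can be sketched as follows. This is a
Python stand-in, not flex: the token names are hypothetical, and Python
alternation takes the first alternative that matches, whereas flex
takes the longest match and breaks ties by rule order. The two agree
here only because WORD, SPACE and NEWLINE accept disjoint characters
and the catch-all comes last.

```python
import re

# Hypothetical token names; the order mirrors rule order in a scanner spec.
TOKEN_SPEC = [
    ("WORD", r"[\-A-Za-z0-9]+"),  # letters, digits and minus signs
    ("SPACE", r"[ \t]+"),         # explicit rule for blanks and tabs
    ("NEWLINE", r"\n"),           # explicit newline rule
    ("OTHER", r"."),              # dot rule: any other single character
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    """Return (token name, lexeme) pairs in input order."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(text)]

tokens = tokenize("ab-cd, ef\n")
print(tokens)
# [('WORD', 'ab-cd'), ('OTHER', ','), ('SPACE', ' '), ('WORD', 'ef'), ('NEWLINE', '\n')]
```

Because every character is accepted by some rule, no input is silently
skipped, and a separator such as the comma shows up as its own OTHER
token instead of being folded into a field pattern.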

