10 Aug 2003 22:49:36 -0400

From: | rkrayhawk@aol.com (RKRayhawk) |

Newsgroups: | comp.compilers |

Date: | 10 Aug 2003 22:49:36 -0400 |

Organization: | AOL http://www.aol.com |

References: | 03-08-035 |

Keywords: | lex |

Posted-Date: | 10 Aug 2003 22:49:36 EDT |

erinf@jobsoft.com (Erin) on 8/10/03 10:00 AM EST

posted these flex code snippets

<<

i'm using flex v. 2.5.4 and i've run into issues with matching a literal minus

sign.

code snippet:

SP [^A-Za-z0-9 ]

MINUS ("-")

AN [A-Za-z0-9 ]

(({AN}|{MINUS}){2,15}{SP}{1}){1} works correctly

(({AN}|{MINUS}){2,15}{SP}{1}){2} does not - flex gets hung up

i've tried:

MINUS [-]

MINUS (-)

MINUS -

i've also tried:

({AN}|{MINUS}){2,15}{SP}({AN}|{MINUS}){2,15}{SP} does not work

({AN}|{MINUS}){2,15}{SP}({AN}){2,15}{SP} works

does anyone have a clue as to what i'm doing wrong?! thanks so much!

erin

[I've never had a lot of luck with named patterns. I suspect there are some

long-standing bugs that haven't been shaken out. -John]

>>

Your {SP} named pattern can match a minus sign, so it is at least

competitive with the {MINUS} named pattern. You thus have the honor

of having created a somewhat ambiguous lexer. Flex and its kin do not

diagnose this problem (because one can intend it!), but the results

quickly become non-intuitive once you venture into that territory.

That much is suggested by your comment that

({AN}|{MINUS}){2,15}{SP}({AN}|{MINUS}){2,15}{SP} does not work

({AN}|{MINUS}){2,15}{SP}({AN}){2,15}{SP} works

although one would have to know exactly what input is bringing about your

described results. It would seem that

{MINUS} is getting greedy and leaving nothing for {SP}.

So

({AN}|{MINUS}){2,15}{SP}({AN}|{MINUS}){2,15}{SP}?

or

({AN}|{MINUS}){2,15}{SP}({AN}|{MINUS}){2,15}{SP}*

_might_ work. But I would eliminate the ambiguity.
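
The overlap between {SP} and {MINUS} can be checked outside flex. The following is a minimal sketch in Python, not flex: it assumes only that these simple character classes mean the same thing in Python's re module as in flex, and the helper name classes_matching is made up for illustration. It says nothing about how flex's own matcher resolves the overlap.

```python
import re

# Character classes copied from the quoted flex definitions.
SP = r"[^A-Za-z0-9 ]"
MINUS = r"-"
AN = r"[A-Za-z0-9 ]"

def classes_matching(ch):
    """Return the names of the definitions that accept the single character ch."""
    defs = {"SP": SP, "MINUS": MINUS, "AN": AN}
    return [name for name, pat in defs.items() if re.fullmatch(pat, ch)]

# The minus sign is accepted by both SP and MINUS, which is the ambiguity.
print(classes_matching("-"))
print(classes_matching("a"))
print(classes_matching(","))
```

Any character reported under more than one name is a place where the lexer has a choice to make.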

Maybe you want

SP [^-A-Za-z0-9 ]

And flex may like that better as

SP [^\-A-Za-z0-9 ]

And if you do not really need two separate rules

for MINUS and AN, you can combine them with

ANM ([\-A-Za-z0-9 ])

If you do not want to go that far, then at least try parens around AN

AN ([A-Za-z0-9 ])
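
As a rough check of the revised definitions, here is a sketch in Python (again not flex) that uses the suggested SP and ANM classes. The sample inputs in the usage note and the helper names overlaps and is_pair are hypothetical, since the original post does not show the actual input.

```python
import re

# Revised classes from the suggestion above: SP no longer admits '-'.
SP = r"[^\-A-Za-z0-9 ]"
ANM = r"[\-A-Za-z0-9 ]"

# Python spelling of ({ANM}{2,15}{SP}){2}
PAIR = re.compile(rf"(?:{ANM}{{2,15}}{SP}){{2}}")

def overlaps(ch):
    """True if the single character ch is accepted by both SP and ANM."""
    return bool(re.fullmatch(SP, ch)) and bool(re.fullmatch(ANM, ch))

def is_pair(text):
    """True if text consists of exactly two ANM runs, each closed by one SP."""
    return PAIR.fullmatch(text) is not None

# No printable ASCII character belongs to both classes any more.
print(any(overlaps(chr(c)) for c in range(32, 127)))
```

With these classes, is_pair("ab-cd,ef-gh;") holds, while is_pair("ab-cd-ef-gh-") does not, because a minus can no longer stand in for {SP}.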

Concerning your comment that

(({AN}|{MINUS}){2,15}{SP}{1}){1} works correctly

(({AN}|{MINUS}){2,15}{SP}{1}){2} does not - flex gets hung up

This problem is harder to see, but since you are getting into multiple

occurrences, one might guess that you are finding end of line or end of

file directly after a MINUS, and we cannot see from your post whether you

have a \n rule which could be interfering.

It can be assumed that you have some experience, and the following is

suggested just in case you haven't considered it: you may want to

have explicit rules for spaces [ \t], a newline rule \n, and

possibly a dot rule for all else, rather than the SP rule enumerating

'other' by a negated character class [^ ].
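
The rule layout suggested above (spaces, newline, and a catch-all dot) can be sketched as a small Python tokenizer. This is an illustration under assumptions, not a flex specification: the rule names WORD, SPACE, NEWLINE and OTHER are invented here, WORD deliberately leaves the space out of the alphanumeric class, and Python tries the alternatives in order, whereas flex picks the longest match and uses rule order only to break ties. For these particular rules the two strategies should agree.

```python
import re

# One alternative per rule, listed in priority order; OTHER is the catch-all.
RULES = [
    ("WORD", r"[\-A-Za-z0-9]+"),   # alphanumerics and minus
    ("SPACE", r"[ \t]+"),          # explicit rule for blanks and tabs
    ("NEWLINE", r"\n"),            # explicit newline rule
    ("OTHER", r"."),               # dot rule: any other single character
]
TOKEN = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in RULES))

def tokenize(text):
    """Split text into (rule name, lexeme) pairs."""
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(text)]

for kind, lexeme in tokenize("ab-cd, ef\n"):
    print(kind, repr(lexeme))
```

Because every character falls under exactly one of the four rules, nothing has to be described by exclusion.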

