Re: Simple question on lex/yacc specifications

Eric Fowler <eric.fowler@gmail.com>
Sat, 14 Mar 2009 17:15:24 -0700

          From comp.compilers


From: Eric Fowler <eric.fowler@gmail.com>
Newsgroups: comp.compilers
Date: Sat, 14 Mar 2009 17:15:24 -0700
Organization: Compilers Central
References: 09-03-058
Keywords: yacc, comment
Posted-Date: 14 Mar 2009 20:56:18 EDT

Mmmm, yeah, but a little more detail would help.


Here is what is NOT working (cut down):


foo :
        FOO COMMA opt_decimalnum {printf("YACC saw FOO!");}
        ;

decimalnum:
        DECIMALNUM COMMA
        ;

opt_decimalnum:
        decimalnum
        | COMMA
        ;


Where my lexer defines DECIMALNUM and COMMA as you might expect, and
tokenizes them as I want it to.


The lexer sees
FOO,,
as FOO COMMA COMMA
and yacc sees it as FOO. All is well, but ...


FOO,1,


gets "syntax error" printed four times, but the lexer subsequently sees it as
FOO COMMA DECIMALNUM COMMA
which is fine. Yacc does not recognize it.


Obviously I am doing something fundamentally wrong.


I tried transposing the rules for decimalnum and opt_decimalnum in the
yacc rules file, and it inverted the outcome: the string with the empty
number field is no longer recognized, but a string with digits is.


Eric


On Fri, Mar 13, 2009 at 9:14 PM, Eric Fowler <eric.fowler@gmail.com> wrote:
> This should be pretty easy: I am a relative newbie to lex & yacc, and
> I am writing a parser for NMEA strings as a toy project. NMEA strings
> are output by marine electronic equipment, and the ones I care about
> now look like this:
>
> !AIVDM,1,1,,B,15N@wP0P00o?ruLK?UMMbOw>04KH,0*31
> !AIVDM,1,1,,B,15Mj2u001vo?tV8K?<ub>8;@0D1<,0*17
> !AIVDM,2,1,3,B,55P5TL01VIaAL@7WKO@mBplU@<PDhh000000001S;AJ::4A80?4i@E53,0*3E
> !AIVDM,2,2,3,B,1@0000000000000,2*55
> !AIVDM,1,1,,B,15N:bCPP01G?jPfKADGUvww>2<17,0*4D
>
> So they are pretty simple, and I have a pretty good handle on parsing them.
>
> The problem is that fourth comma-delimited field, which can hold
> either a one-digit decimal, a two-digit decimal, or be empty: "...
> ,1,..." or "..,22,..." or "...,,..." .
>
> How do I recognize an empty field? There are a lot of fairly
> trivial ways to do this, but as far as I can see all would trip up by
> recognizing everything, or would hose yaccability by forcing me to
> include the comma delimiter as part of the token (so the token is
> returned only if it has a comma tacked on the end of it ... yecch ...).


[Try this:
foo :
        FOO COMMA opt_decimalnum COMMA {printf("YACC saw FOO!");}
        ;

opt_decimalnum:
        DECIMALNUM
        | /* nothing */
        ;


Empty RHS rules are valid and useful in a situation like this. -John]
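
To make the effect of the empty alternative concrete, here is a rough hand-written C sketch of the same grammar. It is not what yacc generates, only an illustration of the idea: after FOO COMMA, one token of lookahead decides whether opt_decimalnum matched a DECIMALNUM or matched nothing, and the closing COMMA is required either way. The token codes, the scanner struct, and the function names next_token and parse_foo are all assumptions made up for this example; a real build would take its token codes from the yacc-generated header and its tokens from the lex scanner.

```c
#include <ctype.h>

/* Hypothetical token codes for this sketch only. */
enum token { TOK_END = 0, TOK_FOO, TOK_COMMA, TOK_DECIMALNUM, TOK_ERROR };

/* Cursor over the input string (illustrative stand-in for the lex scanner). */
struct scanner { const char *p; };

/* Return the next token. The comma is always its own token and is never
   glued onto a number, so an empty field is simply two commas in a row. */
static enum token next_token(struct scanner *s)
{
    if (*s->p == '\0')
        return TOK_END;
    if (*s->p == ',') {
        s->p++;
        return TOK_COMMA;
    }
    if (s->p[0] == 'F' && s->p[1] == 'O' && s->p[2] == 'O') {
        s->p += 3;
        return TOK_FOO;
    }
    if (isdigit((unsigned char)*s->p)) {
        while (isdigit((unsigned char)*s->p))
            s->p++;
        return TOK_DECIMALNUM;
    }
    return TOK_ERROR;
}

/* foo : FOO COMMA opt_decimalnum COMMA ;
   Returns 1 if the whole input matches, 0 on a syntax error. */
int parse_foo(const char *input)
{
    struct scanner s = { input };
    enum token t;

    if (next_token(&s) != TOK_FOO)
        return 0;
    if (next_token(&s) != TOK_COMMA)
        return 0;

    t = next_token(&s);
    /* opt_decimalnum : DECIMALNUM | nothing
       If the lookahead is not a DECIMALNUM, the empty alternative applies
       and no input is consumed for it. */
    if (t == TOK_DECIMALNUM)
        t = next_token(&s);

    if (t != TOK_COMMA)
        return 0;
    return next_token(&s) == TOK_END;
}
```

Under these assumptions, parse_foo accepts "FOO,," and "FOO,1," alike and rejects "FOO,1", where the closing comma is missing. The lexer never has to return a number with a comma attached; the grammar alone handles the empty field.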

