Set of allowable next tokens

meissner@osf.org
Fri, 11 Jan 91 22:23:48 -0500

          From comp.compilers

Related articles
Set of allowable next tokens jmr1g@ra.cs.Virginia.EDU (1991-01-11)
Set of allowable next tokens meissner@osf.org (1991-01-11)
Re: Set of allowable next tokens csr!nigelh@sol.uvic.ca (1991-01-12)
Re: Set of allowable next tokens mah@nestroy.wu-wien.ac.at (1991-01-14)
Re: Set of allowable next tokens megatest!djones@decwrl.dec.com (1991-01-16)
Re: Set of allowable next tokens megatest!djones@decwrl.dec.com (1991-01-15)
| List of all articles for this month |
Newsgroups: comp.compilers
From: meissner@osf.org
In-Reply-To: jmr1g@ra.cs.Virginia.EDU's message of 11 Jan 91 20:39:16 GMT
Keywords: yacc, parse, debug
Organization: Compilers Central
Date: Fri, 11 Jan 91 22:23:48 -0500

jmr1g@ra.cs.Virginia.EDU (Jeremiah M. Ratner) writes:


| Is there any way i can configure yacc or some other tool to tell me,
| at each step in a parse, the set of tokens that may follow the current
| token? I am currently doing this by hand, and it is, as they say,
| a tedious and error-prone process.


If you mean in a static file that you can look at, bison's -v flag can
do this. Yacc on my system at least also has a -v flag that produces
much the same information in the 'y.output' file.


If you mean a list of valid tokens at the time of a parser error,
bison 1.12 has this if you define YYERROR_VERBOSE. A reworking of
this was just submitted to the gnu newsgroups and may appear in later
revisions. You could move the code up in bison.simple so that it
occurs all the time instead of on an error.


Going back to the -v switch, consider the following grammer:


%token INT VAR


%%
expression: expr ';' ;


expr: VAR '=' expr
| add_expr ;


add_expr: add_expr add_op mult_expr
| mult_expr ;


mult_expr: mult_expr mult_op primary
| primary ;


primary: '(' expr ')'
| VAR
| INT ;


add_op: '+'
| '-' ;


mult_op: '*'
| '/' ;


Bison produces the following in the <file>.output file (the period
indicates the current location):


token types:
  type -1 is $
  type 40 is '('
  type 41 is ')'
  type 42 is '*'
  type 43 is '+'
  type 45 is '-'
  type 47 is '/'
  type 59 is ';'
  type 61 is '='
  type 256 is error
  type 258 is INT
  type 259 is VAR




state 0


        INT shift, and go to state 1
        VAR shift, and go to state 2
        '(' shift, and go to state 3


        expr go to state 4
        expression go to state 22
        add_expr go to state 5
        mult_expr go to state 6
        primary go to state 7






state 1


        primary -> INT . (rule 10)


        $default reduce using rule 10 (primary)






state 2


        expr -> VAR . '=' expr (rule 2)
        primary -> VAR . (rule 9)


        '=' shift, and go to state 8


        $default reduce using rule 9 (primary)






state 3


        primary -> '(' . expr ')' (rule 8)


        INT shift, and go to state 1
        VAR shift, and go to state 2
        '(' shift, and go to state 3


        expr go to state 9
        add_expr go to state 5
        mult_expr go to state 6
        primary go to state 7






state 4


        expression -> expr . ';' (rule 1)


        ';' shift, and go to state 10






state 5


        expr -> add_expr . (rule 3)
        add_expr -> add_expr . add_op mult_expr (rule 4)


        '+' shift, and go to state 11
        '-' shift, and go to state 12


        $default reduce using rule 3 (expr)


        add_op go to state 13






state 6


        add_expr -> mult_expr . (rule 5)
        mult_expr -> mult_expr . mult_op primary (rule 6)


        '*' shift, and go to state 14
        '/' shift, and go to state 15


        $default reduce using rule 5 (add_expr)


        mult_op go to state 16






state 7


        mult_expr -> primary . (rule 7)


        $default reduce using rule 7 (mult_expr)






state 8


        expr -> VAR '=' . expr (rule 2)


        INT shift, and go to state 1
        VAR shift, and go to state 2
        '(' shift, and go to state 3


        expr go to state 17
        add_expr go to state 5
        mult_expr go to state 6
        primary go to state 7






state 9


        primary -> '(' expr . ')' (rule 8)


        ')' shift, and go to state 18






state 10


        expression -> expr ';' . (rule 1)


        $default reduce using rule 1 (expression)






state 11


        add_op -> '+' . (rule 11)


        $default reduce using rule 11 (add_op)






state 12


        add_op -> '-' . (rule 12)


        $default reduce using rule 12 (add_op)






state 13


        add_expr -> add_expr add_op . mult_expr (rule 4)


        INT shift, and go to state 1
        VAR shift, and go to state 19
        '(' shift, and go to state 3


        mult_expr go to state 20
        primary go to state 7






state 14


        mult_op -> '*' . (rule 13)


        $default reduce using rule 13 (mult_op)






state 15


        mult_op -> '/' . (rule 14)


        $default reduce using rule 14 (mult_op)






state 16


        mult_expr -> mult_expr mult_op . primary (rule 6)


        INT shift, and go to state 1
        VAR shift, and go to state 19
        '(' shift, and go to state 3


        primary go to state 21






state 17


        expr -> VAR '=' expr . (rule 2)


        $default reduce using rule 2 (expr)






state 18


        primary -> '(' expr ')' . (rule 8)


        $default reduce using rule 8 (primary)






state 19


        primary -> VAR . (rule 9)


        $default reduce using rule 9 (primary)






state 20


        add_expr -> add_expr add_op mult_expr . (rule 4)
        mult_expr -> mult_expr . mult_op primary (rule 6)


        '*' shift, and go to state 14
        '/' shift, and go to state 15


        $default reduce using rule 4 (add_expr)


        mult_op go to state 16






state 21


        mult_expr -> mult_expr mult_op primary . (rule 6)


        $default reduce using rule 6 (mult_expr)






state 22


        $ shift, and go to state 23






state 23


        $ shift, and go to state 24






state 24


        NO ACTIONS




Quoting from the manual:


File: bison.info, Node: Debugging, Next: Invocation,
Prev: Context Dependency, Up: Top


Debugging Your Parser
*********************


If a Bison grammar compiles properly but doesn't do what you want
when it runs, the `yydebug' parser-trace feature can help you figure
out why.


To enable compilation of trace facilities, you must define the macro
`YYDEBUG' when you compile the parser. You could use `-DYYDEBUG=1'
as a compiler option or you could put `#define YYDEBUG 1' in the C
declarations section of the grammar file (*note C Declarations::.).
Alternatively, use the `-t' option when you run Bison (*note
Invocation::.). We always define `YYDEBUG' so that debugging is
always possible.


The trace facility uses `stderr', so you must add
`#include <stdio.h>' to the C declarations section unless it is
already there.


Once you have compiled the program with trace facilities, the way to
request a trace is to store a nonzero value in the variable `yydebug'.
You can do this by making the C code do it (in `main', perhaps), or
you can alter the value with a C debugger.


Each step taken by the parser when `yydebug' is nonzero produces a
line or two of trace information, written on `stderr'. The trace
messages tell you these things:


      * Each time the parser calls `yylex', what kind of token was read.


      * Each time a token is shifted, the depth and complete contents of
          the state stack (*note Parser States::.).


      * Each time a rule is reduced, which rule it is, and the complete
          contents of the state stack afterward.


To make sense of this information, it helps to refer to the listing
file produced by the Bison `-v' option (*note Invocation::.). This
file shows the meaning of each state in terms of positions in various
rules, and also what each state will do with each possible input
token. As you read the successive trace messages, you can see that
the parser is functioning according to its specification in the
listing file. Eventually you will arrive at the place where
something undesirable happens, and you will see which parts of the
grammar are to blame.


The parser file is a C program and you can use C debuggers on it, but
it's not easy to interpret what it is doing. The parser function is
a finite-state machine interpreter, and aside from the actions it
executes the same code over and over. Only the values of variables
show where in the grammar it is working.


....




`-v'
`+verbose'
          Write an extra output file containing verbose descriptions of
          the parser states and what is done for each type of look-ahead
          token in that state.


          This file also describes all the conflicts, both those resolved
          by operator precedence and the unresolved ones.


          The file's name is made by removing `.tab.c' or `.c' from the
          parser output file name, and adding `.output' instead.


          Therefore, if the input file is `foo.y', then the parser file is
          called `foo.tab.c' by default. As a consequence, the verbose
          output file is called `foo.output'.


--
Michael Meissner email: meissner@osf.org phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.