Re: byacc help needed

bliss@sp64.csrd.uiuc.edu (Brian Bliss)
Mon, 23 Nov 1992 18:04:36 GMT

          From comp.compilers

Related articles
byacc help needed thewalt@ce.Berkeley.EDU (1992-11-22)
Re: byacc help needed bliss@sp64.csrd.uiuc.edu (1992-11-23)
Newsgroups: comp.compilers
From: bliss@sp64.csrd.uiuc.edu (Brian Bliss)
Organization: UIUC Center for Supercomputing Research and Development
Date: Mon, 23 Nov 1992 18:04:36 GMT
References: 92-11-127
Keywords: yacc, errors

thewalt@ce.Berkeley.EDU (C. Thewalt) writes:
|> It appears that the parse state in y.tab.c is controlled by a few scalar
|> variables. If this is the case, it would seem possible to squirrel away
|> copies and do a setjmp whenever we are in an acceptable state, and when
|> syntax errors occur do a longjmp back to the saved state and reset the
|> variables. ...


|> [It might be possible, but I'd think it'd be a lot easier to use the yacc
|> ``error'' token.


Here is code to do it in bison, which also works with regular yacc; the
internal vars all have the same names in both. I wouldn't be surprised if
this works fine with byacc, too (but certainly check a generated source
file to make sure I haven't missed any byacc-specific internal vars):


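#include <setjmp.h>

typedef struct MY_JMP_BUF {
      jmp_buf buf;
      union STACK *semantic_stack_next; /* I didn't use yacc's semantic stack */
      int level;
      int block_no;

      /* the rest are yacc internal vars */
      int state;      /* yystate */
      int n;          /* yyn */
      short *ssp;     /* yyssp: top of the state stack */
      int *vsp;       /* yyvsp: top of the value stack (YYSTYPE is int here) */
      short *ss;      /* yyss: base of the state stack */
      int *vs;        /* yyvs: base of the value stack */
      int len;        /* yylen */
} my_jmp_buf;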


#define parse_setjmp(_val,_buffer) \
{ \
      (_buffer).semantic_stack_next = semantic_stack_next; \
      (_buffer).level = level; \
      (_buffer).block_no = block_no; \
      (_buffer).state = yystate; \
      (_buffer).n = yyn; \
      (_buffer).ssp = yyssp; \
      (_buffer).vsp = yyvsp; \
      (_buffer).ss = yyss; \
      (_buffer).vs = yyvs; \
      (_buffer).len = yylen; \
      (_val) = setjmp ((_buffer).buf); \
      if ((_val) != 0) { \
            semantic_stack_next = (_buffer).semantic_stack_next; \
            yystate = (_buffer).state; \
            yyn = (_buffer).n; \
            yyssp = (_buffer).ssp; \
            yyvsp = (_buffer).vsp; \
            yyss = (_buffer).ss; \
            yyvs = (_buffer).vs; \
            yylen = (_buffer).len; \
            yynerrs = 0; \
      } \
}


/*
this must be coded as a macro; a subroutine would return and the
jmp_buf would no longer contain info for a valid stack frame.
*/


#define parse_longjmp(_val,_buffer) \
longjmp ((_buffer).buf, (_val))


/*
generally, you #define (or code) yyerror () to print out "syntax error",
and then do a longjmp.
*/
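To make the save/restore idea concrete outside of a generated parser, here is a minimal sketch in plain C. Everything named toy_* is a hypothetical stand-in (toy_state and toy_sp play the roles of yystate and yyssp, toy_error plays the role of a yyerror that does the longjmp); it is not code from any yacc skeleton, only an illustration of the pattern under those assumptions.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical stand-ins for yacc's internal variables. */
static int    toy_state;                 /* plays the role of yystate */
static short  toy_stack[16];             /* plays the role of yyss    */
static short *toy_sp = toy_stack;        /* plays the role of yyssp   */

typedef struct {
      jmp_buf buf;
      int     state;                     /* saved toy_state */
      short  *sp;                        /* saved toy_sp    */
} toy_jmp_buf;

/* Statically allocated, so it outlives the block that fills it in. */
static toy_jmp_buf checkpoint;
static int recoveries;
static int errors_to_inject;

/* Plays the role of a yyerror () that jumps back to the checkpoint. */
static void toy_error (void)
{
      longjmp (checkpoint.buf, 1);
}

/* "Parses" once, injecting n_errors simulated syntax errors.
   Returns how many times the checkpoint was restored. */
int toy_parse (int n_errors)
{
      errors_to_inject = n_errors;
      recoveries = 0;
      toy_state = 1;
      toy_sp = toy_stack;

      /* save the state we want to come back to */
      checkpoint.state = toy_state;
      checkpoint.sp = toy_sp;
      if (setjmp (checkpoint.buf) != 0) {
            /* arrived here via longjmp: put the saved values back */
            toy_state = checkpoint.state;
            toy_sp = checkpoint.sp;
            recoveries++;
      }

      /* simulate two shifts: the stack only grows past the checkpoint */
      *toy_sp++ = (short) toy_state;  toy_state = 2;
      *toy_sp++ = (short) toy_state;  toy_state = 3;

      if (errors_to_inject > 0) {
            errors_to_inject--;
            toy_error ();                /* does not return */
      }

      assert (toy_sp - toy_stack == 2);  /* no leftovers from failed tries */
      return recoveries;
}
```

Calling toy_parse (2) goes through the restore branch twice and still finishes with exactly two entries on the toy stack, which is the property the real macro is after. The counters are static rather than local because locals modified between the setjmp and the longjmp are not guaranteed to keep their values.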




points to keep in mind:


1) between the time you perform the setjmp and longjmp, the semantic stack
      (or any of yacc's internal stacks) must not shrink below the level it
      was at when the setjmp was performed. actually, since we've already
      stored the action to be performed by yacc after executing this embedded
      code, you can cheat and allow a reduction right after the setjmp, but
      the stack may not shrink further than it does then. e.g.


      stmt: empty {
                      parse_setjmp (val, buffer);
                      if (val) { recover_from_error; }
                  } stmt2;
      stmt2: for_stmt | dec_stmt ...


      works fine, as does (now we're cheating):


      stmt: do_setjmp stmt2;
      do_setjmp: empty {
                                parse_setjmp (val, buffer);
                                if (val) { recover_from_error; }
                            };
      stmt2: for_stmt | dec_stmt ...


2) since the block of code where the setjmp was performed is (normally)
      exited before the longjmp takes place, neither val (in the above example)
      nor the jmp buffer itself may be declared local to the block where the
      setjmp is performed. you can either declare them local to yyparse(),
      or else statically allocate them.


3) note we did not store the value of yychar, the input token.
      the following code would replace recover_from_error in the above
      example, if we were writing a C compiler, and wished to scan ahead
      to the next statement in the current block (you should also check
      for '{' and '}'):


            int nest = 0;
            while ((yychar = yylex ()) != 0) {
                  if (yychar == ')') {
                        nest--;
                  }
                  else if (yychar == '(') {
                        nest++;
                  }
                  else if ((yychar == ';') && (nest <= 0)) {
                        break;
                  }
            }
            if (yychar == 0) YYABORT;   /* hit end of input: make yyparse return 1 */


4) this has several advantages over yacc's error recovery mechanism:
      you know exactly what state you are resetting the parser to, you
      can jump to different states depending upon what tokens you find
      in the input stream, and I've found it easy enough to use that I've
      never had occasion to use yacc's built-in error token in a "real"
      parser.
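The token-skipping loop from point 3, extended with the brace check it mentions, can be tried out standalone. This is only a sketch: toy_lex is a hypothetical stand-in for yylex () that hands back one character of a string at a time, and the way braces are handled (skip over nested blocks, stop at a '}' that closes the enclosing block) is an assumption of mine, not something spelled out above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token source standing in for yylex (): returns the
   characters of toy_input one by one, then 0 at end of input. */
static const char *toy_input = "";
static size_t toy_pos;

static int toy_lex (void)
{
      int c = (unsigned char) toy_input[toy_pos];
      if (c != 0) {
            toy_pos++;
      }
      return c;
}

/* Skips ahead to the end of the current statement.
   Returns the token it stopped on:
     ';'  a statement terminator outside all parens and nested braces
     '}'  the brace that closes the enclosing block (assumed policy)
     0    end of input */
int skip_to_statement_end (const char *input)
{
      int tok;
      int paren = 0;
      int brace = 0;

      toy_input = input;
      toy_pos = 0;
      while ((tok = toy_lex ()) != 0) {
            if (tok == '(') {
                  paren++;
            }
            else if (tok == ')') {
                  paren--;
            }
            else if (tok == '{') {
                  brace++;
            }
            else if (tok == '}') {
                  if (brace == 0) {
                        break;          /* leaving the current block */
                  }
                  brace--;
            }
            else if ((tok == ';') && (paren <= 0) && (brace == 0)) {
                  break;
            }
      }
      return tok;
}
```

In a real parser the returned token would be left in yychar, so that after the longjmp the parser resumes with the ';' (or '}') as its lookahead; a return of 0 corresponds to the give-up case at the end of the loop in point 3.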


bb



