|Is that difficult to write a Shift-Reduce parser email@example.com (kuangpma) (2010-04-29)|
|Re: Is that difficult to write a Shift-Reduce parser firstname.lastname@example.org (russell kym horsell) (2010-05-01)|
|Is that difficult to write a Shift-Reduce parser email@example.com (Chariton Karamitas) (2010-05-02)|
|Re: Is that difficult to write a Shift-Reduce parser firstname.lastname@example.org (2010-05-01)|
|Re: Is that difficult to write a Shift-Reduce parser cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-02)|
|Re: Is that difficult to write a Shift-Reduce parser email@example.com (Tony Finch) (2010-05-04)|
|Re: Is that difficult to write a Shift-Reduce parser firstname.lastname@example.org (Stephen Horne) (2010-05-07)|
|Re: Is that difficult to write a Shift-Reduce parser email@example.com (2010-05-09)|
|From:||firstname.lastname@example.org (Rob Warnock)|
|Date:||Sat, 01 May 2010 21:31:49 -0500|
|Organization:||Rob Warnock, Consulting Systems Architect|
|Posted-Date:||02 May 2010 22:24:07 EDT|
|Originator:||email@example.com (Rob Warnock)|
russell kym horsell <firstname.lastname@example.org> wrote:
| kuangpma <email@example.com> wrote:
| > I am learning compiler theory, several books teach how to write (even
| > with source code) a Recursive Decent parser but not Shift Reduce
| > parser, only tell the theroy behind it. Particularly the Louden book,
| > it says that handcrafting a Shift Reduce parser is very difficult so
| > it suggests to use tools such like yacc to generate the parser.
| There are various shift/reduce stategies. Most modern texts probably
| assume or are locked onto LR(k) or LALR as the only S/R method around.
| Some old stuff is fairly easy to understand. E.g. the precedence
| grammar idea or -- even more pedagogic -- operator precedence.
| Usually S/R parsers have a fixed "engine" that is driven by tables.
| Making the tables compact takes up a lot of the intelligibility bandwidth.
| But some very old co-no ("current operator -- next operator") parsers(*)
| were nothing more than a fancy 2-d switch statement.
| (*) Yes, John, I'm talking about compilers that took 80 column cards as input.
| [Yes, I used them too. It sounds like you're talking about operator
| precedence expression parsers, which are simpler than what we usually refer
| to as shift/reduce. -John]
One of my favorite hand-coded parser styles is the hybrid Recursive Descent/
Simple Operator Precedence parser used by the BLISS family of compilers,
which uses a variant of what Russell calls "co-no"; namely, a *four*-
lexeme "window" [lexeme shift register] with the elements called
SYM (current symbol or oter value), DEL (current operator/delimiter),
FUTSYM (next SYM), and FUTDEL (next operator/delimiter). Since at this
level the grammar is Simple Operator Precedence, FUTSYM might be empty
but DEL nor FUTDEL could be [that is, there must always be at least one
delimiter between any two "symbols" (values)].
The basic lexical reader is called RUND (Read Until Next Delimiter).
A simplified version in Scheme [cribbed from some old code of mine
so apologies for any incompleteness/inconsistencies]:
(define (rund) ; Operates on globals, ugh!
(set! oldsym sym) ; Slide the window
(set! olddel del)
(set! del (read-token))
(if (token-is-del? del)
(set! sym null-token) ; Empty value between delims
(set! sym del)
(set! del (read-token))
(unless (token-is-del? del)
(error "Two adjacent non-delimeters!")))))
and the parse code for (say) binary infix operators looks
something like this:
(let ((mysym sym) ; My left operand
(myprio (del-prio del)) ; Me
(mycode (del-code del)))
(rund) ; Step forward
((<= (del-prio del) myprio) ; "<=" causes left-associativity
(make-graph-lexeme mycode mysym sym))
(set! sym ((del-reduce-function del))))))
and the generic expression parser is:
((<= (del-prio del) prio-stopper) ; Right-paren, semicolon, etc.
(set! sym ((del-reduce-function del)))))
The "trick" to hybridizing this with recursive descent is to assign
*very* high operator priorities to the start tokens of control expressions
[e.g., "keywords" such as BEGIN, IF, WHILE, etc.] and *very* low priorities
to the ends of expressions ["stoppers" such as END, ELSE, FI, right-paren,
right-bracket, comma, semicolon, &c]. This causes the simple operator
precedence parser to quite naturally behave as a recursive descent parser
when it encounters any of those constructs. ;-}
 In the BLISS parsers, a lexeme is either a terminal valued token
[literal, variable, etc.] or the result of a reduce operation
representing a value (in which case it is called a "graph table lexeme").
Rob Warnock <firstname.lastname@example.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
Return to the
Search the comp.compilers archives again.