Related articles |
---|
Is that difficult to write a Shift-Reduce parser kuangpma@gmail.com (kuangpma) (2010-04-29) |
Re: Is that difficult to write a Shift-Reduce parser kym@sdf.lonestar.org (russell kym horsell) (2010-05-01) |
Is that difficult to write a Shift-Reduce parser chakaram@auth.gr (Chariton Karamitas) (2010-05-02) |
Re: Is that difficult to write a Shift-Reduce parser rpw3@rpw3.org (2010-05-01) |
Re: Is that difficult to write a Shift-Reduce parser cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-02) |
Re: Is that difficult to write a Shift-Reduce parser dot@dotat.at (Tony Finch) (2010-05-04) |
Re: Is that difficult to write a Shift-Reduce parser sh006d3592@blueyonder.co.uk (Stephen Horne) (2010-05-07) |
Re: Is that difficult to write a Shift-Reduce parser rpw3@rpw3.org (2010-05-09) |
From: | rpw3@rpw3.org (Rob Warnock) |
Newsgroups: | comp.compilers |
Date: | Sat, 01 May 2010 21:31:49 -0500 |
Organization: | Rob Warnock, Consulting Systems Architect |
References: | 10-04-072 10-05-003 |
Keywords: | parse, history |
Posted-Date: | 02 May 2010 22:24:07 EDT |
Originator: | rpw3@rpw3.org (Rob Warnock) |
russell kym horsell <kym@sdf.lonestar.org> wrote:
+---------------
| kuangpma <kuangpma@gmail.com> wrote:
| > I am learning compiler theory, several books teach how to write (even
| > with source code) a Recursive Decent parser but not Shift Reduce
| > parser, only tell the theroy behind it. Particularly the Louden book,
| > it says that handcrafting a Shift Reduce parser is very difficult so
| > it suggests to use tools such like yacc to generate the parser.
|
| There are various shift/reduce stategies. Most modern texts probably
| assume or are locked onto LR(k) or LALR as the only S/R method around.
|
| Some old stuff is fairly easy to understand. E.g. the precedence
| grammar idea or -- even more pedagogic -- operator precedence.
|
| Usually S/R parsers have a fixed "engine" that is driven by tables.
| Making the tables compact takes up a lot of the intelligibility bandwidth.
| But some very old co-no ("current operator -- next operator") parsers(*)
| were nothing more than a fancy 2-d switch statement.
....
| (*) Yes, John, I'm talking about compilers that took 80 column cards as input.
| [Yes, I used them too. It sounds like you're talking about operator
| precedence expression parsers, which are simpler than what we usually refer
| to as shift/reduce. -John]
+---------------
One of my favorite hand-coded parser styles is the hybrid Recursive Descent/
Simple Operator Precedence parser used by the BLISS family of compilers,
which uses a variant of what Russell calls "co-no"; namely, a *four*-
lexeme[1] "window" [lexeme shift register] with the elements called
SYM (current symbol or oter value), DEL (current operator/delimiter),
FUTSYM (next SYM), and FUTDEL (next operator/delimiter). Since at this
level the grammar is Simple Operator Precedence, FUTSYM might be empty
but DEL nor FUTDEL could be [that is, there must always be at least one
delimiter between any two "symbols" (values)].
The basic lexical reader is called RUND (Read Until Next Delimiter).
A simplified version in Scheme [cribbed from some old code of mine
so apologies for any incompleteness/inconsistencies]:
(define (rund) ; Operates on globals, ugh!
(set! oldsym sym) ; Slide the window
(set! olddel del)
(set! del (read-token))
(if (token-is-del? del)
(set! sym null-token) ; Empty value between delims
(begin
(set! sym del)
(set! del (read-token))
(unless (token-is-del? del)
(error "Two adjacent non-delimeters!")))))
and the parse code for (say) binary infix operators looks
something like this:
(define (expr-infix)
(let ((mysym sym) ; My left operand
(myprio (del-prio del)) ; Me
(mycode (del-code del)))
(rund) ; Step forward
(do ()
((<= (del-prio del) myprio) ; "<=" causes left-associativity
(make-graph-lexeme mycode mysym sym))
(set! sym ((del-reduce-function del))))))
and the generic expression parser is:
(define (expression)
(do ()
((<= (del-prio del) prio-stopper) ; Right-paren, semicolon, etc.
sym)
(set! sym ((del-reduce-function del)))))
See
The "trick" to hybridizing this with recursive descent is to assign
*very* high operator priorities to the start tokens of control expressions
[e.g., "keywords" such as BEGIN, IF, WHILE, etc.] and *very* low priorities
to the ends of expressions ["stoppers" such as END, ELSE, FI, right-paren,
right-bracket, comma, semicolon, &c]. This causes the simple operator
precedence parser to quite naturally behave as a recursive descent parser
when it encounters any of those constructs. ;-}
-Rob
[1] In the BLISS parsers, a lexeme is either a terminal valued token
[literal, variable, etc.] or the result of a reduce operation
representing a value (in which case it is called a "graph table lexeme").
-----
Rob Warnock <rpw3@rpw3.org>
627 26th Avenue <URL:http://rpw3.org/>
San Mateo, CA 94403 (650)572-2607
Return to the
comp.compilers page.
Search the
comp.compilers archives again.