Newsgroups: | comp.compilers |
From: | "Karsten Nyblad, TFL, Denmark" <KARSTEN@tfl.dk> |
Keywords: | pascal, parse |
Organization: | Compilers Central |
References: | <9010091533.AA02386@apple.com> |
Date: | 10 Oct 90 18:06:55 +200 |
In article <9010091533.AA02386@apple.com>, amb@apple.com (A. Michael Burbidge) writes:
> After struggling for some time to write a yacc description for the
> Pascal language and after reading the description of the modifier yacc
> contained in the UCB Pascal source directory I am beginning to wonder
> if an LR(1) parsing algorithm can parse Pascal. ...
I once wrote an LALR(1) parser for a superset of Pascal called Pascal+.
I used the ISO Pascal standard except from for the extensions. Yes you
can write an LR(1) parser, but you parser should be capable of handling
disambiguating rules or you will have problems with the ELSE keyword.
To make Pascal true LR(1) you need to make a little transformation
of the grammar. Let's make an example:
This is a little grammar like the four statements assignment,
while loop and ifthen and ifthenelse.
Stmt -> Assigment
| While_header Stmt
| IfThen_header Stmt ELSE Stmt
| IfThen_header Stmt .
This grammar is ambiguous because the parser will not know
how to parse
IfThen_header Ifthen_header Assignment ELSE Assignment
Now we transform the grammar in the following way:
We divide Stmt in two: 1) all Stmt ending on
IfThen_header Stmt and 2) all Stmt not ending on
IfThen_header:
Stmt -> NoDangling
| Dangling .
NoDangling -> Assignment
| While_header NoDangling
| IfThen_header NoDangling ELSE NoDangling .
Dangling -> While_header Dangling
| IfThen_header NoDangling ELSE Dangling
| IfThen_header Stmt .
This grammar is LALR(1) and thus LR(1).
Then about making semicolon optional. I also tried
deleting all semicolons from my grammar. You do not need
semicolons except in one case: the case statement, e.g.:
CASE expr OF
2 : a:= a ; - 3 : END
^
Try deleting the semicolon. Then the parser will not know
that the - is the start of a label until it has seen the colon.
You can let your scanner divide - and + into prefix and
infix operators by doing some lookahead in the scanner, but
take care of
writeln(5-3:3);
Karsten Nyblad
TFL, A Danish Telecommunication Research Laboratory
E-mail: karsten@tfl.dk
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.