Re: A minimal LL(1) parser generator ?

anton@mips.complang.tuwien.ac.at (Anton Ertl)
Wed, 22 Jan 2020 17:12:22 GMT

From comp.compilers

Related articles
[9 earlier articles]
Re: A minimal LL(1) parser generator ? gneuner2@comcast.net (George Neuner) (2020-01-02)
Re: A minimal LL(1) parser generator ? rockbrentwood@gmail.com (2020-01-04)
Re: A minimal LL(1) parser generator ? gaztoast@gmail.com (honey crisis) (2020-01-05)
Re: A minimal LL(1) parser generator ? carlglassberg@gmail.com (2020-01-05)
Re: A minimal LL(1) parser generator ? carlglassberg@gmail.com (2020-01-05)
Re: A minimal LL(1) parser generator ? carlglassberg@gmail.com (2020-01-05)
*Re: A minimal LL(1) parser generator ? anton@mips.complang.tuwien.ac.at* (2020-01-22)**
Re: A minimal LL(1) parser generator ? anton@mips.complang.tuwien.ac.at (2020-01-22)
Re: A minimal LL(1) parser generator ? carlglassberg@gmail.com (2020-01-23)
Re: A minimal LL(1) parser generator ? anton@mips.complang.tuwien.ac.at (2020-01-25)
Re: A minimal LL(1) parser generator ? FredJScipione@alum.RPI.edu (Fred J. Scipione) (2020-01-25)

| List of all articles for this month |

From:	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups:	comp.compilers
Date:	Wed, 22 Jan 2020 17:12:22 GMT
Organization:	Institut fuer Computersprachen, Technische Universitaet Wien
References:	19-12-016 19-12-030 19-12-032 19-12-040 20-01-001 20-01-003 20-01-008
Injection-Info:	gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="27414"; mail-complaints-to="abuse@iecc.com"
Keywords:	parse, syntax
Posted-Date:	22 Jan 2020 16:10:13 EST

carlglassberg@gmail.com writes:
[Applied the corrections from later postings]
>On Thursday, January 2, 2020 at 10:21:53 AM UTC-8, Anton Ertl wrote:
>...
>> (( a b && c d && ))
>
>1. For concatenation of these, I would write, in Gray:
> ((a b &&))((c d &&)) for EBNF: a { b a } c { d c }
>because wouldn't
> ((a b && c d &&))
>be the same as?
> a { b a } c { d a { b a } c }

In Gray, you cannot concatenate x and y by writing them adjacent to
one another, so

a b && c

are just two unrelated grammar expressions: "a b &&" and "c". If you continue that with

a b && c d

you now have three unrelated grammar expressions. Now apply the &&
operator:

a b && c d &&

and you have two unrelated grammar expressions: "a b &&" and "c d &&".
In order to combine them into a concatenatio, you normally surround
them with (( ... )):

(( a b && c d && ))

(or alternatively, you use the binary postfix operator "concat"). Likewise,

(( a b && )) (( c d && ))

would be two unrelated grammar expressions.

Also, note that each word (including grammar operators) has to be
separated from adjacent words with white space.

>2. Let us compare separator-list notation.
>
>In Waite and Goos, "Compiler Construction", there is an infix operator, "||", for separator list, not to be confused with Gray's meta-symbol for "or".
>And Eiffel has a special notation for separator list.
>
>In comparison, Anton's Gray has a postfix operator, "&&", for separator-list.
>
>/Comparison of separator-list notation in
>Waite/Goos EBNF, Eiffel EBNF (BNF-E), and Gray EBNF/
>
> Wirth EBNF Gray EBNF:
>-------------------------------------------------------------------
>Waite/Goos: x || y ==========> x { y x } x y &&
>
>In Eiffel: { x y ...}* ===> [ x { y x } ] (( x y && )) ??
> { x y ...}+ ===> x { y x } x y &&
>-------------------------------------------------------------------

(( x y && )) ??

is equivalent to

x y && ??

>I do wish Gray had a production rule terminator, say semicolon ";" or stop "."
>
>The end of 1 rule would be clearly distinguishable from the beginning of the next without special processing of end-of-line which makes scanning and
>parsing space sensitive. We would avoid the need for LL(2) parsing or other
>parsing and scanning techniques.

Gray just uses Forth's parser (which just scans for white space, i.e.,
as far as the parser is concerned, it's a regular, not a context-free
language). The rest works by putting grammar expressions on the
stack, and having grammar operators that take grammar expressions from
the stack and put the resulting grammar expression on the stack;
that's why the operators are postfix. The

(( ... || ... || ... ))

syntax is syntactic sugar for a more readable syntax than using the
binary postfix operators "concat" and "alt". If && had been part of
the original concept instead of an afterthought that demonstrates the
extensibility, maybe I would have had syntactic sugar for that, too.
Even as a Forth programmer, I admit that binary postfix operators are
not very readable when expressions with three or more binary operators
are involved (which is rare in most programming, but pretty common in
grammars).

Back to the question of production rule terminators: As a result of
the above, a rule typically reads

(( ... )) <- nonterminal

so it's pretty clear to the system and the programmer where a rule
starts and where it ends. However, there is little error checking
involved, and if you write

(( ... (( ... )) <- nonterminal

(i.e., if you forget to write all the necessary closing "))"), you
will get stuff left over on the stack and the nonterminal will not do
what you want. While the compiler does not flag that as error, this
error is easy to notice and find in testing on typical Forth systems
(and it's something you could also get with other Forth code).

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: A minimal LL(1) parser generator ?

anton@mips.complang.tuwien.ac.at (Anton Ertl)Wed, 22 Jan 2020 17:12:22 GMT

anton@mips.complang.tuwien.ac.at (Anton Ertl)
Wed, 22 Jan 2020 17:12:22 GMT