Re: Supporting multiple input syntaxes

antispam@math.uni.wroc.pl
Thu, 11 Feb 2021 23:27:43 +0000 (UTC)

          From comp.compilers


From: antispam@math.uni.wroc.pl
Newsgroups: comp.compilers
Date: Thu, 11 Feb 2021 23:27:43 +0000 (UTC)
Organization: Politechnika Wroclawska
References: 20-08-002
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="1495"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, design
Posted-Date: 11 Feb 2021 19:05:51 EST

luser droog <mijoryx@yahoo.com.dmarc.email.dmarc.email> wrote:
> I've got my project successfully parsing the circa-1975 C syntax
> from that old manual. I'd like to add parsers for K&R1 and c90
> syntaxes.
>
> How separate should these be? Should they be complete
> separate grammars, or more piecewise selection?
>
> My feeling is that separating them will be less headache, but maybe
> there's some advantage to changing out smaller pieces of the grammar
> in that it might be easier to make sure that they produce the same
> structure compatible with the backend.
>
> Any guidance in this area?
>
> https://github.com/luser-dr00g/pcomb/blob/master/pc9syn.c
>
> [Really, it's up to you. My inclination would be to make them
> separate but use some sort of macro setup so you can insert
> common pieces into each of the grammars. -John]


GNU Pascal supports several Pascal dialects. GNU Pascal uses a
unified parser for all dialects. Some ideas used:
- flags in the scanner decide whether dialect-specific tokens are
    recognized
- superset parsing: several constructs are generalized so
    that a single construct represents things that otherwise
    would lead to conflicts. A later semantic stage looks at
    the dialect flags and prunes things not allowed in the given
    dialect. An example of a superset construction is the rule
    'call_or_cast'; it handles several syntactically similar
    constructs that are usually given by separate syntax
    rules. The semantic rules use types, in addition to the
    dialect flags, to decide the meaning.
- even after using the two tricks above the grammar still has
    LALR conflicts; they are resolved using the GLR option
    of Bison. All conflicts are resolvable using lookahead,
    and AFAICS some are only resolvable with lookahead.
    Parser lookahead means that the traditional trick of
    passing semantic info back to the scanner does not work
    (parser actions are delayed, so the scanner may be forced
    to produce a token before the semantic info is available).
    Still, it seems that GLR leads to a cleaner parser.
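
To make the first two ideas concrete, here is a minimal sketch in
Python. It is not GNU Pascal code: the dialect names, the keyword
sets, the symbol table and every function name below are hypothetical,
chosen only to show the shape of the technique. The scanner consults a
dialect flag to decide which words are keywords, the parser accepts one
generalized NAME(args) construct, and a later semantic pass uses the
symbol table and the dialect flag to decide whether it is a call or a
cast, or to reject it.

```python
import re
from dataclasses import dataclass

# Hypothetical dialects and keyword sets, for illustration only.
COMMON_KEYWORDS = {"begin", "end", "var"}
DIALECT_KEYWORDS = {
    "classic": set(),
    "extended": {"module"},
}

def scan(text, dialect):
    """Return (kind, lexeme) pairs; keyword recognition depends on the dialect."""
    keywords = COMMON_KEYWORDS | DIALECT_KEYWORDS[dialect]
    tokens = []
    for lexeme in re.findall(r"[A-Za-z_]\w*|\d+|\S", text):
        if lexeme.lower() in keywords:
            tokens.append(("KEYWORD", lexeme.lower()))
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append(("IDENT", lexeme))
        elif lexeme.isdigit():
            tokens.append(("NUMBER", lexeme))
        else:
            tokens.append(("PUNCT", lexeme))
    return tokens

@dataclass
class CallOrCast:
    """Superset node: the parser does not yet know which one it is."""
    name: str
    args: list

def parse_call_or_cast(tokens):
    """Accept NAME '(' args ')' without deciding what it means."""
    (kind, name), rest = tokens[0], tokens[1:]
    if (kind != "IDENT" or rest[:1] != [("PUNCT", "(")]
            or rest[-1:] != [("PUNCT", ")")]):
        raise SyntaxError("expected NAME ( args )")
    # Simplification: arguments are single tokens separated by commas.
    args = [lex for k, lex in rest[1:-1] if (k, lex) != ("PUNCT", ",")]
    return CallOrCast(name, args)

def resolve(node, symbols, dialect):
    """Semantic stage: use types and the dialect flag to pick a meaning."""
    what = symbols.get(node.name)
    if what == "type":
        # Assumption for this sketch: casts are an extension that the
        # 'classic' dialect does not allow, so they are pruned here.
        if dialect == "classic":
            raise SyntaxError(f"cast {node.name}(...) not allowed in {dialect!r}")
        if len(node.args) != 1:
            raise SyntaxError("a cast takes exactly one operand")
        return ("cast", node.name, node.args[0])
    if what == "function":
        return ("call", node.name, node.args)
    raise NameError(node.name)
```

With symbols = {"integer": "type", "f": "function"}, the input
"integer(x)" goes through the same grammar rule in both dialects; only
resolve() distinguishes them, returning a cast node for "extended" and
raising an error for "classic". The grammar itself never needs to know
whether "integer" names a type.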


My impression is that the variation among Pascal dialects is larger
than among C dialects, so the case for a unified parser in C IMHO
is much stronger. OTOH GNU Pascal is a full compiler with
semantic actions invoked from grammar rules. The semantic code
embedded in the parser changed much more than the grammar rules,
so maintaining separate parsers would probably be a
nightmare.


--
                                                            Waldek Hebisch

