Bison 3.7 is released

akim.demaille@gmail.com
Sat, 1 Aug 2020 01:31:09 -0700 (PDT)

          From comp.compilers

From: akim.demaille@gmail.com
Newsgroups: comp.compilers
Date: Sat, 1 Aug 2020 01:31:09 -0700 (PDT)
Organization: Compilers Central
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="99539"; mail-complaints-to="abuse@iecc.com"
Keywords: yacc, available
Posted-Date: 01 Aug 2020 10:21:38 EDT

I am very happy to announce the release of Bison 3.7, whose main novelty,
contributed by Vincent Imbimbo, is the generation of counterexamples for
conflicts. For instance on a grammar featuring the infamous "dangling else"
problem, "bison -Wcounterexamples" now gives:


        $ bison -Wcounterexamples else.y
        else.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
        else.y: warning: shift/reduce conflict on token "else" [-Wcounterexamples]
            Example: "if" exp "then" "if" exp "then" exp • "else" exp
            Shift derivation
                exp
                ↳ "if" exp "then" exp
                                  ↳ "if" exp "then" exp • "else" exp
            Reduce derivation
                exp
                ↳ "if" exp "then" exp                     "else" exp
                                  ↳ "if" exp "then" exp •


which actually proves that the grammar is ambiguous by exhibiting a text
sample with two derivations (corresponding to two parse trees). When Bison
is installed with text styling enabled, the example is actually shown twice,
with colors demonstrating the ambiguity (see
https://www.gnu.org/software/bison/manual/html_node/Counterexamples.html).


Joshua Watt contributed the option "--file-prefix-map OLD=NEW", which helps
make builds reproducible.


There are many other changes (hyperlinks in the diagnostics, reproducible
handling of string aliases with escapes, improvements in the push parsers,
etc.); please see the NEWS below for more details.


Cheers!


              Akim


PS/ The experimental back-end for the D programming language is still
looking for active support from the D community.


==================================================================


Here are the compressed sources:
    https://ftp.gnu.org/gnu/bison/bison-3.7.tar.gz (5.1MB)
    https://ftp.gnu.org/gnu/bison/bison-3.7.tar.lz (3.1MB)
    https://ftp.gnu.org/gnu/bison/bison-3.7.tar.xz (3.1MB)


Here are the GPG detached signatures[*]:
    https://ftp.gnu.org/gnu/bison/bison-3.7.tar.gz.sig
    https://ftp.gnu.org/gnu/bison/bison-3.7.tar.lz.sig
    https://ftp.gnu.org/gnu/bison/bison-3.7.tar.xz.sig


Use a mirror for higher download bandwidth:
    https://www.gnu.org/order/ftp.html


[*] Use a .sig file to verify that the corresponding file (without the
.sig suffix) is intact. First, be sure to download both the .sig file
and the corresponding tarball. Then, run a command like this:


    gpg --verify bison-3.7.tar.gz.sig


If that command fails because you don't have the required public key,
then run this command to import it:


    gpg --keyserver keys.gnupg.net --recv-keys 0DDCAA3278D5264E


and rerun the 'gpg --verify' command.


This release was bootstrapped with the following tools:
    Autoconf 2.69
    Automake 1.16.2
    Flex 2.6.4
    Gettext 0.19.8.1
    Gnulib v0.1-3644-gac34618e8


==================================================================


GNU Bison is a general-purpose parser generator that converts an annotated
context-free grammar into a deterministic LR or generalized LR (GLR) parser
employing LALR(1) parser tables. Bison can also generate IELR(1) or
canonical LR(1) parser tables. Once you are proficient with Bison, you can
use it to develop a wide range of language parsers, from those used in
simple desk calculators to complex programming languages.


Bison is upward compatible with Yacc: all properly-written Yacc grammars
work with Bison with no change. Anyone familiar with Yacc should be able to
use Bison with little trouble. You need to be fluent in C, C++ or Java
programming in order to use Bison.


Bison and the parsers it generates are portable: they do not require any
specific compilers.


GNU Bison's home page is https://gnu.org/software/bison/.


==================================================================


* Noteworthy changes in release 3.7 (2020-07-23) [stable]


** Deprecated features


    The YYPRINT macro, which works only with yacc.c and only for tokens, was
    obsoleted long ago by %printer, introduced in Bison 1.50 (November 2002).
    It is deprecated and its support will be removed eventually.


    In conformance with the recommendations of the Graphviz team, in the next
    version of Bison the option `--graph` will generate a *.gv file by
    default, instead of *.dot. This transition started in Bison 3.4.


** New features


*** Counterexample Generation


    Contributed by Vincent Imbimbo.


    When given `-Wcounterexamples`/`-Wcex`, bison will now output
    counterexamples for conflicts.


**** Unifying Counterexamples


    Unifying counterexamples are strings which can be parsed in two ways due
    to the conflict. For example on a grammar that contains the usual
    "dangling else" ambiguity:


        $ bison else.y
        else.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
        else.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples


        $ bison else.y -Wcex
        else.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
        else.y: warning: shift/reduce conflict on token "else" [-Wcounterexamples]
            Example: "if" exp "then" "if" exp "then" exp • "else" exp
            Shift derivation
                exp
                ↳ "if" exp "then" exp
                                  ↳ "if" exp "then" exp • "else" exp
            Example: "if" exp "then" "if" exp "then" exp • "else" exp
            Reduce derivation
                exp
                ↳ "if" exp "then" exp                     "else" exp
                                  ↳ "if" exp "then" exp •


    When text styling is enabled, colors are used in the examples and the
    derivations to highlight the structure of both analyses. In this case,


        "if" exp "then" [ "if" exp "then" exp • ] "else" exp


    vs.


        "if" exp "then" [ "if" exp "then" exp • "else" exp ]




    The counterexamples are "focused", in two different ways. First, they do
    not clutter the output with all the derivations from the start symbol,
    rather they start on the "conflicted nonterminal". They go straight to the
    point. Second, they don't "expand" nonterminal symbols uselessly.


**** Nonunifying Counterexamples


    In the case of the dangling else, Bison found an example that can be
    parsed in two ways (therefore proving that the grammar is ambiguous).
    When it cannot find such an example, it instead generates two examples
    that are the same up until the dot:


        $ bison foo.y
        foo.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
        foo.y: note: rerun with option '-Wcounterexamples' to generate conflict counterexamples
        foo.y:4.4-7: warning: rule useless in parser due to conflicts [-Wother]
                4 | a: expr
                  |    ^~~~


        $ bison -Wcex foo.y
        foo.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
        foo.y: warning: shift/reduce conflict on token ID [-Wcounterexamples]
            First example: expr • ID ',' ID $end
            Shift derivation
                $accept
                ↳ s                      $end
                  ↳ a                 ID
                    ↳ expr
                      ↳ expr • ID ','
            Second example: expr • ID $end
            Reduce derivation
                $accept
                ↳ s             $end
                  ↳ a        ID
                    ↳ expr •
        foo.y:4.4-7: warning: rule useless in parser due to conflicts [-Wother]
                4 | a: expr
                  |    ^~~~


    In these cases, the parser usually doesn't have enough lookahead to
    differentiate the two given examples.


**** Reports


    Counterexamples are also included in the report when given
    `--report=counterexamples`/`-rcex` (or `--report=all`), with more
    technical details:


        State 7


            1 exp: "if" exp "then" exp • [$end, "then", "else"]
            2    | "if" exp "then" exp • "else" exp


            "else" shift, and go to state 8


            "else" [reduce using rule 1 (exp)]
            $default reduce using rule 1 (exp)


            shift/reduce conflict on token "else":
                    1 exp: "if" exp "then" exp •
                    2 exp: "if" exp "then" exp • "else" exp
                Example: "if" exp "then" "if" exp "then" exp • "else" exp
                Shift derivation
                    exp
                    ↳ "if" exp "then" exp
                                      ↳ "if" exp "then" exp • "else" exp
                Example: "if" exp "then" "if" exp "then" exp • "else" exp
                Reduce derivation
                    exp
                    ↳ "if" exp "then" exp                     "else" exp
                                      ↳ "if" exp "then" exp •


*** File prefix mapping


    Contributed by Joshua Watt.


    Bison learned a new argument, `--file-prefix-map OLD=NEW`. Any file path
    in the output (specifically `#line` directives and `#ifdef` header guards)
    that begins with the prefix OLD will have it replaced with the prefix NEW,
    similar to the `-ffile-prefix-map` in GCC. This option can be used to
    make bison output reproducible.
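

    As an illustration, the sketch below invokes Bison on a grammar given by
    absolute path and maps that absolute prefix to a relative one. The
    directory prefix-demo, the file names and the tiny grammar are all
    hypothetical; only the `--file-prefix-map` option comes from this
    release.

```shell
# Set up a throwaway source tree; all paths and names here are illustrative.
src="$PWD/prefix-demo/src"
mkdir -p "$src"
cat > "$src/parse.y" <<'EOF'
%token NUM
%%
input: NUM { $$ = $1; } ;
EOF

if command -v bison >/dev/null 2>&1; then
  # Pass the grammar by absolute path, but map that absolute prefix to
  # "src/" so the generated file does not record the build directory.
  bison --file-prefix-map="$src/=src/" -o "$src/parse.c" "$src/parse.y" ||
    echo "bison failed; is it older than 3.7?" >&2
  # Show the first few #line directives that were written, if any.
  if [ -f "$src/parse.c" ]; then
    grep -m 3 '^#line' "$src/parse.c" || true
  fi
fi
```

    Running the same command from two different checkout locations and
    comparing the generated parse.c files is a simple way to check that the
    mapping covers every path that matters for a given build.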


** Changes


*** Diagnostics


    When text styling is enabled and the terminal supports it, the warnings
    now include hyperlinks to the documentation.


*** Relocatable installation


    When installed to be relocatable (via `configure --enable-relocatable`),
    bison will now also look for a relocated m4.


*** C++ file names


    The `filename_type` %define variable was renamed `api.filename.type`.
    Instead of


        %define filename_type "symbol"


    write


        %define api.filename.type {symbol}


    (Or let `bison --update` do it for you.)


    It now defaults to `const std::string` instead of `std::string`.


*** Deprecated %define variable names


    The following variables have been renamed for consistency. Backward
    compatibility is ensured, but upgrading is recommended.


        filename_type -> api.filename.type
        package -> api.package


*** Push parsers no longer clear their state when parsing is finished


    Previously, push parsers cleared their state when parsing was finished
    (on success and on failure). This made it impossible to check whether
    there were parse errors, since `yynerrs` was also reset. This can be
    especially troublesome when used in autocompletion, since a parser with
    error recovery would suggest (irrelevant) expected tokens even if there
    had been a failure.


    Now the parser state can be examined when parsing is finished. The parser
    state is reset when starting a new parse.


** Documentation


*** Examples


    The bistromathic demonstrates %param and how to quote sources in the error
    messages:


        > 123 456
        1.5-7: syntax error: expected end of file or + or - or * or / or ^ before number
                1 | 123 456
                  |     ^~~


** Bug fixes


*** Include the generated header (yacc.c)


    Historically, when --defines was used, bison generated a header and pasted
    an exact copy of it into the generated parser implementation file. Since
    Bison 3.4 it is possible to specify that the header should be `#include`d,
    and how. For instance


        %define api.header.include {"parse.h"}


    or


        %define api.header.include {<parser/parse.h>}


    Now api.header.include defaults to `"header-basename"`, as was intended in
    Bison 3.4, where `header-basename` is the basename of the generated
    header. This is disabled when the generated header is `y.tab.h`, to
    comply with Automake's ylwrap.
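

    One way to observe the new default is to generate a parser with a header
    and look for the `#include` in the implementation file. This is a sketch
    under assumptions: the directory header-demo, the file names and the
    one-rule grammar are hypothetical, and the grep pattern is deliberately
    loose about the exact spelling of the directive.

```shell
# Create a throwaway grammar; names are illustrative only.
mkdir -p header-demo
cat > header-demo/parse.y <<'EOF'
%token NUM
%%
input: NUM ;
EOF

if command -v bison >/dev/null 2>&1; then
  (
    cd header-demo &&
    bison --defines=parse.h -o parse.c parse.y &&
    # With Bison 3.7+ the implementation file is expected to include the
    # header rather than embed a copy of it.
    grep -n 'include "parse.h"' parse.c
  ) || echo "no #include of parse.h found (Bison older than 3.7?)" >&2
fi
```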


*** String aliases are faithfully propagated


    Bison used to interpret user strings (i.e., decoding backslash escapes)
    when reading them, and to escape them (i.e., issue non-printable
    characters as backslash escapes, taking the locale into account) when
    outputting them. As a consequence non-ASCII strings (say in UTF-8) ended
    up "ciphered" as sequences of backslash escapes. This happened not only
    in the generated sources (where the compiler will reinterpret them), but
    also in all the generated reports (text, xml, html, dot, etc.). Reports
    were therefore not readable when string aliases were not pure ASCII.
    Worse yet: the output depended on the user's locale.


    Now Bison faithfully treats the string aliases exactly the way the user
    spelled them. This fixes all the aforementioned problems. However, now,
    string aliases semantically equivalent but syntactically different (e.g.,
    "A", "\x41", "\101") are considered to be different.


*** Crash when generating IELR


    An old, well hidden, bug in the generation of IELR parsers was fixed.

