Re: Parser Reversed

"Matt P. Dziubinski" <matdzb@gmail.com>
Sun, 11 Mar 2018 15:08:54 +0100

          From comp.compilers

From: "Matt P. Dziubinski" <matdzb@gmail.com>
Newsgroups: comp.compilers
Date: Sun, 11 Mar 2018 15:08:54 +0100
Organization: http://www.wit.edu.pl
References: 18-03-038
Keywords: parse, tools, question
Posted-Date: 12 Mar 2018 16:18:26 EDT

On 3/11/2018 08:32, Hans-Peter Diettrich wrote:
> A grammar can be used to *check* for valid sentences of a language, but
> it also can be used to *create* valid sentences. For a pretty printer or
> decompiler test I need a sentence generator for logical expressions. For
> now the language can be restricted to AND, OR, variables and (kind of)
> parentheses. Later on NOT and XOR can be added. RPN is one alternative
> for the "kind of parentheses", eliminating the need for a specific
> operator precedence.
>
> Now I'm looking for possible implementations of such a generator, in
> addition to my own ideas. So far the output can be anything, e.g. source
> code or machine code, or some tree (AST...).
>
> Any ideas or references to such projects?


Hi!


Csmith comes to mind: https://embed.cs.utah.edu/csmith/


Reference: Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. PLDI
2011. "Finding and Understanding Bugs in C Compilers"
Paper: http://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf
LtU post: http://lambda-the-ultimate.org/node/4241


Summary (from the paper): "The shape of a program generated by Csmith is
governed by a grammar for a subset of C. A program is a collection of
type, variable, and function definitions; a function body is a block; a
block contains a list of declarations and a list of statements; and a
statement is an expression, control-flow construct (e.g., `if`,
`return`, `goto`, or `for`), assignment, or block. Assignments are
modeled as statements -- not expressions -- which reflects the most common
idiom for assignments in C code. We leverage our grammar to produce
other idiomatic code as well: in particular, we include a statement kind
that represents a loop iterating over an array. The grammar is
implemented by a collection of hand-coded C++ classes."
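
For the restricted language in your question (AND, OR, variables and parentheses), the same grammar-driven idea fits in a few lines. What follows is only a rough sketch, not how Csmith is implemented: the grammar, the variable names, the depth bound, and the helper names `gen_expr`, `to_infix` and `to_rpn` are all assumptions made for the illustration.

```python
import random

VARIABLES = ["a", "b", "c"]   # hypothetical variable names
OPERATORS = ["AND", "OR"]


def gen_expr(rng, depth):
    """Randomly expand  expr ::= VAR | expr OP expr  into a tree.

    Leaves are strings; inner nodes are (op, left, right) tuples.
    The depth bound guarantees termination.
    """
    if depth <= 0 or rng.random() < 0.3:
        return rng.choice(VARIABLES)
    op = rng.choice(OPERATORS)
    return (op, gen_expr(rng, depth - 1), gen_expr(rng, depth - 1))


def to_infix(tree):
    """Fully parenthesized infix, so no precedence rules are needed."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"({to_infix(left)} {op} {to_infix(right)})"


def to_rpn(tree):
    """Postfix (RPN) rendering of the same tree."""
    if isinstance(tree, str):
        return tree
    op, left, right = tree
    return f"{to_rpn(left)} {to_rpn(right)} {op}"


if __name__ == "__main__":
    rng = random.Random(2018)          # fixed seed for repeatable tests
    tree = gen_expr(rng, depth=3)
    print(to_infix(tree))
    print(to_rpn(tree))
```

Generating a tree first and rendering it afterwards keeps the choice of output (source text, RPN, or the AST itself) separate from the random expansion, which matches the "output can be anything" requirement.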


You may also want to take a look at the following:


* "Effect-Driven QuickChecking of Compilers" (notably, the following
goes substantially further than relying solely on the grammar by
making use of the type system -- more in the paper):


Code (Effect-Driven Compiler Tester): https://github.com/jmid/efftester
Paper: http://janmidtgaard.dk/papers/Midtgaard-al%3AICFP17-full.pdf
Talk: https://podcasts.ox.ac.uk/effect-driven-quickchecking-compilers
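
To give a feel for what "making use of the type system" can mean, here is a toy type-directed generator. It is a sketch under my own assumptions and does not attempt to reproduce the paper's technique (in particular, it ignores effects entirely): the two types, the operator set and the function name `gen` are hypothetical. The generator is asked for an expression of a given type and only picks productions that yield that type, so every output is well-typed by construction.

```python
import random


def gen(rng, ty, depth):
    """Return a random expression string whose type is `ty`.

    `ty` is "bool" or "int"; `depth` bounds the nesting.
    """
    if ty == "bool":
        if depth <= 0:
            return rng.choice(["True", "False"])
        kind = rng.choice(["lit", "not", "and", "or", "less"])
        if kind == "lit":
            return rng.choice(["True", "False"])
        if kind == "not":
            return f"(not {gen(rng, 'bool', depth - 1)})"
        if kind == "less":
            # A comparison consumes two ints and produces a bool.
            return f"({gen(rng, 'int', depth - 1)} < {gen(rng, 'int', depth - 1)})"
        # "and" / "or" consume two bools and produce a bool.
        return f"({gen(rng, 'bool', depth - 1)} {kind} {gen(rng, 'bool', depth - 1)})"
    if ty == "int":
        if depth <= 0 or rng.random() < 0.3:
            return str(rng.randint(0, 9))
        op = rng.choice(["+", "*"])
        return f"({gen(rng, 'int', depth - 1)} {op} {gen(rng, 'int', depth - 1)})"
    raise ValueError(f"unknown type: {ty}")


if __name__ == "__main__":
    rng = random.Random(17)
    print(gen(rng, "bool", 3))
```

A grammar alone would happily produce something like `(3 and (True < 2))`; threading the wanted type through the recursion rules such sentences out before they are generated.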


* "Structure-aware fuzzing for Clang and LLVM with libprotobuf-mutator"
- Kostya Serebryany, Vitaly Buka and Matt Morehouse - 2017 LLVM
Developers' Meeting
https://www.youtube.com/watch?v=U60hC16HEDY
https://llvm.org/devmtg/2017-10/#talk8


See: https://llvm.org/docs/FuzzingLLVM.html
In particular:
https://github.com/llvm-mirror/clang/tree/master/tools/clang-fuzzer


"This directory contains two utilities for fuzzing Clang: clang-fuzzer
and clang-proto-fuzzer. Both use libFuzzer to generate inputs to clang
via coverage-guided mutation.


The two utilities differ, however, in how they structure inputs to
Clang. clang-fuzzer makes no attempt to generate valid C++ programs and
is therefore primarily useful for stressing the surface layers of Clang
(i.e. lexer, parser). clang-proto-fuzzer uses a protobuf class to
describe a subset of the C++ language and then uses libprotobuf-mutator
to mutate instantiations of that class, producing valid C++ programs in
the process. As a result, clang-proto-fuzzer is better at stressing
deeper layers of Clang and LLVM."
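
The difference between byte-level and structure-aware mutation can be shown on the AND/OR language as well. The sketch below does not use libFuzzer or libprotobuf-mutator; it only illustrates the idea of mutating a tree rather than raw text, and the tree encoding plus the names `mutate` and `is_valid` are assumptions of mine. Because each mutation maps a well-formed tree to another well-formed tree, every mutant is still a valid sentence.

```python
import random

VARIABLES = ["a", "b", "c"]   # hypothetical variable names
OPERATORS = ["AND", "OR"]


def mutate(rng, tree):
    """Return a copy of `tree` with one structural change applied.

    Leaves are strings; inner nodes are (op, left, right) tuples.
    """
    if isinstance(tree, str):
        if rng.random() < 0.5:
            return rng.choice(VARIABLES)            # pick a (possibly new) variable
        return (rng.choice(OPERATORS), tree, rng.choice(VARIABLES))  # grow
    op, left, right = tree
    choice = rng.randrange(4)
    if choice == 0:
        return ("OR" if op == "AND" else "AND", left, right)  # flip operator
    if choice == 1:
        return rng.choice([left, right])            # shrink to one child
    if choice == 2:
        return (op, mutate(rng, left), right)       # descend left
    return (op, left, mutate(rng, right))           # descend right


def is_valid(tree):
    """Check that `tree` is a well-formed expression of the toy language."""
    if isinstance(tree, str):
        return tree in VARIABLES
    return (isinstance(tree, tuple) and len(tree) == 3
            and tree[0] in OPERATORS
            and is_valid(tree[1]) and is_valid(tree[2]))


if __name__ == "__main__":
    rng = random.Random(7)
    tree = ("AND", "a", ("OR", "b", "c"))
    for _ in range(5):
        tree = mutate(rng, tree)
        print(tree)
```

Flipping a random character in the rendered text would, by contrast, mostly produce inputs that a parser rejects immediately, which is the limitation of clang-fuzzer that the quoted text describes.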


For further reference, perhaps the following compiler correctness
resources (literature & software) can also be of help:
https://github.com/MattPD/cpplinks/blob/master/compilers.correctness.md


Best,


Matt P. Dziubinski

