Re: Parsing partial sentences

George Neuner <gneuner2@comcast.net>
Mon, 10 Apr 2017 20:22:22 -0400

          From comp.compilers

Related articles
Parsing partial sentences DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2017-04-03)
Re: Parsing partial sentences pronesto@gmail.com (Fernando) (2017-04-04)
Re: Parsing partial sentences DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2017-04-07)
Re: Parsing partial sentences gneuner2@comcast.net (George Neuner) (2017-04-07)
Re: Parsing partial sentences mail@slkpg.com (mail) (2017-04-07)
Re: Parsing partial sentences DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2017-04-07)
Re: Parsing partial sentences gneuner2@comcast.net (George Neuner) (2017-04-10)
Re: Parsing partial sentences DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2017-04-11)
Re: Parsing partial sentences martin@gkc.org.uk (Martin Ward) (2017-04-11)
Re: Parsing partial sentences DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2017-04-11)
Re: Parsing partial sentences martin@gkc.org.uk (Martin Ward) (2017-04-11)
Re: Parsing partial sentences gneuner2@comcast.net (George Neuner) (2017-04-11)
Re: Parsing partial sentences DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2017-04-12)

Organization: A noiseless patient Spider
References: 17-04-001 17-04-002 17-04-003 17-04-004 17-04-006
Keywords: C, parse
Posted-Date: 10 Apr 2017 20:38:36 EDT

On Fri, 7 Apr 2017 22:24:04 +0200, Hans-Peter Diettrich
<DrDiettrich1@netscape.net> wrote:


>Am 07.04.2017 um 18:59 schrieb George Neuner:
>
>> It might be easiest to just run the C preprocessor as a 1st pass and
>> then attempt to convert the result.
>
>That would obfuscate the source code widely :-(


That's true, but any C code you encounter could itself be
arbitrarily convoluted and hard to understand: e.g., deep pointer
chains, goto spaghetti, Duff's device, setjmp/longjmp, coroutines, etc.


I understand that you want to preserve the original source and relate
it to the translation, but I don't see any real choice other than to
employ the C preprocessor. As John said, a #define body may not even
be a complete expression, never mind a legal one.
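
A minimal sketch of that point, with made-up names: the macros
OPEN_SUM and CLOSE_SUM and the function combined() are hypothetical,
invented only for this illustration. Neither macro body is an
expression on its own, yet the pair expands to a legal one.

```c
/* Hypothetical macros for illustration only: each body is an
 * unbalanced fragment, so neither can be parsed in isolation. */
#define OPEN_SUM  (1 +
#define CLOSE_SUM 2)

/* Only after both macros are expanded does a complete expression exist. */
int combined(void)
{
    return OPEN_SUM CLOSE_SUM * 10;   /* expands to (1 + 2) * 10 */
}
```

A tool that looks at each #define in isolation sees only "(1 +" and
"2)", which is why the meaning is recoverable only at the point of use.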


Normally a parser reports an error on an illegal or incomplete
expression. I think creating a parser that can return useful
information about arbitrary quasi-legal input would be extremely
difficult.




>> Since a #define body is just text, it can be anything - people have
>> created whole DSLs using #define. If you really need to figure them
>> out, I'm afraid you'll need (almost) the whole C language parser to do
>> it.
>
>I already wrote the parser, but it's LL. For code snippets a LR parser
>looks like the only solution?


You could do it in either LL or LR ... LR would be more efficient at
runtime, but I don't think it would make the parser any easier to
create in the first place.


To do what you want you'd need non-terminals not just for every legal
(sub)expression in C, but for every individual keyword, operator and
symbol, and also for any quasi-legal combination of them. The parser
would be enormous.




Building on John's example, consider what you'd do with


# define FOO +
# define BAR + 42
# define BAZ + c /* note 'c' is undefined */
int a, b;


        :
        a = a FOO b BAR BAZ;
        :




And then consider what you'd need to handle sh..stuff like this for
every keyword, operator, variable, etc.
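
For comparison, here is a compilable variant of the fragment above.
It is only a sketch: the wrapper function demo() is hypothetical, and
'c' is assumed to be declared (as a parameter), unlike in the
original. It shows that the statement makes sense only once the
preprocessor has spliced the three bodies together.

```c
#define FOO +
#define BAR + 42
#define BAZ + c   /* 'c' must be in scope wherever BAZ is used */

/* Hypothetical wrapper so the fragment compiles; here 'c' is declared. */
int demo(int a, int b, int c)
{
    a = a FOO b BAR BAZ;   /* after preprocessing: a = a + b + 42 + c; */
    return a;
}
```

With GCC or Clang, the -E option stops after preprocessing, which
lets you inspect the expanded statement directly.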


George

