Re: Parsing partial sentences

Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Tue, 11 Apr 2017 19:40:34 +0200

          From comp.compilers

From: Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Newsgroups: comp.compilers
Date: Tue, 11 Apr 2017 19:40:34 +0200
Organization: Compilers Central
References: 17-04-001 17-04-002 17-04-003 17-04-004 17-04-006 17-04-007 17-04-008
Injection-Info: miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="4497"; mail-complaints-to="abuse@iecc.com"
Keywords: C, parse
Posted-Date: 11 Apr 2017 23:10:09 EDT

On 11.04.2017 at 10:31, Hans-Peter Diettrich wrote:


[John]
> Here's a thought that sometimes works: try parsing the #define text,
> if it succeeds store the parsed version, otherwise store the text.


How do you detect "succeeds", and what's the parser output in this case?


My LL parser fails at the first token because it expects a C module,
so no useful output is available.


An LR parser, by contrast, could stop with something like
    goal-->expression-->addition-->(identifier="a", identifier="b").


Hmm, a similar result could be achieved with an LL top-down parser
whose goal is composed of the expected (handled) non-terminals. This
would only require modifying the parser routine for that goal so that it
tries all alternatives in sequence and stops at the first successful
parse (NFA). After an alternative fails to parse, parsing has to be
restarted at the beginning of the text snippet. In LR bottom-up parsers
this part is covered by the NFA-to-DFA conversion.
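
A minimal sketch of such a goal routine, only to illustrate the idea; it
is not taken from my parser. The names (SnippetKind, parse_snippet, the
match_* helpers) are made up for this example, the "grammar" knows just
two alternatives, and literals are decimal only. Each alternative starts
again at the beginning of the snippet and counts as a success only if it
consumes the whole text, which also answers the "how do you detect
succeeds" question for this toy case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical result kinds for a parsed #define body. */
typedef enum { SNIPPET_NONE, SNIPPET_LITERAL, SNIPPET_ADDITION } SnippetKind;

static const char *skip_ws(const char *p) {
    while (isspace((unsigned char)*p)) p++;
    return p;
}

/* identifier: [A-Za-z_][A-Za-z0-9_]*; returns the end position or NULL. */
static const char *match_identifier(const char *p) {
    p = skip_ws(p);
    if (!isalpha((unsigned char)*p) && *p != '_') return NULL;
    while (isalnum((unsigned char)*p) || *p == '_') p++;
    return p;
}

/* literal: decimal digits only, to keep the sketch small. */
static const char *match_literal(const char *p) {
    p = skip_ws(p);
    if (!isdigit((unsigned char)*p)) return NULL;
    while (isdigit((unsigned char)*p)) p++;
    return p;
}

/* addition: identifier '+' identifier */
static const char *match_addition(const char *p) {
    p = match_identifier(p);
    if (p == NULL) return NULL;
    p = skip_ws(p);
    if (*p != '+') return NULL;
    return match_identifier(p + 1);
}

/* An alternative succeeds only if it consumed the whole snippet. */
static int at_end(const char *p) {
    return p != NULL && *skip_ws(p) == '\0';
}

/* Goal routine: try the alternatives in sequence, stop at the first hit. */
SnippetKind parse_snippet(const char *text) {
    assert(text != NULL);
    /* Every alternative restarts at the beginning of the text. */
    if (at_end(match_literal(text)))  return SNIPPET_LITERAL;
    if (at_end(match_addition(text))) return SNIPPET_ADDITION;
    return SNIPPET_NONE;  /* caller keeps the raw macro text instead */
}
```

With this, parse_snippet("a + b") yields SNIPPET_ADDITION, and anything
the alternatives do not cover yields SNIPPET_NONE, so the caller can fall
back to storing the plain text as John suggested.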




> But be prepared for that to fail, e.g.:
>
> #define FOO a + b


If I tried to convert that into a function body, the resulting
function could look like
      void FOO() { a + b; }
or
      int FOO() { return a + b; }
which are not valid C modules, so FOO would be flagged as
not-a-function, and its further occurrences would have to be expanded as usual.


Only if the identifiers a and b have been declared or #defined
beforehand would FOO() become a valid function (as above) or a constant
of a compatible type, such as
      const int FOO = a+b;


Instead
      #define FOO 0x12345
would parse into something like
      goal-->value-->literal="0x12345"
which could be translated into
      const int FOO=0x12345;
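
The literal case could be sketched roughly as follows. This is an
assumption about how such a translation step might look, not code from
an existing tool: literal_macro_to_const is a hypothetical helper, the
macro body is assumed to be already trimmed, and suffixed literals such
as 0x12345U are simply rejected. It relies on strtol with base 0, which
accepts decimal, octal and 0x-prefixed input.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: if body is a single integer literal that fits in
   an int, write "const int NAME=BODY;" into out and return 1.
   Otherwise return 0 and leave the macro to ordinary expansion. */
int literal_macro_to_const(const char *name, const char *body,
                           char *out, size_t out_size) {
    char *end;
    long value;
    size_t needed;

    assert(name != NULL && body != NULL && out != NULL);

    errno = 0;
    value = strtol(body, &end, 0);      /* base 0: 0x..., 0..., decimal */
    if (end == body || errno == ERANGE) return 0;  /* not a number */
    if (*end != '\0') return 0;         /* trailing tokens, e.g. "1 + b" */
    if (value < INT_MIN || value > INT_MAX) return 0;

    /* Fixed text "const int =;" plus name, body and the terminating NUL. */
    needed = strlen("const int =;") + strlen(name) + strlen(body) + 1;
    if (needed > out_size) return 0;

    /* Keep the original spelling of the literal (hex stays hex). */
    snprintf(out, out_size, "const int %s=%s;", name, body);
    return 1;
}
```

Called with the name "FOO" and the body "0x12345" this fills the buffer
with the declaration shown above; for a body like "a + b" it returns 0.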


In this case some manual postprocessing could define an enum
      typedef enum { FOO=0x12345, BAR=0, BAZ=42 } SomeField;
which could then be used to assign a more specific type to struct fields
or subroutine arguments.
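
For illustration, the more specific typing might look like this. The
struct Record and the functions set_field and is_baz are invented for
the example; only the enumerator names come from the snippet above, and
FOO is given the value of the #define.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { FOO = 0x12345, BAR = 0, BAZ = 42 } SomeField;

/* Hypothetical record whose field would otherwise be a plain int. */
struct Record {
    SomeField field;
};

/* Hypothetical subroutines: the parameter type now documents which
   constants are meant to be passed. */
void set_field(struct Record *r, SomeField f) {
    assert(r != NULL);
    r->field = f;
}

int is_baz(SomeField f) {
    return f == BAZ;
}
```

C itself does not stop a caller from passing an arbitrary int here, so
the gain is mostly documentation and better diagnostics from tools that
check enum usage.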


DoDi

