From: | Kaz Kylheku <686-678-9105@kylheku.com> |
Newsgroups: | comp.compilers |
Date: | Thu, 27 Apr 2017 19:08:09 +0000 (UTC) |
Organization: | Aioe.org NNTP Server |
References: | 17-04-001 17-04-023 |
Injection-Info: | miucha.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="66276"; mail-complaints-to="abuse@iecc.com" |
Keywords: | parse |
Posted-Date: | 27 Apr 2017 21:15:16 EDT |
On 2017-04-27, Walter Banks <walter@bytecraft.com> wrote:
> On 2017-04-03 3:57 AM, Hans-Peter Diettrich wrote:
>> Is there an easy way to parse e.g. C #defines into constants,
>> functions or other non-terminals, which are not the goal of the
>> entire grammar?
>
> In a word NO. #defines are always strings even when they look like
> constants, something I have found out the hard way. There have only been
> two ways that I have successfully dealt with #defines: a preprocessor
> pass or later and much faster pipeline the processing of C source and
> add the defined definition processing into part of the source fetch
> handling.
If we allow Pascal to be extended with a macro preprocessor,
I believe I could design a system for translating C to Pascal which
handles some macros, translating them to Pascal macros. Even some
macros that "break" syntactic boundaries, such as "list_for_each (var,
list) { block }".
I don't believe such a project has any value beyond getting
a pat on the back from another developer; I wouldn't spend any
time on such a thing. The end result might well be rejected by some
Pascal users, due to requiring the extended dialect, whether on
ideological grounds, or on practical issues with tooling (being able to
get the preprocessor running in a given Pascal development environment).
Here is a very high level sketch of the approach:
- We preprocess the C translation unit fully before parsing it.
- However, we use our own specialized C preprocessor which is
tightly integrated into our translator.
- Our specialized C preprocessor carefully tracks, in detail,
the origin of every piece of syntax, to the macro which
substituted that syntax, either as an argument or as body
material.
- As our parser is analyzing the code, it preserves this information
in the abstract syntax tree: every tree fragment, if it
was the result of a macro expansion, is tracked to the
macro call. We also know that an entire node was the result
of a macro call, and what that macro call looked like.
- When we output the Pascal translation, if a tree node was the
result of a macro call (a macro call that we were successfully
able to treat with our magic algorithms) then rather than outputting
the Pascal translation of that tree node, we output the macro
call syntax (remembering that our Pascal dialect has a preprocessor
to handle that).
- We have a magic algorithm for reconstructing Pascal versions
of macro bodies which works roughly like this:
    - When we translate a tree node from C to Pascal, if that tree node
      came from macrology, we keep track of which Pascal fragments
      correspond to C fragments.
    - We then reverse the macrology: we analyze the Pascal and see which
      fragments correspond to C material that came from a macro body,
      and which came from substitution of macro arguments and such.
    - From this we can reconstruct a Pascal macro body, using the
      corresponding Pascal fragments.
        - A Pascal fragment which corresponds to the insertion
          of some argument X is just represented by the same X in the
          Pascal macro body.
        - A Pascal fragment which corresponds to the insertion of
          some body template material Y is replaced in the Pascal
          version of the macro by the corresponding Pascal piece.
- Since a macro call often occurs in more than one place, we can
somehow combine information from multiple sites to improve the
translation, or at least validate that the same thing is
happening.
- The saving grace here, what makes this even contemplatable, is
that C macros are dumb positional substitution: just pasting
together of fragments. If C macros were like Lisp macros, doing
arbitrary Turing computation, good luck with this approach, right?
- I think, best forget C99 variadic macros and such cruft.
- The system could have a mode whereby it gives up trying to translate
a C macro to Pascal, but it at least reverses the syntax so that
the overall Pascal fragment corresponding to the C code which came
from that macro call is reversed back to a macro call. A diagnostic
can be generated "missing macro needed", and the users themselves
can use their human intelligence to finish the job (if possible).
Find the C version of the macro and try to translate it.
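To make the provenance idea concrete, here is a minimal toy sketch in Python. It works on flat token lists rather than on a real syntax tree, and everything in it is an assumption made up for illustration: the IS_ODD macro, the Tok record, the two-entry C_TO_PASCAL table, and the function names. It only shows the bookkeeping: each expanded token remembers whether it came from the macro body or from an argument, and that record is enough to rebuild a Pascal macro body after translation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tok:
    text: str
    macro: Optional[str] = None  # macro whose expansion produced this token
    param: Optional[str] = None  # parameter name, if it came from an argument
    site: Optional[int] = None   # position in the macro body where it was pasted


# Hypothetical macro table: name -> (parameter names, body tokens).
# Stands in for: #define IS_ODD(n) ((n) % 2 != 0)
MACROS = {"IS_ODD": (["n"], ["(", "(", "n", ")", "%", "2", "!=", "0", ")"])}

# Toy token-level translation table; a real translator would work on trees.
C_TO_PASCAL = {"%": "mod", "!=": "<>"}


def expand(name, args):
    """Expand one macro call, tagging every output token with its origin."""
    params, body = MACROS[name]
    out = []
    for site, t in enumerate(body):
        if t in params:
            # Argument material: remember which parameter and which site.
            out.extend(Tok(a, name, t, site) for a in args[t])
        else:
            # Body template material.
            out.append(Tok(t, name))
    return out


def translate(toks):
    """Pair each Pascal fragment with the C token it was translated from."""
    return [(C_TO_PASCAL.get(t.text, t.text), t) for t in toks]


def reconstruct_body(pairs):
    """Reverse the macrology: arguments collapse back to parameter names,
    body material is kept in its translated Pascal form."""
    body, last_site = [], None
    for text, tok in pairs:
        if tok.param is None:
            body.append(text)
            last_site = None
        elif tok.site != last_site:
            body.append(tok.param)
            last_site = tok.site
    return body


toks = expand("IS_ODD", {"n": ["a", "+", "1"]})
print(" ".join(reconstruct_body(translate(toks))))
```

An empty argument leaves no tokens behind in this toy, so its parameter would vanish from the reconstructed body; a real implementation would have to record the substitution site even when nothing was pasted there.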