Re: parsing C function declarations to generate code to serialize the formal arguments

Hans-Peter Diettrich <DrDiettrich1@aol.com>
14 Dec 2006 17:28:46 -0500

          From comp.compilers


From: Hans-Peter Diettrich <DrDiettrich1@aol.com>
Newsgroups: comp.compilers
Date: 14 Dec 2006 17:28:46 -0500
Organization: Compilers Central
References: 06-12-056
Keywords: C, parse
Posted-Date: 14 Dec 2006 17:28:46 EST

Moshe Pfeffer wrote:


> The result:
>
> For each function F(i) in the H file there will be N(F(i)) trees
> generated (one for each formal parameter + return type).


In my C parser I found little need for such trees; an appropriate
function signature is sufficient in most cases, in particular for
de/serialization. IMO the overhead of parsing the signature vs.
traversing trees is negligible, and my signatures are easy to parse
*and* easy to debug (true ASCII strings).
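
To illustrate the idea of a signature kept as a plain ASCII string, here is a minimal sketch. The encoding is invented for this example and is not the format used in my parser: v, c, i and d stand for void, char, int and double, a leading P means "pointer to", and a whole signature is written as return type followed by a parenthesized parameter list. The names describe_signature and decode_type are hypothetical as well.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical encoding, invented for this sketch:
 *   v = void, c = char, i = int, d = double, P<type> = pointer to <type>
 * A signature is "<return>(<param>,<param>,...)", e.g. "i(Pc,d)". */

/* Bounded append of s to the NUL-terminated string in out. */
static void append(char *out, size_t outsz, const char *s)
{
    size_t len = strlen(out);
    if (len < outsz)
        snprintf(out + len, outsz - len, "%s", s);
}

/* Append the C spelling of one encoded type to out.
 * Returns the number of signature chars consumed, 0 on error. */
static size_t decode_type(const char *s, char *out, size_t outsz)
{
    size_t depth = 0, i;
    const char *base;

    while (s[depth] == 'P')
        depth++;
    switch (s[depth]) {
    case 'v': base = "void";   break;
    case 'c': base = "char";   break;
    case 'i': base = "int";    break;
    case 'd': base = "double"; break;
    default:  return 0;
    }
    append(out, outsz, base);
    if (depth > 0)
        append(out, outsz, " ");
    for (i = 0; i < depth; i++)
        append(out, outsz, "*");
    return depth + 1;
}

/* Decode a whole signature into readable text.
 * Returns the number of parameters, or -1 on a malformed signature. */
int describe_signature(const char *sig, char *out, size_t outsz)
{
    int nparams = 0;
    size_t used;

    assert(outsz > 0);
    out[0] = '\0';
    used = decode_type(sig, out, outsz);
    if (used == 0 || sig[used] != '(')
        return -1;
    sig += used + 1;
    append(out, outsz, " (");
    while (*sig != ')') {
        if (nparams > 0) {
            if (*sig != ',')
                return -1;
            sig++;
            append(out, outsz, ", ");
        }
        used = decode_type(sig, out, outsz);
        if (used == 0)
            return -1;
        sig += used;
        nparams++;
    }
    append(out, outsz, ")");
    return sig[1] == '\0' ? nparams : -1;
}
```

A code generator could walk the same string once per function and emit one serialization step per parameter; when something goes wrong, the string can simply be printed and read.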




> Note: there will be no cycles in any of the trees, as none of the
> argument types reference themselves or parts of themselves (as you
> would find, say, in a linked-list struct).


In the type descriptions, pointers to structured types can establish
circular references. Cycles also can occur in the data, when pointers
are involved.


Have you considered how much "local" data is required to store
records with (many) pointers to other data elements? You'll also have to
follow every given pointer and serialize its target, while avoiding
duplicate serialization of the same data structure. Consider a string
and one or more pointers *into* that string - how do you intend to
determine the range of related chars? Even worse with other pointers:
how would you determine the beginning and end of a related array?
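
One possible convention for the string case, sketched below, is to make the extent explicit and to transmit an interior pointer as an offset from the start of the buffer. This is only an assumption about how the problem could be handled; the struct text_ref and the functions text_ref_pack and text_ref_unpack are hypothetical names, and the length has to come from the caller, because it cannot be derived from the pointer alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record: a buffer plus one pointer into it. */
struct text_ref {
    char  *base;    /* start of the buffer */
    size_t len;     /* extent in bytes, supplied by the caller */
    char  *cursor;  /* precondition: points into [base, base + len] */
};

/* Wire layout: len, cursor offset, then len bytes of data.
 * Returns the number of bytes written, 0 on failure. */
size_t text_ref_pack(const struct text_ref *r, unsigned char *out, size_t outsz)
{
    size_t off, need;

    assert(r != NULL && out != NULL);
    off  = (size_t)(r->cursor - r->base);   /* pointer becomes an offset */
    need = 2 * sizeof(size_t) + r->len;
    if (off > r->len || need > outsz)
        return 0;
    memcpy(out, &r->len, sizeof(size_t));
    memcpy(out + sizeof(size_t), &off, sizeof(size_t));
    memcpy(out + 2 * sizeof(size_t), r->base, r->len);
    return need;
}

/* Rebuild the record in caller-provided storage; the cursor is
 * re-derived from the offset. Returns 1 on success, 0 on failure. */
int text_ref_unpack(const unsigned char *in, size_t insz,
                    char *storage, size_t storagesz, struct text_ref *r)
{
    size_t len, off;

    if (insz < 2 * sizeof(size_t))
        return 0;
    memcpy(&len, in, sizeof(size_t));
    memcpy(&off, in + sizeof(size_t), sizeof(size_t));
    if (len > storagesz || off > len || insz - 2 * sizeof(size_t) < len)
        return 0;
    memcpy(storage, in + 2 * sizeof(size_t), len);
    r->base   = storage;
    r->len    = len;
    r->cursor = storage + off;
    return 1;
}
```

Note that a generator working from the declaration "char *cursor" alone has no way to know that cursor belongs to base; that relationship has to be stated somewhere outside the C header.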




> The algorithm the code generator will use to process the formal parameter
> representation goes something like this:
>
> serialize(thing, buffer)
>     case: simple type
>         generate code to add it to the buffer
>     case: pointer
>         generate code to add the pointer to the buffer
>         if the pointer is non-null, recurse:
>             serialize(*thing, buffer)


Here you'll end up in an infinite loop when you try to serialize a
bidirectionally linked list. Serializing e.g. a parse tree will result
in a serialization of almost the whole data section of the program...


IMO you'll have to find a more practical handling of pointers, before
you proceed with your project.
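
As one example of such handling, the sketch below records every node before following its pointers and writes pointers as small integer ids, so a bidirectionally linked list terminates and no node is emitted twice. The node layout, the fixed-size table and the names seen_find, collect and serialize_list are assumptions made for this illustration; a real generator would need a table per pointed-to type and something faster than a linear search.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *prev, *next;
};

#define MAX_NODES 64

/* Objects already recorded; the table index serves as the wire id. */
struct seen {
    const struct node *ptr[MAX_NODES];
    int count;
};

/* Return the id of p, or -1 if p has not been recorded yet. */
static int seen_find(const struct seen *s, const struct node *p)
{
    for (int i = 0; i < s->count; i++)
        if (s->ptr[i] == p)
            return i;
    return -1;
}

/* Record every node reachable from p. A node is entered into the
 * table before its pointers are followed, so cycles terminate.
 * Returns 0 if the table overflows, 1 otherwise. */
static int collect(struct seen *s, const struct node *p)
{
    if (p == NULL || seen_find(s, p) >= 0)
        return 1;
    if (s->count == MAX_NODES)
        return 0;
    s->ptr[s->count++] = p;
    return collect(s, p->next) && collect(s, p->prev);
}

/* Emit three ints per node: value, prev id, next id (-1 for NULL).
 * Returns the number of nodes written, or -1 on overflow. */
int serialize_list(const struct node *start, int *out, size_t outlen)
{
    struct seen s = { {0}, 0 };

    assert(out != NULL);
    if (!collect(&s, start) || (size_t)s.count * 3 > outlen)
        return -1;
    for (int i = 0; i < s.count; i++) {
        const struct node *n = s.ptr[i];
        out[3 * i]     = n->value;
        out[3 * i + 1] = n->prev ? seen_find(&s, n->prev) : -1;
        out[3 * i + 2] = n->next ? seen_find(&s, n->next) : -1;
    }
    return s.count;
}
```

This only addresses termination and duplicates. It does nothing to limit how much reachable data gets pulled in, which is the problem with the parse tree example.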


Also consider unions, possibly containing pointers to different data
types. Which branch would you follow in the serialization of a union?
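
A plain C union carries no information about which member is live, so the serializer can only decide if the program stores a discriminant next to it. The sketch below assumes such a tag exists; the names payload, payload_kind and payload_pack, and the tag bytes 'i' and 't', are hypothetical. For a bare union without a tag there is nothing in the header file that a generator could use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum payload_kind { KIND_INT, KIND_TEXT };

struct payload {
    enum payload_kind kind;   /* discriminant: which member is live */
    union {
        int         number;
        const char *text;     /* NUL-terminated string */
    } u;
};

/* Write one tag byte followed by the live member only.
 * Returns the number of bytes written, 0 on failure. */
size_t payload_pack(const struct payload *p, unsigned char *out, size_t outsz)
{
    size_t n;

    assert(p != NULL && out != NULL);
    switch (p->kind) {
    case KIND_INT:
        n = 1 + sizeof p->u.number;
        if (n > outsz)
            return 0;
        out[0] = 'i';
        memcpy(out + 1, &p->u.number, sizeof p->u.number);
        return n;
    case KIND_TEXT:
        n = 1 + strlen(p->u.text) + 1;   /* tag + chars + NUL */
        if (n > outsz)
            return 0;
        out[0] = 't';
        memcpy(out + 1, p->u.text, n - 1);
        return n;
    }
    return 0;   /* unknown tag: no branch can safely be followed */
}
```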






> [Generally speaking, you can't parse a C header file with anything
> less than a full C parser. If you have particular applications in
> mind, you might be able to cheat and get by with less. -John]


Right, though only declarations and constant definitions have to be
parsed, since function definitions are not expected in header files.
Parsing a fully preprocessed header file is feasible with limited
effort; most of the code goes into the evaluation of expressions and type
specifications.


DoDi

