Re: representing functions with arguments in an abstract syntax tree

Chris Torek <torek@torek.net>
2 Jan 2004 03:39:10 -0500

From comp.compilers

Related articles
representing functions with arguments in an abstract syntax tree melkorainur@yahoo.com (2003-12-27)
Re: representing functions with arguments in an abstract syntax tree torek@torek.net (Chris Torek) (2004-01-02)
Re: representing functions with arguments in an abstract syntax tree malcolm@55bank.freeserve.co.uk (Malcolm) (2004-01-02)
Re: representing functions with arguments in an abstract syntax tree cfc@world.std.com (Chris F Clark) (2004-01-02)
Re: representing functions with arguments in an abstract syntax tree jacob@jacob.remcomp.fr (jacob navia) (2004-01-02)
Re: representing functions with arguments in an abstract syntax tree witness@t-online.de (Uli Kusterer) (2004-01-02)

From:	Chris Torek <torek@torek.net>
Newsgroups:	comp.compilers,comp.lang.c
Date:	2 Jan 2004 03:39:10 -0500
Organization:	None of the Above
References:	03-12-142
Keywords:	code
Posted-Date:	02 Jan 2004 03:39:10 EST

[NB: this article is cross-posted!]

In article <news:03-12-142@comp.compilers>
Melkor Ainur <melkorainur@yahoo.com> writes:
>... To be specific, how do (or is it even possible) I write a
>generic function pointer that can represent all my different
>functions. some that have multiple promotable-arguments
>(chars, ints) and pointers. and then, how do I pass these
>functions their arguments? I currently suspect I can't do
>that within C ...

Not in a generic fashion, no. This is actually a comp.lang.c FAQ
(15.13).

C *does* allow you to store any "pointer to function" value in
any object of type "pointer to function" regardless of the function's
return-value type and argument types. These "extra" types (return
value and arguments) *are* part of the function's "type signature"
(a phrase found more often in C++ than C, due to the usual C++
"name mangling" practice for function overloading), and *do* have
to be present at the actual call, but C guarantees that you can
cast a function-pointer value to some other function-pointer type
and then back to the original type, and that the result will compare
equal to the original pointer. Calling the function through the
intermediate type is undefined, so the second cast, back to the
original type, has to come before the call. For instance:

        extern double modf(double, double *);
        void (*p)(void);
        double x, y, z;
        ...
        p = (void (*)(void))modf; /* legal and well-defined */
        ... code that does not modify p ...
        /* assuming y has been set: */
        x = ((double (*)(double, double *))p)(y, &z); /* calls modf() */
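
The same round trip, packaged as a small compilable sketch. The names
generic_fn, stash_modf() and call_as_modf() are made up for this
illustration and are not from any library; only modf() is real.

```c
#include <math.h>

/* Hypothetical "parking" type: any function pointer may be stored here,
 * but nothing may be called through it while it has this type. */
typedef void (*generic_fn)(void);

/* Cast away modf's real type so it can sit in a generic slot. */
generic_fn stash_modf(void)
{
        return (generic_fn)modf;
}

/* Cast back to the original type first, then call. */
double call_as_modf(generic_fn p, double value, double *int_part)
{
        double (*real)(double, double *) = (double (*)(double, double *))p;

        return real(value, int_part);
}
```

A caller would write something like
call_as_modf(stash_modf(), 3.25, &whole). The important point is that
the code doing the call still has to know the original type; the
generic pointer only carries the address around.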

>and that I might need to generate architecture specific
>assembly to store the arguments on the stack and then call the builtin
>function.

This is one way to do it. Note that some systems (e.g., SPARC and
PowerPC), even when running C code, do not pass most arguments on a
stack at all; on those machines you will need architecture-specific
code to store the arguments in the appropriate integer and/or
floating-point argument registers.

There is another way to handle this, assuming that the set of
functions is fixed at interpreter-build-time (which is probably
true now but may not be part of your ultimate goal). Suppose
you have "interpreter level" functions f, g, and h and the kind
of data structures you described earlier. Then instead of calling
the C functions cf(), cg(), and ch() that implement f, g, and h
(but are, say, void cf(int), double cg(double *), and so on, i.e.,
have different type signatures from each other), have the interpreter's
core loop call functions do_f(), do_g(), and do_h(), which
read something like:

        struct retinfo *do_f(struct arginfo *arginfo) {
                static struct retinfo r = { TY_VOID };

                /* f needs one int */
                if (arginfo->numargs != 1 || arginfo->arglist->type != TY_INT)
                        panic("invalid call to do_f()");
                cf(arginfo->arglist->argunion.un_int); /* actually call cf(), which implements f */
                return &r;
        }
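
For readers without the original article at hand, here is one way the
supporting declarations might look, so that a shim like do_f() compiles
on its own. The structure layouts, the TY_DOUBLE tag, the "next" link,
the body of panic(), and the stand-in cf() (which just records its
argument) are all assumptions made for this sketch; only the member
names used in do_f() come from the code above.

```c
#include <stdio.h>
#include <stdlib.h>

enum valtype { TY_VOID, TY_INT, TY_DOUBLE };

/* One evaluated argument, tagged with its type (assumed layout). */
struct arglist {
        enum valtype type;
        union { int un_int; double un_double; } argunion;
        struct arglist *next;           /* assumed: arguments form a list */
};

struct arginfo {
        int numargs;
        struct arglist *arglist;
};

struct retinfo {
        enum valtype type;
        union { int un_int; double un_double; } retunion;
};

/* Assumed behavior: report the problem and give up. */
static void panic(const char *msg)
{
        fprintf(stderr, "panic: %s\n", msg);
        abort();
}

/* Stand-in for the real C function behind the interpreter's f. */
int last_seen;
void cf(int n) { last_seen = n; }

/* The shim: check the arguments, unpack them, call with the real type. */
struct retinfo *do_f(struct arginfo *arginfo)
{
        static struct retinfo r = { TY_VOID };

        if (arginfo->numargs != 1 || arginfo->arglist->type != TY_INT)
                panic("invalid call to do_f()");
        cf(arginfo->arglist->argunion.un_int);
        return &r;
}
```

Every do_xxx() shim has the same type, struct retinfo *(struct arginfo *),
so the interpreter's core loop can keep ordinary pointers to them and
never needs a cast at the call site.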

In other words, instead of coming up with a *generic* shim
to fit between "interpreter engine" and "external C functions",
you can resort to a specific set of translation-layer shim
functions. The total number is bounded by the number of
interpreter built-ins. If many of the translation layer
functions all call C functions with a single common type,
you might even want to use one translation-layer shim for
all such functions, passing it an additional parameter giving
the target function, e.g.:

        struct retinfo *do_voidofint(struct funcall_info *info) {
                static struct retinfo r = { TY_VOID };
                struct arginfo *arginfo = info->arginfo;
                struct arglist *al;
                void (*fp)(int);

                /* whatever function we call, it requires one int */
                if (arginfo->numargs != 1 || (al = arginfo->arglist)->type != TY_INT)
                        panic("invalid call to do_voidofint()");

                /* convert info->target_function to the right type */
                fp = (void (*)(int))info->target_function;

                /* and call the function, whatever it is */
                fp(al->argunion.un_int);

                return &r;
        }

In this case, if you have 100 interpreter built-ins that call 100
different C functions that (collectively) have 9 different type
signatures, you only need 9 of these shims.
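
A sketch of how such a table of built-ins might be organized, with one
shim per type signature and several built-ins sharing each shim. It
uses a deliberately simplified argument representation (struct value)
instead of the arginfo structures above, and every name in it
(generic_fn, shim_fn, builtins[], call_builtin(), and the four sample
functions) is hypothetical.

```c
#include <stddef.h>
#include <string.h>

typedef void (*generic_fn)(void);       /* storage type only, never called */

enum valtype { TY_VOID, TY_INT };
struct value { enum valtype type; int un_int; };

/* All shims share this one type; returns 0 on success, -1 on a bad call. */
typedef int (*shim_fn)(generic_fn target, const struct value *args,
                       int nargs, struct value *result);

/* Shim for every built-in whose C function is void (int). */
static int shim_void_of_int(generic_fn target, const struct value *args,
                            int nargs, struct value *result)
{
        if (nargs != 1 || args[0].type != TY_INT)
                return -1;
        ((void (*)(int))target)(args[0].un_int);
        result->type = TY_VOID;
        return 0;
}

/* Shim for every built-in whose C function is int (int). */
static int shim_int_of_int(generic_fn target, const struct value *args,
                           int nargs, struct value *result)
{
        if (nargs != 1 || args[0].type != TY_INT)
                return -1;
        result->type = TY_INT;
        result->un_int = ((int (*)(int))target)(args[0].un_int);
        return 0;
}

/* Four sample C functions, two per signature. */
int counter;
void bump(int n)  { counter += n; }
void reset(int n) { counter = n; }
int twice(int n)  { return 2 * n; }
int negate(int n) { return -n; }

struct builtin { const char *name; shim_fn shim; generic_fn target; };

/* Four built-ins, but only two shims. */
const struct builtin builtins[] = {
        { "bump",   shim_void_of_int, (generic_fn)bump   },
        { "reset",  shim_void_of_int, (generic_fn)reset  },
        { "twice",  shim_int_of_int,  (generic_fn)twice  },
        { "negate", shim_int_of_int,  (generic_fn)negate },
};

/* Look a built-in up by name and call it through its shim. */
int call_builtin(const char *name, const struct value *args, int nargs,
                 struct value *result)
{
        size_t i;

        for (i = 0; i < sizeof builtins / sizeof builtins[0]; i++)
                if (strcmp(builtins[i].name, name) == 0)
                        return builtins[i].shim(builtins[i].target,
                                                args, nargs, result);
        return -1;      /* no such built-in */
}
```

The table must pair each target with the shim for its true signature;
nothing in C checks this, so a mismatched entry is a silent bug. Adding
a built-in with an existing signature costs one table line, while a new
signature costs one new shim.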

>That said, I'd suspect that other people have done or explored this
>type of work.

Steve Summit (the author of the comp.lang.c FAQ) has, for pretty
similar reasons (writing an interpreter).
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
