Re: Guessing the Data Set to a Function

George Neuner <gneuner2@comcast.net>
Mon, 16 Sep 2013 03:56:50 -0400

          From comp.compilers

Related articles
Guessing the Data Set to a Function seimarao@gmail.com (Seima Rao) (2013-09-15)
Re: Guessing the Data Set to a Function gneuner2@comcast.net (George Neuner) (2013-09-16)
| List of all articles for this month |

From: George Neuner <gneuner2@comcast.net>
Newsgroups: comp.compilers
Date: Mon, 16 Sep 2013 03:56:50 -0400
Organization: A noiseless patient Spider
References: 13-09-005
Keywords: analysis
Posted-Date: 16 Sep 2013 14:52:29 EDT

On Sun, 15 Sep 2013 15:29:01 +0530, Seima Rao <seimarao@gmail.com>
wrote:


> I have a problem with deducing the Data Set to a function written in
> a High Level Language.
>
> I feel that the compiler is well equipped to do so, therefore I am
> posting my questions to this forum.
>
> Given a procedure/function in HLL(say, 'C') as an input, I want the
> compiler to deduce the data set corresponding to the
> parameters to the function.
>
> I see three cases:
>
> a) Shortest Path Executed By the Function
>
> b) Average Path Executed By the Function
>
> c) Longest Path Executed By the Function
>
> Obviously, All three depend upon the compiler
> being able to correctly guess the data set
> corresponding to the function's parameters.
>
> My question is whether this is possible or/and
> specifically, how to tell the Average path.


I'm not certain I understand your question.


In general you can't work backwards from the function to its
parameters because code generally is written in terms of sets: e.g.,
given the code "if a < 10 ... else ...", the branch taken tells you to
which set 'a' belongs - "less than 10" or "10 or greater" - but it
doesn't tell you what was the given value of 'a'. The type of the
parameter may help to narrow the set of possible values it can take:
however even a 32-bit integer already has a range which is problematic
for bounding calculations. Trying to compute bounds on multi-types
such as lists, arrays, structures (records), etc. is simply infeasible
in most cases.




WRT paths, most compilers produce various connectivity graphs. The
most common one is basic block connectivity, but there may be others -
for SSA or value numbering data flow, branch dominance, etc. -
depending on the quality of the compiler. Shortest path usually can
be readily answered by examining these graphs.


If there are no loops, or only statically counted loops, then, as with
shortest path, you can answer longest path from the connectivity
graph. However, with conditional loops, it is, in general,
computationally infeasible to bound the number of executions of a
loop. Occasionally there may be enough clues to make a meaningful
guess, however, the far more common case of knowing only that a loop
has bounds which are integer tells you little on a 64-bit machine.
What about a quadruple nested loop?


I don't know what *you* mean by "average", but I'm guessing it is
something numeric related to short and long. What generally is meant
by the "average path" through a function is the path that is taken
most frequently as averaged over the full range of input. Frequency
of execution does not necessarily correlate with path length, and
where is does the correlation quite often is inverse: i.e. in many
cases it is the longest paths that are taken most frequently.
Statically determining the frequency of a given path is, in general,
computationally infeasible.


Timing execution typically is done with synthetic input guaranteed to
force execution of the shortest and longest paths. The average path
typically is discovered by tracing execution with (what is thought -
or hoped - to be) a representative sample of input.


George


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.