Mon, 15 Mar 2010 20:08:33 +1100

Related articles |
---|

How to decide on the meet operator in Data-flow Analysis? vivekb1985@gmail.com (Vivek B) (2010-03-14) |

Re: How to decide on the meet operator in Data-flow Analysis? pat@jantar.org (Patryk Zadarnowski) (2010-03-15) |

From: | Patryk Zadarnowski <pat@jantar.org> |

Newsgroups: | comp.compilers |

Date: | Mon, 15 Mar 2010 20:08:33 +1100 |

Organization: | Compilers Central |

References: | 10-03-030 |

Keywords: | analysis |

Posted-Date: | 15 Mar 2010 21:55:24 EDT |

On 15/03/2010, at 4:18 AM, Vivek B wrote:

*> Dear All,*

*>*

*> In an iterative data flow analysis the solution obtained will be less*

*> than or equal to the IDEAL solution*

*>*

*> If I am trying to solve the problem in two different approaches -*

*> using two different meet operators (^)*

*>*

*> 1. where meet operator is U (union)*

*> 2. where the meet operator is (intersection)*

*>*

*> then the meet over path solutions, MOP1 and MOP2, obtained are both*

*> less than or equal to IDEAL. (being conservative)*

*>*

*> MOP1 <= IDEAL*

*>*

*> MOP2 <= IDEAL*

*>*

*> How can I know which among MOP1 and MOP2 gives a superior result?*

*>*

*> Is there a formal way to decide which meet operator has to be used ?*

I'm afraid you're looking at this the wrong way.

The union and intersection operations don't give you a pair of

different approximations to the same problem (your ideal solution),

but, rather, the same solution (namely: the least fixpoint, obtained

by typical iterative data flow algorithms) to a pair of two completely

different problems.

If you look at your data flow problem as a system of simultaneous

(recursive) set equations, then the set intersection operator in your

analysis will deliver the least solution, in the sense that it will

assign to each variable in the system the smallest possible set that

satisfies the constraints of the entire systems. Dually, using the

union operator will deliver the maximal solution, i.e., the largest

sets that satisfy your program's constraints.

Bear in mind that the maximal solution (i.e., the one delivered by the

union operator) often does not exist, so that your data flow analysis

will often (usually?) diverge, unless you perform the analysis over a

finite abstract domain (that is, unless you choose your approximations

out of a small finite set of possible values.)

Another way of thinking about it is this: in most cases, intersection

will give you "must satisfy" information about your data points, while

union will give you "may satisfy" criteria. So that, for example, if

you use your iterative data flow analysis to determine which pointers

address which objects, union will probably tell you which pointer MAY

point to a given object, while intersection will determine which

pointers always point to a given object in memory.

As an example, take sparse conditional constant propagation of Wegman-

Zadeck [1]. Although it is generally described as abstract

interpretation over a lattice-based approximation of run-time values,

you can equally well think of it as an iterative MOP solution to the

data-flow problem, in which the approximate value of each variable in

the program is described as a set of constants that the variable may

contain at any given execution path through the program. Their meet

operation is essentially isomorphic to set intersection, if we take

the initial value assigned to each variable (BOTTOM in

lattice-theoretic terminology) to represent the universal set, and the

ultimate "don't know" (or TOP) value to represent an empty set. The

algorithm begins by assigning the universal set to each variable, then

interesting that with singleton sets containing just a single constant

(arriving at a singleton sets, represented in the middle of

Wegman-Zadeck's lattice.) Finally, when two such singleton sets are

intersected, you'll get the same set back if and only if the sets are

identical to begin with (i.e., if the variable is assigned the same

constant value on both execution paths.) Otherwise, you'll get the

intersection of two non-overlapping singleton, i.e., an empty set, or

TOP. Note that the algorithm gives you "MUST" information.

The algorithm terminates because intersection is a monotonic non-

increasing function (in other words, it never adds new elements to the

initial approximation.) If you keep removing elements, you're

guaranteed to eventually reach either an empty set (TOP) or else a

point when your approximations stop changing.

Related value-range analysis [2] algorithms substitute the union

operator for intersection and, although superficially similar, have

very different set of properties. For one, union is monotonically

non-decreasing, so that there's no guarantees the algorithm will ever

terminate. In practice, this is usually solved by giving up the

analysis when the sets reach a certain size. Further, once you're

done, the result is a set of zero or more values for each variable in

the program, so that the compiler doesn't actually know which one of

these values will be actually assumed during execution. In short,

value-range analysis gives you "MAY have this constant value"

information, while constant propagation gives you "MUST have this

value" data. Very different results and applications, even though the

ONLY difference between the two algorithms is the precise meet

function (set union or set intersection) that was applied during

analysis.

Cheers,

Patryk Zadarnowski

[1] Wegman, Mark N. and Zadeck, Frank Kenneth, "Constant propagation

with conditional branches",

in POPL '85: Proceedings of the 12th ACM SIGACT-SIGPLAN symposium on

Principles of programming languages, pages 291-299, 1985.

[2] Patterson, Jason R. C., "Accurate static branch prediction by

value range propagation",

in PLDI '95: Proceedings of the ACM SIGPLAN 1995 conference on

Programming language design and implementation, pages 67-78, 1995.

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.