Re: Data structures for efficient overload resolution

Barry Kelly <barry.j.kelly@gmail.com>
Mon, 17 Mar 2008 14:44:26 +0000

From comp.compilers

Related articles
Data structures for efficient overload resolution stephenhorne100@aol.com (Steve Horne) (2008-03-11)
Re: Data structures for efficient overload resolution barry.j.kelly@gmail.com (Barry Kelly) (2008-03-14)
Re: Data structures for efficient overload resolution stephenhorne100@aol.com (Steve Horne) (2008-03-15)
Re: Data structures for efficient overload resolution barry.j.kelly@gmail.com (Barry Kelly) (2008-03-17)
Re: Data structures for efficient overload resolution cdodd@acm.org (Chris Dodd) (2008-03-17)
Re: Data structures for efficient overload resolution stephenhorne100@aol.com (Steve Horne) (2008-03-18)
Re: Data structures for efficient overload resolution stephenhorne100@aol.com (Steve Horne) (2008-03-18)
Re: Data structures for efficient overload resolution gneuner2@comcast.net (George Neuner) (2008-03-18)
Re: Data structures for efficient overload resolution stephenhorne100@aol.com (Steve Horne) (2008-03-22)

From: Barry Kelly <barry.j.kelly@gmail.com>
Newsgroups: comp.compilers
Date: Mon, 17 Mar 2008 14:44:26 +0000
Organization: Compilers Central
References: 08-03-049 08-03-057 08-03-063
Keywords: types
Posted-Date: 18 Mar 2008 00:02:50 EDT

Steve Horne wrote:


> On Mar 14, 2:42 pm, Barry Kelly <barry.j.ke...@gmail.com> wrote:


> > If the overload type ordering is sane, then this process should be
> > linear in the number of candidates (and arguments). By sane, I mean that
> > any candidate which compares greater than any previous candidate should
> > also compare greater than any candidate equal to the previous candidate,
> > preventing a potential quadratic mini-explosion.


> For example, given a hierarchy with parent class p and child
> class c, consider the following sequence of candidates...
>
> 1 (p, c)
> 2 (c, p)
> 3 (c, c)
>
> When resolving for the call (c, c), candidates 1 and 2 will both
> appear valid with neither being more specific than the other.


> Therefore, when assessing candidate 3, it must be compared with both
> of these two earlier candidates. Only then can these candidates be
> eliminated.


I don't see the reason for your "therefore". Did you omit a step?


This is indeed what I meant about the overload type ordering being sane.


> Comparisons must be done both with the call being resolved
> and with those previously identified candidates that have yet to be
> eliminated.


Candidates 1 and 2 compare equal to one another. By sane, I mean that it
is then only necessary to compare candidate 3 against candidate 1: under
my sanity rule (see quote above), whatever holds against 1 must also hold
against 2, so comparing against 2 is redundant.


It may be that your rules don't permit this, though. Can you explain your
resolution requirements in more detail - ideally with a counter-example,
i.e. some case where you *do* have to compare against 2, and which would
therefore make your ordering "insane" in my scheme? :)
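To make the rule concrete, here is a rough Python sketch (my own
illustration, not code from either of our systems) of the linear scan I
have in mind: applicable candidates accumulate in a bucket of mutually
equal bests, and each new candidate is compared against just one
representative of that bucket. The classes P and C stand in for your
parent p and child c.

```python
class P: pass          # parent type
class C(P): pass       # child type

def applicable(cand, args):
    """cand accepts args if each argument type is the parameter type or a subtype."""
    return all(issubclass(a, p) for a, p in zip(args, cand))

def compare(a, b):
    """Partial order on candidates: -1 if a is more specific, 1 if b is,
    0 if they are equal or mutually incomparable ("equal" under a sane ordering)."""
    a_le = all(issubclass(x, y) for x, y in zip(a, b))   # a at least as specific as b
    b_le = all(issubclass(y, x) for x, y in zip(a, b))   # b at least as specific as a
    if a_le and not b_le: return -1
    if b_le and not a_le: return 1
    return 0

def resolve(candidates, args):
    best = []                          # bucket of currently-equal best candidates
    for cand in candidates:
        if not applicable(cand, args):
            continue
        if not best:
            best = [cand]
            continue
        c = compare(cand, best[0])     # one comparison, not one per bucket member
        if c < 0:
            best = [cand]              # strictly more specific: flush the bucket
        elif c == 0:
            best.append(cand)          # joins the (possibly ambiguous) bucket
    return best                        # more than one entry means ambiguity

cands = [(P, C), (C, P), (C, C)]
print(resolve(cands, (C, C)))          # [(C, C)] - candidate 3 wins
```

Note that when candidate 3 beats candidate 1, the sanity rule is what lets
the scan discard candidate 2 without ever comparing against it.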


> In my utility, the explosion issue is not this quadratic thing, but
> because I'm evaluating every case (every possible call) prior to
> generating a decision table.


Yes, that sounds expensive.


> The number of cases grows exponentially
> with the number of parameters, of course, and polynomially with the
> number of possible run-time types for each parameter. And even that's
> just the detonator - the real bang happens when the rule induction
> tries to make sense of all those cases in order to build the decision
> tree.


Of course, many exponential and high polynomial algorithms work fine
when n is small. Can you give some numbers - number of types, average
compatibility rates (for any random type, what fraction of the total set
of types is it compatible with, on average), numbers of candidates,
numbers of parameters, cost of testing for compatibility?


Basically, given that the exponential explosion in the approach you
describe costs both setup time and space, I'm wondering if a compromise is
possible: a shorter setup time and a less explosive space requirement, at
the cost of doing a little max-finding at the end.


Candidates can be trivially bucketed by number of parameters, so the
number of candidates is only interesting for a given fixed number of
parameters; assume the below holds for each such bucket:


* Let n = number of candidates
* Let m = number of arg types
* Let p = number of parameters


Create a sparse matrix C ("C" for compatible), indexed by (argument type,
parameter index), where each element references a set of candidates sorted
in some cheap total ordering (if intersections are to be computed via
sorting). The maximum size is m*p, but it should be smaller in practice,
because I wouldn't expect a candidate for every possible argument type in
every possible parameter position.


Then, for each argument type t, for each candidate c, for each parameter
index i, add c to the set C[t,i] if c's parameter at index i can take
argument type t.


Obviously, there can be no more than n*m*p elements across all the sets,
and no more than n in any single set, but hopefully far fewer - if there
aren't far fewer, there's little point building this structure.


Then, for overload resolution, you need only process the intersection of
the sets C[t,i] for the actual argument types, or perhaps simply the
smallest set returned for any of the parameter indexes. If the set size is
ever 1, the job is over quite early.
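As a rough sketch of that scheme (using a dict of Python sets rather than
a literal sorted sparse matrix, and the p/c hierarchy from earlier in the
thread as stand-in types - all of this is my illustration, not Steve's
utility), the build and lookup might look like:

```python
class P: pass          # parent type
class C(P): pass       # child type

candidates = [(P, C), (C, P), (C, C)]   # n = 3, p = 2
arg_types = [P, C]                       # m = 2

# Build: compat[(t, i)] = set of candidate indexes whose parameter at
# position i can take an argument of type t.
compat = {}
for t in arg_types:
    for ci, cand in enumerate(candidates):
        for i, param in enumerate(cand):
            if issubclass(t, param):
                compat.setdefault((t, i), set()).add(ci)

def lookup(args):
    """Intersect the per-position sets; the survivors go on to max-finding."""
    sets = [compat.get((t, i), set()) for i, t in enumerate(args)]
    return set.intersection(*sets) if sets else set()

print(lookup((C, C)))   # {0, 1, 2}: all three survive; max-finding decides
print(lookup((P, C)))   # {0}: set size 1, so resolution finishes early
```

The (P, C) call shows the early-out case; the (C, C) call shows the case
where the intersection prunes nothing and the full max-finding still runs.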


Computing the intersection might be more expensive than warranted,
depending on how costly the max-finding overload resolution process is.


Overall, this should work out to O(n*m*p*p) in size (worst case), and,
depending on the nature of the candidates and types, a greatly reduced n
when doing the O(n) max-finding. I'm guessing that the real-world
n*m*p*p should be quite a bit less than m^p, unless I've misunderstood
the nature of your problem - i.e. that n is large in your case.


On the lookup side, it should be O(p + f*n), where f is the fraction of
candidates that survive the preprocessing and reach the final resolution
max-finding.


In other words, I think the above would be madness in a real-world
compiler with e.g. Java-style overloading and typically only a small
handful of candidates that ever need to be considered, but it may make
sense for your case.


-- Barry


--
http://barrkel.blogspot.com/

