Re: Best multimethods/multiple dispatch implementations?

George Neuner <gneuner2@comcast.net>
Wed, 01 Oct 2008 04:09:22 -0400

From comp.compilers

Related articles
[12 earlier articles]
Re: Best multimethods/multiple dispatch implementations? gneuner2@comcast.net (George Neuner) (2008-09-24)
Re: Best multimethods/multiple dispatch implementations? armelasselin@hotmail.com (Armel) (2008-09-24)
Re: Best multimethods/multiple dispatch implementations? lerno@dragonascendant.com (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2008-09-25)
Re: Best multimethods/multiple dispatch implementations? lerno@dragonascendant.com (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2008-09-25)
Re: Best multimethods/multiple dispatch implementations? gneuner2@comcast.net (George Neuner) (2008-09-28)
Re: Best multimethods/multiple dispatch implementations? lerno@dragonascendant.com (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2008-09-29)
*Re: Best multimethods/multiple dispatch implementations? gneuner2@comcast.net (George Neuner)* (2008-10-01)**
Re: Best multimethods/multiple dispatch implementations? lerno@dragonascendant.com (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2008-10-04)
Re: Best multimethods/multiple dispatch implementations? gneuner2@comcast.net (George Neuner) (2008-10-05)
Re: Best multimethods/multiple dispatch implementations? rajamukherji@gmail.com (Raja Mukherji) (2008-10-07)

| List of all articles for this month |

From:	George Neuner <gneuner2@comcast.net>
Newsgroups:	comp.compilers
Date:	Wed, 01 Oct 2008 04:09:22 -0400
Organization:	A noiseless patient Spider
References:	08-09-026 08-09-069 08-09-093 08-09-100 08-09-112 08-09-116 08-09-125 08-09-138 08-09-140
Keywords:	OOP, performance
Posted-Date:	03 Oct 2008 08:33:07 EDT

On Mon, 29 Sep 2008 05:46:30 -0700 (PDT), Christoffer Lernv
<lerno@dragonascendant.com> wrote:

>> George Neuner <gneun...@comcast.net> wrote:
>>> On Thu, 25 Sep 2008 04:33:41 -0700 (PDT), <lerno@dragonascendant.com>
wrote:
>... discussing a single dispatch scenario ...
>
>Would it make sense to switch between table and tree dispatch
>depending on dispatch type?

I would seriously consider switching methods not on the type of the
dispatch but rather based on the number of functions involved ...
using discrimination for a small number of functions and switching to
table when that number grows too large.

A table implementation pays one branch penalty for each checked
argument plus a much larger penalty for calling the function
indirectly.

A discrimination tree pays a branch penalty for each decision point -
which in the worst case of MD is #arguments * #functions.

There's a break-even point where the cumulative penalties of the
discrimination tree become larger than the cumulative penalties of the
table. It's hardware dependent, but you can add up the branches
involved easily enough and you can time direct and indirect function
calls to ballpark that figure.

If your implementation language is OO (Java, C#, etc.), you'll also to
make sure all the function overloads are either class functions or
that you have an object of the appropriate type available ... else
you'll incur an additional calling delay to construct the object.

>> That said, even in large frameworks, it's unusual for a method to be
>> overridden by every subclass. Doing so, in fact, defeats the purpose
>> of using super/sub classification in the first place. Generally a
>> large fraction of classes will inherit a large fraction of their
>> functionality. So, in general, a dispatch function is likely to need
>> tests for only a fraction of the classes of the hierarchy.
>
>But even without taking subclassing into account I can see function
>name clashes that will increase search-depth, esp. really simple
>function names like "add" "append"
>
>So we get
> add(LinkedList l, Object o)
> add(Vector l, Object o)
> add(Integer i1, Integer i2)
> add(Float f1, Floar f2)

I don't understand ... what "name clash" are you worried about?

The point of creating a generic function is to take source like the 4
unique overloads of "add" above and hide them behind a _single_
generic "add" dispatch function. All the overloads must somehow be
made private so that only the public dispatch "add" can call them
directly.

A function called "a" will have nothing whatsoever to do with the
function called "add". It will have its own entry point and dispatch
machinery.

>>>> The basic idea is right ... you create a dispatch table and somehow
>>>> index it based on the run time argument types. The problem in Java
>>>> (and in Lisp, see below) is that the tables have to be constructed at
>>>> run time.
>>>
>>> But doesn't discrimination trees mean corresponding difficulties when
>>> compiling the language, since a precompiled generic function will need
>>> to be recompiled when extended, right?
>>
>> No more difficulty than any other dispatch method. If the class
>> hierarchy or the set of overloads changes, a table or hash also has to
>> be reconstructed and the code that uses it updated and recompiled.
>
>So when we build is something like this:
>
> void myGenericFunction(ObjectA a, ObjectB b) { ... }
> void myGenericFunction(ObjectC c, ObjectB b) { ... }
>
>We will create the stub
>
> void myGenericFunction(Object o1, Object o2);
>
>and the specific functions:
>
> void myGenericFunctionAB(ObjectA a, ObjectB b) { ... }
> void myGenericFunctionCB(ObjectC c, ObjectB b) { ... }
>
>Using the generic function will call the stub instead of any generic
>functions:
>
>...
>
>myGenericFunction(x, y); // calling the stub.
>
>...
>
>When linking the program, the stub is expanded to a real function:
>
>void myGenericFunction(Object o1, Object o2)
>{
> if (o1 isa ObjectA && o2 isa Object B)
> {
> myGenericFunctionAB(o1, o2);
> }
> else if (o1 isa ObjectC && o2 isa Object B)
> {
> myGenericFunctionCB(o1, o2);
> }
> // throw some error
>}

Yes. But consider that, in your example, you are doing too much work
because the type of the second argument is not relevant ... in both
cases it is B. You want to do as few type tests as possible, so, in
the general case, it is better to use hierarchical tests. Read up on
discrimination networks for tips on implementing optimal tests.

But also consider how to leverage the implementation language.
For example, suppose you want your Java-like language to have class
based functions like

      LinkedList.add( Object o ),
      Vector.add( Object o ),
      Integer.add( Integer i ),
      Float.add( Float f ),

*and* also a generic syntax "add( o1, o2 )" that just does the right
thing depending on its arguments. If you use Java as the
implementation language, the "generic" function in this case is just:

      o1.add( o2 );

which lets Java do its normal class method dispatch. This approach
can be generalized for additional arguments.

>
>In some languages I'd have to generate an init method that adds all my
>classes:
>
>C c = addClass("Object", definition);
>addMethod(c, "someMethod", function1);
>addMethod(c, "someOtherMetho", function2);
>... and so on ...
>
>Is that what you're referring to?

Not quite. You don't want to be comparing strings at dispatch time or
using reflection (except maybe to initialize the table).

I can't write this in Java off the cuff - I'd need to spend some
quality time with the API manuals - so I hope you can follow the C++
and transliterate. I'm assuming there is a some class Object which is
the root of the user type hierarchy and there is some function - which
I'm calling typeof() - which returns a unique integer ID for each
runtime class. What I showing as static tables below would have to be
dynamic tables constructed at init time.

ex. R f( A a, D d )
          R f( A a, E e )
          R f( B b, D d )
          R f( C c, D d )
          R f( C c, E e )

  where R,A,B,C,D,E are types

becomes

          R f_ad( A a, D d )
          R f_ae( A a, E e )
          R f_bd( B b, D d )
          R f_cd( C c, D d )
          R f_ce( C c, E e )
          R f_err( Object& o1, Object& o2 )

          f( Object& o1, Object& o2 )
          {
          int index1[] = { typeof(A), typeof(B), typeof(C) };
          int index2[] = { typeof(D), typeof(E) };
          R (*func)[3][2] = { f_ad, f_ae,
                                                  f_bd, f_err
                                                  f_cd, f_ce };

          int t1, t2, i1, i2;

                t1 = typeof(o1);
                for ( i1 = 0; i1 < 3; ++i1 )
                      if ( index1[i1] == t1 )
                            break;

                t2 = typeof(o2);
                for ( i2 = 0; i2 < 2; ++i2 )
                      if ( index1[i2] == t2 )
                            break;

                if ( t1 == 3 || t2 == 2 )
                      f_err( o1, o2 );
                else
                      (*func[i1][i2])( o1, o2 );
          }

>But what about supporting dynamic loading of classes? It looks like a
>table/hash would easily support it, but it would require a runtime
>recompilation with a tree, unless that tree is constructed during
>runtime - in which case it seems like a table would be many times
>faster.

Unless Java provides a way for an object to be informed whenever a new
class file is loaded, you'll probably have to accommodate that
yourself through some kind of registration API invoked by the new
class.

And you thought it would be simple 8-)

>>>> FWIW, I have been told off-line that some of the heavyweight Lisps do
>>>> now use table dispatch ... but those particular Lisps all have CLOS
>>>> functionality built into them rather than as a separate library so
>>>> they can implement the functionality however they please.
>>>
>>> Table dispatch for performance reasons or are there other reasons?
>> I think mostly for performance of their own object GUI frameworks.
>> GUI widget frameworks tend to be large and have many overloaded
>> methods. Under such circumstances the cumulative branch penalties of
>> discrimination overwhelm the prefetch penalties of indirect function
>> calls.
>
>That makes sense. So is that a reason for not using generic functions?
>- That GUI frameworks would be slow?

No. The issue was not the generic functions but rather how they were
implemented. Native Lisp does not have objects - CLOS is an object
library written in (mostly portable) Lisp.

GUI frameworks tend to exhibit worst case behavior for discrimination.
The portable discrimination dispatch implementation was not quick
enough to deal with a huge GUI. A portable table implementation
wouldn't be much better because Lisp has no pointers and can't
directly refer to a function. To implement table dispatch most
efficiently, the implementors turned to non-portable, implementation
specific solutions and had to build-in some of CLOS's functions.

George

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Best multimethods/multiple dispatch implementations?

George Neuner <gneuner2@comcast.net>Wed, 01 Oct 2008 04:09:22 -0400

George Neuner <gneuner2@comcast.net>
Wed, 01 Oct 2008 04:09:22 -0400