Re: How to implement dynamic typing?



From: George Neuner <>
Newsgroups: comp.compilers
Date: Fri, 23 Apr 2010 03:50:58 -0400
Organization: A noiseless patient Spider
References: 10-04-009 10-04-028 10-04-031 10-04-036 10-04-037 10-04-044 10-04-046
Keywords: types
Posted-Date: 23 Apr 2010 23:20:42 EDT

On Sun, 18 Apr 2010 14:46:37 +0100, "bartc" <> wrote:
>In my experience, most data is one-dimensional, sometimes two-dimensional,
>and rarely three (and, with dynamic typing, quite often a mixture so that
>dimensionality is meaningless)

It depends a lot on the type of applications you write. I used to get
into 4 to 6 dimensions pretty routinely.

>And if the dimensions are not known until runtime, any method will
>need additional data (there was a recent discussion on exactly how
>much extra overhead was imposed by array-of-array vs. 2D-array in
>comp.lang.misc, "how to form type descriptors")
>(For bounds known at compile-time, data of any dimensions could use a
>flat block of memory needing just one descriptor. Runtime bound checks
>may still be needed, but the bounds data will be implicit.)

The issue is how effectively multiple dimensions are handled when
necessary. If you are bounds checking, there can be a quite large
performance difference between a vector-of-vector and a
big-block-of-memory implementation.

In a vector-of-vector implementation the elements can't necessarily be
assumed to be contiguous, so limits for each dimension must be checked
separately and base address computations can't produce an address that
is an illegal index in any given dimension. This can considerably
constrain things like loop optimizations. With a big-block-of-memory
implementation, address computations can alter the apparent
dimensionality of the array as needed while requiring only a single
check to see that an address or index remains within the block.
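To make the contrast concrete, here is a minimal C sketch of the two
access paths for a small 2-D array of doubles. The names, sizes, and
layout are illustrative only, not taken from any particular
implementation:

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 3
#define COLS 4

/* vector-of-vectors: rows may live anywhere, so each index must be
   checked against its own dimension before the pointers are chased */
double vov_get(double *rows[], size_t r, size_t c)
{
    assert(r < ROWS);               /* check dimension 1 */
    assert(c < COLS);               /* check dimension 2 */
    return rows[r][c];
}

/* big block of memory: one contiguous allocation; a single check on
   the flattened offset guarantees the access stays inside the block
   (it does not pin each index to its own dimension -- which is exactly
   what lets address computations reshape the array on the fly) */
double block_get(double block[], size_t r, size_t c)
{
    size_t off = r * COLS + c;      /* flatten the 2-D index */
    assert(off < ROWS * COLS);      /* one check for the whole block */
    return block[off];
}
```

The single flattened check is also what a compiler can most easily hoist
out of a loop, since the legal offsets form one contiguous range.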

Of course, older languages like C, C++, Pascal, Modula-2 & 3, etc.
provide n-dimensional arrays (all except Pascal allow arrays to be
accessed via differently "shaped" aliases; Modula-2 and Modula-3 allow
this to be done safely). Lisp and a few FP languages provide
n-dimensional array objects (vectors are just 1-D arrays), and Lisp
also provides shaped aliases. C# provides (stack restricted) array
objects using a different syntax than for vector objects. There are
some languages (I can't come up with an example just now) that
implement n-dimensional arrays using the big-block model plus
auxiliary pointer vectors corresponding to the dimension bases. There
are more examples I can't think of just now, but you can see that a
lot of language designers consider the big-block-of-memory model to be
important.

>> Jython and JRuby are many times faster than the languages' standard
>> implementations. I understand Ruby 1.9 now has a JIT compiler but
>> JRuby is still much faster.
>I've just tried Jruby. It's about half the speed of my Ruby 1.9.

Could be. I suspect it depends heavily on the application.

>>>but [my interpreter] uses a method for operator dispatching (based
>>>on the types of two operands) that is not easily scaleable.
>> You might look at generic function dispatch mechanism in Lisp.
>I think this is also known as multiple dispatch. I've had a brief look
>at Lisp, but it's more about how it looks from the programmer's point
>of view than the actual details of how they achieve it. I'll try
>looking harder.

There are two basic approaches: table dispatch and discrimination
tree. Table dispatch is the obvious one: create an array of function
pointers with a dimension for each discriminating argument, i.e., one
that has different types in different overloads of the function. An
argument which always has the same type is non-discriminating and does
not have to be included. To call the function you determine the
argument types, index into the table, and call whatever function you
find. To make it easy you can fill the slots for invalid argument
combinations with a pointer to your error function.
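A minimal C sketch of table dispatch for a two-operand '+', with both
operands discriminating. The type tags, value layout, and names are
invented for illustration:

```c
#include <assert.h>

typedef enum { T_INT, T_FLT, T_NTYPES } Tag;

typedef struct {
    Tag tag;
    union { long i; double f; } v;
} Val;

typedef Val (*AddFn)(Val, Val);

/* one concrete routine per overload */
static Val add_ii(Val a, Val b) { return (Val){T_INT, {.i = a.v.i + b.v.i}}; }
static Val add_if(Val a, Val b) { return (Val){T_FLT, {.f = (double)a.v.i + b.v.f}}; }
static Val add_fi(Val a, Val b) { return (Val){T_FLT, {.f = a.v.f + (double)b.v.i}}; }
static Val add_ff(Val a, Val b) { return (Val){T_FLT, {.f = a.v.f + b.v.f}}; }

/* one table dimension per discriminating argument; slots for invalid
   combinations would point at an error function instead */
static AddFn add_table[T_NTYPES][T_NTYPES] = {
    [T_INT][T_INT] = add_ii, [T_INT][T_FLT] = add_if,
    [T_FLT][T_INT] = add_fi, [T_FLT][T_FLT] = add_ff,
};

Val dyn_add(Val a, Val b)
{
    /* determine the argument types, index the table, call the result */
    return add_table[a.tag][b.tag](a, b);
}
```

Dispatch cost here is two loads and an indirect call, independent of
how many overloads exist; the price is a table that grows with the
product of the type counts in each discriminating position.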

Discrimination tree is a bit trickier. The tree is constructed of
nested tests which check each discriminating argument in turn. Again,
non-discriminating arguments can be ignored. You can implement the
tree in code with nested if/else tests or in data using search vectors
linked into the appropriate tree structure. The trick with
discrimination trees is to determine which representation works best
in the given situation (usually a code complexity issue) because the
dispatch code needs to be custom generated by the compiler for each
overloaded set of functions.
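The same invented overload set dispatched through a code-form
discrimination tree might look like this in C; in practice the nest of
tests would be generated by the compiler from the overload set rather
than written by hand:

```c
#include <assert.h>

typedef enum { T_INT, T_FLT } Tag;

typedef struct {
    Tag tag;
    union { long i; double f; } v;
} Val;

/* one concrete routine per overload */
static Val add_ii(Val a, Val b) { return (Val){T_INT, {.i = a.v.i + b.v.i}}; }
static Val add_if(Val a, Val b) { return (Val){T_FLT, {.f = (double)a.v.i + b.v.f}}; }
static Val add_fi(Val a, Val b) { return (Val){T_FLT, {.f = a.v.f + (double)b.v.i}}; }
static Val add_ff(Val a, Val b) { return (Val){T_FLT, {.f = a.v.f + b.v.f}}; }

/* nested tests check each discriminating argument in turn;
   the leaves are the concrete overloads */
Val dyn_add(Val a, Val b)
{
    if (a.tag == T_INT) {
        if (b.tag == T_INT) return add_ii(a, b);
        else                return add_if(a, b);
    } else {
        if (b.tag == T_INT) return add_fi(a, b);
        else                return add_ff(a, b);
    }
}
```

Unlike the table, the tree costs nothing for invalid combinations that
can never arise, and the compiler can order the tests by expected
frequency; but the dispatch code is specific to each overload set.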

There are a fair number of papers on multimethod and generic function
implementation as well as papers on optimizing discrimination
networks, so I imagine it shouldn't be too hard to find something
useful.

