Related articles |
---|
Dealing with Multiple Inheritence in an Object-Oriented Language eadengle@watcgl.uwaterloo.ca (Ed Cynwrig Dengler) (1992-09-01) |
Re: Dealing with Multiple Inheritence matthew@ntl02.uk.tele.nokia.fi (1992-09-03) |
Newsgroups: | comp.compilers |
From: | matthew@ntl02.uk.tele.nokia.fi (Matthew Faupel) |
Organization: | Nokia Telecommunications, Cambridge, UK |
Date: | Thu, 3 Sep 1992 00:12:42 GMT |
Keywords: | OOP, design |
References: | 92-09-015 |
Ed Cynwrig Dengler writes:
> I am trying to find references to articles that deal with the problems of
> dealing with and representing multiple inheritence in a compiled program.
> I am also interested in any articles that deal with how large typical
> object-oriented classes get (ie. how many features/routines does a typical
> class define, how many classes are typically inherited from, how deep the
> inheritence tree gets, how often a feature is renamed).
> The only reference I have currently is "An Object Addressing Mechanism for
> Statically Typed Languages with Multiple Inheritence" by R.C.H. Conner,
> A. Dearle, R. Morrison, and A.L. Brown.
> The reason for this request is that I am doing a project evaluating
> modification of the "map" system presented in the above article, and would
> like to compare the original "map" system and the modified one to any
> other methods out there.
...
> I am trying to find answers to the following questions:
> a) What other schemes are there?
> b) How is multiple inheritence handled (this includes what restrictions
> the scheme forces on multiple inheritence)?
> c) If an object is assigned to a variable that is of an inherited class
> type, how is the object changed to allow correct addressing of the
> fields/features/routines (eg. an object x of class D is assigned
> to variable y of class C, how is y.c handled)?
> d) Approximately how long would it take to do the assignment and/or
> find a given feature in a variable?
> e) How much space does the cheme take up?
Bjarne Stroustrup's paper on the implementation of the C++ multiple
inheritance scheme can be found in:
UNIX System V, AT&T C++ Language System Release 2.0
Selected Readings. Select Code 307-144
though I don't have the original paper identification.
In addition, when I first came across Eiffel (as described in Bernard
Meyer's "Object Oriented Software Construction") I was intrigued by the
claim that multiple inheritance could be implemented with a fixed time
being taken for looking up methods and attributes of an object. As a
result I had a go at devising schemes for implementing the Eiffel
inheritance system for which this was true.
The following two schemes are what I came up with; I'm afraid I haven't
read the reference that you give so I don't know whether either of them is
the "map" system you mention or whether they are a duplication of already
well known work, but here they are anyway. If anyone has any comments,
they'd be appreciated.
I considered using the class scheme you gave in your message as the example,
however there are three details it doesn't give:
o whether the attributes are data attributes or methods,
o if the attributes are methods, which (if any) are redefined, and
o which is the main line of descent (i.e. if both B and C redefine a
feature of A, D inherits both, and then a D is assigned to a reference
of type A, which of B or C's attribute would be used?)
So the example below (given in (possibly slightly incorrect) Eiffel) gives
all of these; it also illustrates the Eiffel (2.x) trick of merging back
together features that arrive by two different paths from a root class,
but still under the same name; in this example, A.b and A.g are never
renamed so D inherits just one version of each; A.a and A.f are renamed
and so D inherits one copy from B and another from C.
class A
feature
a: INTEGER;
b: INTEGER;
f: INTEGER is do Result := 1 end;
g: INTEGER is do Result := 2 end
end;
class B
inherit A rename a as Ba, f as Bf redefine Bf
feature
Bf: INTEGER is do Result := 3 end
end;
class C
inherit A rename a as Ca, f as Cf redefine Cf
feature
Cf: INTEGER is do Result := 4 end
end;
class D
inherit B, C
-- D now effectively has:
-- Ba: INTEGER;
-- Ca: INTEGER;
-- b: INTEGER;
-- Bf: INTEGER is ... end;
-- Cf: INTEGER is ... end;
-- g: INTEGER is ... end
-- In Eiffel 2.x there is no way of indicating which parent takes priority,
-- but we'll assume for this example that B does i.e.
-- anA: A; !D!anA; anA.f;
-- will end up calling Bf.
end;
Both schemes follow the same basic pattern: for each class there is a
class descriptor set up by the compiler which contains pointers to the
methods for that class and offsets to the data attributes from the start
of any object of that class. Objects of that class contain the data
attributes of the object plus a pointer to the class descriptor.
This works fine for single inheritance however for multiple inheritance,
something a bit more tricksy is required. What the class descriptor
becomes is a set of the above type of descriptor; one for each route that
can be found from a common ancestor to the current class (sort of). To
give a more concrete example, if A is the descriptor for class A; B the
extra information needed over and above A to describe B and so on, then
the descriptors for A, B, C and D are:
A: A B: AB C: AC D:ABACD
^
The trick is that when an object of type D is being referred to through a
reference of type A or B (main line) the whole descriptor is used; when it
is referred to through a descriptor of type C (or of type A after being
assigned through a sequence such as: aC = aD; anA = aC) the subsidiary
part of the descriptor is used.
This begs the question, how do we know to use the subsidiary part? This
is where the two solutions diverge. In solution a) each reference to an
object is made up of a pointer to the object and an offset into the class
structure; the object itself contains a pointer to the class structure and
this added to the offset provides the correct part to use. In solution b)
the reference is a pointer alone, but each object has as many pointers as
there are subparts to the class and the reference points to the
appropriate pointer in the object.
The following shows the memory setup under the two schemes when an object
of type D is allocated and then referred to through two references, one of
type C and the other of type D. A class header of size 2 is assumed.
a)
Reference Object data Class data Method code
aD: +-------+ 0+-------+ 0+-------+
|Ptr: ^------+----->| ^------------->|Class |
+-------+ | 1+-------+ |Header |
|Off: 2 | | |Ba: ? | 2+-------+ +-------+
+-------+ | 2+-------+ |OBa: 1 | /------>| Bf |
| |b: ? | +-------+ | | Code |
aC: +-------+ | 3+-------+ |Ob: 2 | | +-------+
|Ptr: ^------/ |Ca: ? | +-------+ |
+-------+ +-------+ |Bf: ^----/ +-------+
|Off: 6 | +-------+ /--->| g |<-\
+-------+ |g: ^-------/ | Code | |
6+-------+ +-------+ |
|OCa: 3 | |
+-------+ +-------+ |
|Ob: 2 | /------>| Cf | |
+-------+ | | Code | |
|Cf: ^----/ +-------+ |
+-------+ |
|g: ^-----------------------/
+-------+
The code (in C ignoring casts) for some example actions using this object:
aD.b -> *(aD.Ptr + *(*aD.Ptr+aD.Off+CLASS_D_B_OFF));
aD.g -> *(**aD.Ptr+aD.Off+CLASS_D_G_OFF) (aD.Ptr,aD.Off+D_TO_A_CAST);
aD.Cf -> *(**aD.Ptr+aD.Off+CLASS_D_CF_OFF) (aD.Ptr,aD.OFF+D_TO_C_CAST);
aC := aD -> aC.Ptr = aD.Ptr; aC.Off = aD.Off+D_TO_C_CAST;
aC = aD -> aC.Ptr == aD.Ptr;
The parameter to the function calls is a "self" reference of the type the
method is expecting passed as a pointer and offset.
The various constants used can be determined at compile time and are:
#define CLASS_D_B_OFF 3 // Offset of b's offset in D's class data
#define CLASS_D_G_OFF 5 // Offset of pointer to g in D's class data
#define CLASS_D_CF_OFF 8 // Offset of pointer to Cf in D's class data
#define D_TO_A_CAST 0 // Ref.Off change required to make a D an A
#define D_TO_C_CAST 4 // Ref.Off change required to make a D a C
Advantages (compared to solution b)
Reference comparison is easier.
Assignment is easier.
New object initialisation is quicker.
Object points to class header => easy to determine type of object without
storing extra info.
Potentially less space is taken by each object, however as references
take more and many objects contain references, this advantage is probably
lost.
Disadvantages
References require 2 values => using registers is more tricky => code
is not as efficient.
Extra work is required to resolve attributes (due to adding in the
offset from the reference).
b)
Reference Object data Class data Method code
aD: +-------+ 0+-------+ 0+-------+
| ^-------------->| ^----------\ |Class |<-\
+-------+ 1+-------+ | |Header | |
/---->| ^-------\ | 2+-------+ |
| 2+-------+ | \-->|H: ^----+
| |Ba: ? | | +-------+ |
aC: +-------+ | 3+-------+ | |O: 0 | |
| ^---------/ |b: ? | | +-------+ |
+-------+ 4+-------+ | |OBa: 2 | | +-------+
|Ca: ? | | +-------+ | /--->| Bf |
+-------+ | |Ob: 3 | | | | Code |
| +-------+ | | +-------+
| |Bf: ^-------/
| +-------+ | +-------+
| |g: ^----------->| g |<-\
| 8+-------+ | | Code | |
\----->|H: ^----/ +-------+ |
+-------+ |
|O: 1 | +-------+ |
+-------+ /--->| Cf | |
|OCa: 3 | | | Code | |
+-------+ | +-------+ |
|Ob: 2 | | |
+-------+ | |
|Cf: ^-------/ |
+-------+ |
|g: ^-----------------------/
+-------+
The code (in C ignoring casts) for some example actions using this object:
aD.b -> *(aD + *(*aD+CLASS_D_B_OFF));
aD.g -> *(**aD+CLASS_D_G_OFF) (aD+D_TO_A_CAST);
aD.Cf -> *(**aD+CLASS_D_CF_OFF) (aD+D_TO_C_CAST);
aC := aD -> aC = (aD?aD+D_TO_C_CAST:0)
aC = aD -> aC == (aD?aD+D_TO_C_CAST:0)
The various constants used can be determined at compile time and are:
#define CLASS_D_B_OFF 5 // Offset of b's offset in D's class data
#define CLASS_D_G_OFF 7 // Offset of pointer to g in D's class data
#define CLASS_D_CF_OFF 12 // Offset of pointer to Cf in D's class data
#define D_TO_A_CAST 0 // Pointer change required to make a *D a *A
#define D_TO_C_CAST 1 // Pointer change required to make a *D a *C
Advantages (compared to solution a)
Attribute resolution is easier.
A reference is a single value, which is easier to use efficiently.
Disadvantages
Pointers to the same object can have different values.
The pointer doesn't neccesarily point to the head of the object, so
extra information has to be stored in the class data to allow the
object head to be found (the O field is the offset from the head of
the object).
The pointer(s) to the class data from the object do not point to the
head of the class, hence extra information (the pointer H) is needed
to determine where the class data head is.
Object initialisation time increases as the number of routes to a class
through the inheritance tree increases (though this can be mitigated by
setting up a template object and block copying it).
General advantages of both methods
Minimum space is used to store attributes (i.e. no unnecessary duplication
of attributes when a parent is inherited twice).
Code is shared where appropriate, not duplicated.
The following inheritance features are supported:
o Multiple inheritance (including from the same class more than once)
o Renaming and redefining
o Only those multiply included parent features that need to be
duplicated are duplicated
o Replacement of a method by an attribute is possible (though I'll
leave that as an exercise for the reader)
o The ability to define the main line of descent when inheriting from
the same class more than once (I believe Eiffel 3 supports this on a
per attribute basis; I don't think the above schemes can handle that,
though I can think of a tacky modification to make them do so :-)
Apologies for the long post; hope someone finds this interesting!
Matthew
--
matthew@uk.tele.nokia.fi *---
matthew@ntl02.decnet.nokia.fi *------------------* Matthew Faupel *---
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.