Re: Pointers to "why C behaves like that ?"

"jacob navia" <jacob@jacob.remcomp.fr>
26 Nov 2002 21:49:00 -0500

          From comp.compilers

Related articles
[30 earlier articles]
Re: Pointers to "why C behaves like that ?" thp@cs.ucr.edu (2002-11-24)
Re: Pointers to "why C behaves like that ?" thp@cs.ucr.edu (2002-11-24)
Re: Pointers to "why C behaves like that ?" stephen@dino.dnsalias.com (Stephen J. Bevan) (2002-11-24)
Re: Pointers to "why C behaves like that ?" cgweav@aol.com (Clayton Weaver) (2002-11-24)
Re: Pointers to "why C behaves like that ?" joachim_d@gmx.de (Joachim Durchholz) (2002-11-24)
Re: Pointers to "why C behaves like that ?" joachim_d@gmx.de (Joachim Durchholz) (2002-11-24)
Re: Pointers to "why C behaves like that ?" jacob@jacob.remcomp.fr (jacob navia) (2002-11-26)
Re: Pointers to "why C behaves like that ?" jacob@jacob.remcomp.fr (jacob navia) (2002-11-26)
Re: Pointers to "why C behaves like that ?" jacob@jacob.remcomp.fr (jacob navia) (2002-11-26)
Re: Pointers to "why C behaves like that ?" david.thompson1@worldnet.att.net (David Thompson) (2002-11-26)
Re: Pointers to "why C behaves like that ?" ajo@andrew.cmu.edu (Arthur J. O'Dwyer) (2002-11-26)
Re: Pointers to "why C behaves like that ?" thp@cs.ucr.edu (2002-11-26)
Re: Pointers to "why C behaves like that ?" thp@cs.ucr.edu (2002-11-26)
[31 later articles]
| List of all articles for this month |
From: "jacob navia" <jacob@jacob.remcomp.fr>
Newsgroups: comp.compilers
Date: 26 Nov 2002 21:49:00 -0500
Organization: Wanadoo, l'internet avec France Telecom
References: 02-11-095 02-11-103 02-11-128 02-11-150
Keywords: types, design
Posted-Date: 26 Nov 2002 21:49:00 EST

OK Joachim.
Point by point:


"Joachim Durchholz" <joachim_d@gmx.de> wrote in message
news:02-11-150@comp.compilers...
> jacob navia wrote:
> >
> > fn(a)
> > {
> > b = a;
> > }
> >
> > How can you know what "a" is?
>
> "a" is of type ANY, or Object, or Equ, or whatever happens to be the
> most general type that allows comparisons for equality. Type inference
> will not produce a more specific definition of "a" or "b" at this
> point. Actually that's OK - there's no need to actually nail the type
> down at this point.


Of course not, it will be done at run time by the run time assignment
operator. This is a perfectly valid way of programming but it is not
C.


The original question was why C "behaves like that". Well it is those
run time costs that C doesn't want to hear about. That's why the
generated code is fast. All "Object" "ANYs" or whatever, must be
checked at run time!


[snip]


> Hindley-Milner typing is the most common way to set up things so that
> "the most general type that has operations x, y, and z" is a unique
> description of an existing type. (These types need not be declared.
> There's always the sum type, i.e. "type A or type B" - equivalent to a
> typesafe version of C's unions or Pascal's variant records. This
> covers up the most obvious holes, and makes type inference feasible.)


Yes but those unions must be disambiguated at run time. There is no free
lunch.


> > If you examine the whole program (this precludes a separate compilation
> > model like C) you can find two calls to "fn" one with:
> > fn(2)
> > fn(6)
> >
> > Is the type of "b" an integer, a double, an unsigned char or a long
long?
> >
> > All those answers are correct.
>
> There are different answers.
> Some languages say it's Integer, because you'd have to write 2. or 6. to
> get a Float.
> Other languages say it's a Numeric, the common supertype of Integer,
> Float, and any of their higher-precision variants.


Well, it can be anything so you pass it as "Numeric" say, to the
function "fn". That function must expect therefore "numeric"
arguments and to do the operations it needs to do with the data, it
will have to find out what is the precise type of data.


The function cannot be statically constrained to efficiently use the
data since it has to expect "anything" and this means that most
functions will be very weakly typed, their arguments being the union
of all types where they have calls.


This costs at run time in run time checks and dispatching.
And this is what C wants to avoid.


> (No language says it could be char, signed or unsigned; characters
> aren't Integers, multiplying a Character with a Character would be a
> meaningless operation.


My example presented a function call, not a multiplication please. Why
would be unfeasible to pass the character 2 or the character 6 to a
function?




> The conflation of characters and integers in C is
> one of the worst disservices that a programming languge did to the
> programmer communities, IMHO.)
>


You are right in this point. I am not a C "believer" and I see the
language's weakness. But the example shown wasn't any multiplication.
Besides, if you read some discussion groups in languages that lack
signed/unsigned numbers like Eiffel does, you will see how many people
demand that feature!


> > Yes, that can be maybe in principle be done but with an associated
> > enormous cost in compiler complexity and problems.
>
> No. Hindley-Milner typing is simple and efficient. The algorithms is
> just a page of paper (unfortunately, it's too large to fit on the
> margin of this post *g*), but I have it upstairs.


I know, but we will obtain that the type inference mechanisms produce
types that are never (by design) as efficient as a statically built
program with no runtime overhead. I am not saying that that is BAD or
whatever. C has chosen another way, what has disadvantages (you type
more) and advantages (your code runs faster).


> It's about as complex as the most primitive dataflow analysis
> algorithm that one can imagine. This is well within the reach of
> current-day compiler technology.


Of course, but it costs at run time unless the exact type is deduced
in most cases!


[snip]
> So far for complexity.
>
> Problems? There are no problems.


Of course. Run time cost is not a problem then. But this is not the emphasis
in C.
[snip]
>
> > When you support several types of integers and floating point data
> > like in C this kind of inference becomes extremely complex.
>
> No.
> You might need a declaration to nail down the data to a specific type,
> to avoid having a descriptor attached with every piece of numeric data.
> (Which is one reason why most languages with type inference don't have
> so many numeric types. Often enough, it's Integer for arbitrary-length
> integers and Int for machine integers, all integer literals being of
> type Integer. This avoids all the hassles with correctly interpreting
> -32768 on a 16-bit machine and similar borderline syndromes.)
>


But C allows you to specify exactly the type and have many numeric
data types. C99 gives you


integers:
8/16/32/64 bits
floats
32/64/80 or 128/ bits




> > What do you do with arrays?
> >
> > Finding the expression
> > table[i] = b+8.9;
> >
> > You can (maybe) deduce that table contains floating point data but how
big
> > do you dimension it?
>
> You don't dimension it.
> Or you dimension it explicitly. Most programs don't have many tables.


Of course. Then, at run time, you discover the length of the array,
resize it dynamically, etc. And saying "most programs don't have many
tables" is quite... well I do not know; There isn't a single important
program I have written in the last 15 years where I do not have a
table somewhere I am sorry.


> > Yes, of course, you dimension it to have at least "i" elements. And each
> > time "i" becomes greater than the size of "table" you redimension it
copying
> > the data in the new table.
> >
> > This way you generate
> >
> > for (i=0; i< b;i++) {
> > table[i] = i;
> > }
> >
> > If "b" is 10 000 at run time you resize the table 10 000 times. NICE!
>
> Pu-leeease. Take a look at modern data structures. There are ways to
> implement arrays of that kind with O(log N) characteristic for
> resizing, updating, accessing, and iterating arrays.


Well of course, but that is not as efficient as an operation that
takes one or two machine registers and access the data in a few
cycles. C must be efficient so that all the languages you are talking
about can be implemented... in C!




> If you really want constant access time, or if the constant overhead
> with these logN arrays is too much to bear (e.g. in weather
> simulations), you just declare the array. Once. All other arrays take
> their type from that master declaration, by type inference. This is
> still a major win.
>
Well I hope that when I declare an
int array[10000]


all other arrays do NOT take that form unless really is necessary... :-)


> Type inference does not mean that you never declare a type. It means
> that you can leave out 99.9% of all type declarations, without loss of
> functionality or efficiency, and tremendous gain in flexibility.


Yes, it is adapted at other languages than C where run time efficiency is
not important.


jacob


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.