From: Bart <bc@freeuk.com>
Newsgroups: comp.compilers
Date: Mon, 6 May 2019 13:07:30 +0100
Organization: virginmedia.com
References: 19-04-021 19-04-023 19-04-037 19-04-039 19-04-042 19-04-044 19-04-047 19-05-004 19-05-006 19-05-016 19-05-020 19-05-024 19-05-025 19-05-028 19-05-031
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="35679"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, types, C
Posted-Date: 06 May 2019 10:44:31 EDT
In-Reply-To: 19-05-031
Content-Language: en-GB
On 05/05/2019 22:38, George Neuner wrote:
> On Sun, 5 May 2019 11:14:51 +0100, Bart <bc@freeuk.com> wrote:
>> You intend p to refer to the 4-element slice A[3..6], but how does the
>> language know that? How can it stop code from writing to p[5]?
>
> You declare 'p' as int (*p)[4] and then the compiler could check the
> use. Theoretically at least, I'm not sure it actually is done in many
> situations.
I declare pointers to arrays as T(*)[] when generating C code. But
you're right in that no one else does that when writing C.
Note that this is an open bound; usually the bound will be dynamic, and
held in a separate variable, which the language does not know is the bound.
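
To make the two declarations concrete, here is a rough sketch. The function names `sum_fixed` and `sum_open` are invented for illustration and are not from the thread. The first takes the fixed-bound `int (*)[4]` form, the second the open-bound `int (*)[]` form with the length passed alongside.

```c
#include <stddef.h>

/* Fixed bound: the type of p carries the length 4, so a compiler
   could in principle diagnose an access such as (*p)[5]. */
int sum_fixed(int (*p)[4]) {
    int total = 0;
    for (size_t i = 0; i < sizeof *p / sizeof (*p)[0]; i++)
        total += (*p)[i];
    return total;
}

/* Open bound: int (*)[] says nothing about the length. The bound n
   is just another integer as far as the language is concerned. */
int sum_open(int (*p)[], size_t n) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += (*p)[i];
    return total;
}
```

Given `int A[10]`, the slice A[3..6] could be passed as `sum_fixed((int (*)[4])&A[3])` or `sum_open((int (*)[])&A[3], 4)`. In both calls the cast is written by the programmer, so nothing prevents a wrong bound being supplied.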
C has something called VLAs, which is really a type where any bounds are
defined as a runtime expression. If you had a loop which extracted
different slices on each iteration, you would be obliged to declare 'p'
within the loop, so it has a slightly different type (with different
bounds) each time around.
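
A minimal sketch of what that might look like, assuming a compiler that supports VLAs (they are optional from C11 onwards). The function name `sum_growing_slices` and its parameters are made up for this example.

```c
#include <stddef.h>

/* Sums the slices A[start .. start+len-1] for len = 1 .. max_len. */
long sum_growing_slices(int *A, size_t start, size_t max_len) {
    long total = 0;
    for (size_t len = 1; len <= max_len; len++) {
        /* p must be declared inside the loop: its type is
           "pointer to int[len]", which differs on each iteration. */
        int (*p)[len] = (int (*)[len])&A[start];

        /* For a VLA type, sizeof is evaluated at run time. */
        size_t count = sizeof *p / sizeof (*p)[0];

        for (size_t i = 0; i < count; i++)
            total += (*p)[i];
    }
    return total;
}
```

The bound now lives in the type of `p`, but only within the block where `p` is declared, and the compiler is still not required to check any index against it.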
But this is very restrictive (for example I don't like using local block
scopes). It is also a rather heavyweight feature just to allow the
possibility of bounds checking.
(Also something I haven't implemented in my own C compiler; I just don't
know how to approach it. And I don't like the feature.)
Proper slicing (since we are not restricted to C or other existing
languages) is simpler and better.
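
As an illustration only, a slice can be emulated in C as a pointer plus a length that travel together. This is one possible shape, not a description of any particular language's slices, and `IntSlice`, `slice_of` and `slice_set` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

/* A slice value: the bound is part of the value itself, not a
   separate variable the language knows nothing about. */
typedef struct {
    int   *data;
    size_t len;
} IntSlice;

/* Builds a slice covering base[lo..hi] inclusive (assumes lo <= hi). */
IntSlice slice_of(int *base, size_t lo, size_t hi) {
    IntSlice s = { base + lo, hi - lo + 1 };
    return s;
}

/* Checked store: refuses to write outside the slice. */
bool slice_set(IntSlice s, size_t i, int value) {
    if (i >= s.len)
        return false;
    s.data[i] = value;
    return true;
}
```

With `IntSlice p = slice_of(A, 3, 6);`, a call such as `slice_set(p, 5, 99)` is rejected because the slice holds only four elements. A language with built-in slices could apply the same check to an ordinary `p[5]`.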
>> struct {int a,b,c,d;} S;
>>
>> p = &S.a;
>>
>> You intend p to be used to access a,b,c,d as an int[4] array, but p's
>> bounds will say it's only one element long.
>
> The larger problem is that C even permits that.
I was half-expecting someone to say it was undefined behaviour. I
suppose you will say the way to declare that pointer is as:
int (*p)[4] = (int(*)[4])&S.a;
The problem is that if you want to make C a safer, checked language,
none of this stops people writing it the wrong way.
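
For reference, a sketch of the cast in use. The struct tag `S4` and both function names are invented here. Whether indexing past `a` through such a pointer is strictly conforming is debatable. The code assumes the four members are laid out contiguously, which the first function checks, and that the compiler tolerates the access, as mainstream compilers generally do.

```c
#include <stddef.h>

struct S4 { int a, b, c, d; };

/* Non-zero if a, b, c, d are laid out exactly like an int[4]. */
int s4_is_contiguous(void) {
    return offsetof(struct S4, b) == 1 * sizeof(int)
        && offsetof(struct S4, c) == 2 * sizeof(int)
        && offsetof(struct S4, d) == 3 * sizeof(int)
        && sizeof(struct S4) == 4 * sizeof(int);
}

/* Views the members as a 4-element array via the cast above. */
int sum_as_array(struct S4 *s) {
    int (*p)[4] = (int (*)[4])&s->a;
    int total = 0;
    for (size_t i = 0; i < 4; i++)
        total += (*p)[i];
    return total;
}
```

The bound 4 appears only because the programmer wrote it in the cast. Writing `int *p = &s->a;` instead compiles just as readily and carries no bound at all.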
> If you want the
> struct elements also to be available as an array, you should have used
> a union.
Maybe the struct is defined elsewhere and is not your code to change. Or
maybe the struct is {int a,b,c[20];}, and you want to treat a, b, c[0],
c[1] as an array.
The fact is that this is a low level language. You need to be able to do
stuff like this.
> C has a lot of warts, no question ... but its biggest problem is that
> the routine (ab)use of pointers in, so-called, "idiomatic" C in a real
> sense is working against the compiler - making its job much harder.
So hard that I wouldn't even attempt it. Creating a more restrictive,
safer (or easier to check) language would be easier (IMO).