Re: Bounds checking, Optimization techniques and undefined behavior

Bart <bc@freeuk.com>
Mon, 6 May 2019 13:07:30 +0100

From comp.compilers


From:	Bart <bc@freeuk.com>
Newsgroups:	comp.compilers
Date:	Mon, 6 May 2019 13:07:30 +0100
Organization:	virginmedia.com
References:	19-04-021 19-04-023 19-04-037 19-04-039 19-04-042 19-04-044 19-04-047 19-05-004 19-05-006 19-05-016 19-05-020 19-05-024 19-05-025 19-05-028 19-05-031
Injection-Info:	gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="35679"; mail-complaints-to="abuse@iecc.com"
Keywords:	standards, types, C
Posted-Date:	06 May 2019 10:44:31 EDT
In-Reply-To:	19-05-031
Content-Language:	en-GB

On 05/05/2019 22:38, George Neuner wrote:
> On Sun, 5 May 2019 11:14:51 +0100, Bart <bc@freeuk.com> wrote:

>> You intend p to refer to the 4-element slice A[3..6], but how does the
>> language know that? How can it stop code from writing to p[5]?
>
> You declare 'p' as int (*p)[4] and then the compiler could check the
> use. Theoretically at least, I'm not sure it actually is done in many
> situations.

I declare pointers to arrays as T(*)[] when generating C code. But
you're right in that no one else does that when writing C.

Note that this is an open bound; usually the bound will be dynamic, and
held in a separate variable, which the language does not know is the bound.
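To illustrate the difference, here is a minimal sketch in C. The function names sum_fixed4 and sum_open are hypothetical, chosen only for this example. With int (*p)[4] the bound is part of the type, so it can be recovered from p itself; with int (*p)[] the bound travels in a separate parameter that nothing ties to p.

```c
#include <stddef.h>

/* Fixed bound: the 4 is part of p's type, so a compiler could in
   principle check every (*p)[i] against it. */
int sum_fixed4(int (*p)[4])
{
    int total = 0;
    for (size_t i = 0; i < sizeof *p / sizeof (*p)[0]; i++)
        total += (*p)[i];
    return total;
}

/* Open bound: n is just another integer as far as the language is
   concerned; nothing says it is the bound of *p. */
int sum_open(int (*p)[], size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += (*p)[i];
    return total;
}
```

A caller wanting the slice A[3..6] would have to write a cast such as (int (*)[4])&A[3], and nothing verifies that four elements really exist there.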

C has something called VLAs (variable-length arrays), which really
amount to a type whose bounds are defined by a runtime expression. If
you had a loop which extracted different slices on each iteration, you
would be obliged to declare 'p' within the loop, so that it has a
slightly different type (with different bounds) each time around.
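A sketch of what that looks like, assuming a compiler that supports variably modified types (they are optional in C11). The function sum_prefix_slices is a hypothetical example that walks the prefix slices a[0..n-1] for n = 1..len.

```c
#include <stddef.h>

/* Sums every prefix slice of a: a[0..0], a[0..1], ..., a[0..len-1]. */
long sum_prefix_slices(int *a, size_t len)
{
    long total = 0;
    for (size_t n = 1; n <= len; n++) {
        /* p has to be declared here, inside the loop, because its
           type depends on the current value of n. */
        int (*p)[n] = (int (*)[n])a;

        /* sizeof *p is evaluated at run time and yields n * sizeof(int),
           so the bound can be recovered from the type. */
        size_t bound = sizeof *p / sizeof (*p)[0];

        for (size_t i = 0; i < bound; i++)
            total += (*p)[i];
    }
    return total;
}
```

The bound is now attached to p's type, but the cast is still unchecked, and p cannot be declared once outside the loop and re-pointed at slices of different lengths.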

But this is very restrictive (for example I don't like using local block
scopes). It is also a rather heavyweight feature just to allow the
possibility of bounds checking.

(Also something I haven't implemented in my own C compiler; I just don't
know how to approach it. And I don't like the feature.)

Proper slicing (since we are not restricted to C or other existing
languages) is simpler and better.
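As a rough idea of what a slice carries, here is a sketch emulated in C. IntSlice, slice_make and slice_set are hypothetical names for this illustration, and a language with built-in slices would do the same checks behind the indexing syntax. The point is that pointer and length travel together, so a write to p[5] on a 4-element slice can be rejected.

```c
#include <stdbool.h>
#include <stddef.h>

/* A slice is a pointer plus the length it is allowed to cover. */
typedef struct {
    int   *data;
    size_t len;
} IntSlice;

/* Builds the slice base[lo..hi] (inclusive, as in A[3..6]).
   Returns false if the range does not fit inside the base array. */
bool slice_make(IntSlice *out, int *base, size_t base_len,
                size_t lo, size_t hi)
{
    if (lo > hi || hi >= base_len)
        return false;
    out->data = base + lo;
    out->len  = hi - lo + 1;
    return true;
}

/* Checked store: refuses any index outside the slice. */
bool slice_set(IntSlice s, size_t i, int value)
{
    if (i >= s.len)
        return false;
    s.data[i] = value;
    return true;
}
```

With int A[10], slice_make(&p, A, 10, 3, 6) gives a 4-element slice; slice_set(p, 0, 7) stores into A[3], while slice_set(p, 5, 9) is refused and A[8] is left alone.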

>> struct {int a,b,c,d;} S;
>>
>> p = &S.a;
>>
>> You intend p to be used to access a,b,c,d as an int[4] array, but p's
>> bounds will say it's only one element long.
>
> The larger problem is that C even permits that.

I was half-expecting someone to say it was undefined behaviour. I
suppose you will say the way to declare that pointer is as:

int (*p)[4] = (int(*)[4])&S.a;

The problem is that if you want to make C a safer, checked language,
none of this stops people writing it the wrong way.

> If you want the
> struct elements also to be available as an array, you should have used
> a union.

Maybe the struct is defined elsewhere and is not your code to change. Or
maybe the struct is {int a,b,c[20];}, and you want to treat a, b, c[0],
c[1] as an array.

The fact is that this is a low level language. You need to be able to do
stuff like this.
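For the case where the struct is yours to change, the union approach suggested above might be sketched as follows. The names Fields, Quad and quad_sum are hypothetical, and the static assertion records the assumption that the compiler puts no padding between the four int members.

```c
#include <stddef.h>

typedef struct { int a, b, c, d; } Fields;

/* The same storage viewed either as named fields or as int[4]. */
typedef union {
    Fields s;
    int    v[4];
} Quad;

/* Assumption made explicit: the struct is laid out like int[4]. */
_Static_assert(sizeof(Fields) == sizeof(int[4]),
               "assumes no padding between the int members");

/* Here the array view has a bound the compiler knows about. */
int quad_sum(const Quad *q)
{
    int total = 0;
    for (size_t i = 0; i < sizeof q->v / sizeof q->v[0]; i++)
        total += q->v[i];
    return total;
}
```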

> C has a lot of warts, no question ... but its biggest problem is that
> the routine (ab)use of pointers in, so-called, "idiomatic" C in a real
> sense is working against the compiler - making its job much harder.

So hard that I wouldn't even attempt it. Creating a more restrictive,
safer (or easier to check) language would be easier (IMO).
