Re: Bounds checking, Optimization techniques and undefined behavior

David Brown <david.brown@hesbynett.no>
Tue, 7 May 2019 11:04:04 +0200

          From comp.compilers

Related articles
[18 earlier articles]
Re: Bounds checking, Optimization techniques and undefined behavior DrDiettrich1@netscape.net (Hans-Peter Diettrich) (2019-05-06)
Re: Bounds checking, Optimization techniques and undefined behavior christopher.f.clark@compiler-resources.com (Christopher F Clark) (2019-05-06)
Re: Bounds checking, Optimization techniques and undefined behavior bc@freeuk.com (Bart) (2019-05-06)
Re: Bounds checking, Optimization techniques and undefined behavior 0xe2.0x9a.0x9b@gmail.com (Jan Ziak) (2019-05-06)
Re: Bounds checking, Optimization techniques and undefined behavior anw@cuboid.co.uk (Andy Walker) (2019-05-06)
Re: Bounds checking, Optimization techniques and undefined behavior david.brown@hesbynett.no (David Brown) (2019-05-06)
Re: Bounds checking, Optimization techniques and undefined behavior david.brown@hesbynett.no (David Brown) (2019-05-07)
Re: Bounds checking, Optimization techniques and undefined behavior david.brown@hesbynett.no (David Brown) (2019-05-07)
Re: Bounds checking, Optimization techniques and undefined behavior david.brown@hesbynett.no (David Brown) (2019-05-07)
Re: Bounds checking, Optimization techniques and undefined behavior david.brown@hesbynett.no (David Brown) (2019-05-07)
Re: Bounds checking, Optimization techniques and undefined behavior nuno.lopes@ist.utl.pt (Nuno Lopes) (2019-05-07)
Re: Bounds checking, Optimization techniques and undefined behavior bc@freeuk.com (Bart) (2019-05-08)
Re: Bounds checking, Optimization techniques and undefined behavior anw@cuboid.co.uk (Andy Walker) (2019-05-08)
[10 later articles]
| List of all articles for this month |
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.compilers
Date: Tue, 7 May 2019 11:04:04 +0200
Organization: A noiseless patient Spider
References: 19-04-021 19-04-023 19-04-037 19-04-039 19-04-042 19-04-044 19-04-047 19-05-004 19-05-006 19-05-016 19-05-020 19-05-024 19-05-025 19-05-028 19-05-029 19-05-034
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="97371"; mail-complaints-to="abuse@iecc.com"
Keywords: C, errors, debug, comment
Posted-Date: 07 May 2019 17:28:33 EDT
Content-Language: en-GB

On 06/05/2019 10:15, Hans-Peter Diettrich wrote:
> Am 05.05.2019 um 20:44 schrieb Hans-Peter Diettrich:
>> Am 05.05.2019 um 12:14 schrieb Bart:
>>
>>> But how do they get there? Take this:
>>>
>>>     int A[10], *p;
>>>     p = &A[3];
>>>
>>> You intend p to refer to the 4-element slice A[3..6], but how does the
>>> language know that? How can it stop code from writing to p[5]?
>>
>> Not pointers are bad, but pointer arithmetic is. It should be allowed
>> only with objects of known bounds.
>>
>> DoDi
>> [In this case the bounds look known to me. -John]
>
> The bounds are known for the array, so that the pointer is guaranteed
> valid by compile time check. But p[x] or p+x can not be guaranteed valid
> without considerable runtime and bounds storage overhead. Also simple
> p++ operation or the like requires an update of the bounds information.


That depends on x, and what other operations are done on the pointers.


For example, given this:


for (int x = 0; x < 7; x++) {
p[x] = 10;
}


the compiler can see that the access is valid. Change the limit to 8,
and the compiler can tell that it is invalid. (gcc can spot this one.)


But of course if x comes from elsewhere, it is going to take very
advanced analysis to figure out anything about its validity.




> That's what I consider pointer arithmetic, not above &A[3].
>


&A[3] is pointer arithmetic too. A more useful distinction might be for
pointer calculations where the compiler knows the addresses or ranges at
compile time, however they might be expressed.




> I learned to love the Pascal indexing instead of pointers, because a
> loop like
>   for i := 1 to 10 do sum := sum + A[i];
> can be optimized safely by the Pascal/Delphi compiler into pointer and
> auto increment, so that no speed penalty exists vs. explicit pointer
> usage. But in a C for loop the index variable can be changed in code, so
> that even above code would execute slower with bounds checks.
>


It will not be slower in C - because the compiler knows that "i" is
never changed in the loop. (I'd like some way to have that enforced in
C, but I don't know of any good method.)


Just for a laugh, try this code:
#include <stdio.h>


int main(void) {
                int i;
                int j;
                int *pi = &i;
                int *pj = &j;
                int sum = 0;


                printf("pi = %p, pj[-1] = %p\n", (void*) pi, (void*) &pj[-1]);


                for (i = 0; i < 11; i++) {
                                if (i == 3) {
                                                pj[-1] = 5;
                                }
                                sum += i;
                }
                printf("Sum is %i\n", sum);
}


Depending on the compiler, you might need to swap "int i;" and "int j;".
  The first printf shows that "pi" and "pj[-1]" point to the same
address, that of "i". With no optimisation and a literal translation of
the code, the result will be 48. With optimisation, the compiler knows
that access through "pj" cannot possibly affect "i" without invoking
undefined behaviour - so it can simplify the loop to "sum = 55;" and
shows that result.


[For the loop with the unchangable index, you can declare the index
const and cast the places you change it, but ugh.


                const int i;


                for(*(int*)&i = 0; i < 10; (*(int*)&i)++) {
                                a[i] = i;
                                i++; /* will be diagnosed as an error */
                }
-John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.