Re: Possible ANSI C Optimisation Done in Practice?

"Rodney M. Bates" <rbates@southwind.net>
15 Dec 2001 00:45:52 -0500

          From comp.compilers

From: "Rodney M. Bates" <rbates@southwind.net>
Newsgroups: comp.compilers
Date: 15 Dec 2001 00:45:52 -0500
Organization: EarthLink Inc. -- http://www.EarthLink.net
References: 01-12-050
Keywords: C, optimize, standards
Posted-Date: 15 Dec 2001 00:45:51 EST

Ralph Corderoy wrote:
>
> Hi,
>
> Given this ANSI C source file
>
> #include <string.h>
>
> void foo(char *s)
> {
> char tmp[10];
> char *t;
> int i;
>
> t = tmp;
> for (i = 0; i < strlen(s); i++) {
> *t++ = s[i + 1];
> }
>
> return;
> }
>
> it seems to a group of us that the compiler could determine that
> `strlen(s)' is invariant within the loop and hence just call strlen()
> once. This is because, AIUI, the object pointed to by s cannot overlap
> with that of tmp,


The following sleazy program makes s overlap tmp:


#include <stdio.h>
#include <string.h>


char * cp;


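void bar(char *s)
    {
        char tmp[10];   /* same locals as foo, presumably so the frames match */
        char *t;
        int i;


        cp = tmp;       /* leak the address of a local that is about to die */
    }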


char * m1 = "equal";
char * m2 = "unequal";


void foo(char *s)
    {
        char tmp[10];
        char *t;
        int i;


        t = tmp;
        if (s == t)
            printf("%s\n",m1);
        else
            printf("%s\n",m2);
    }


int main(void)
    {
        bar(m1);    /* leaves cp pointing at bar's dead tmp */
        foo(cp);    /* undefined: cp is now a dangling pointer */
        return 0;
    }


It does so, at least, when compiled by gcc on x86 Linux. I would
expect this behavior to be portable (if you will excuse this abuse
of the word) across the majority of C implementations.


Not surprisingly, ANSI C says the behavior is undefined when
you do this (6.2.4(2,5)). A compliant compiler is under no obligation
to tell you about it, at compile time or runtime. But it can still
make optimizations which depend on non-overlap of the strings and
claim to be compliant.
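
To make the optimization in question concrete, here is a minimal sketch of
the loop before and after hoisting strlen(). The names copy_rescan and
copy_hoisted are hypothetical, and the destination is passed in as a
parameter (not in the original foo) so that the result can be inspected.
The two functions agree whenever the stores through dst cannot alter the
string at src, which is exactly the non-overlap assumption under discussion.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: strlen() is re-evaluated on every iteration,
   as in the loop Ralph posted.  dst must have room for strlen(src)
   bytes. */
size_t copy_rescan(char *dst, const char *src)
{
    size_t i;

    for (i = 0; i < strlen(src); i++)
        dst[i] = src[i + 1];    /* last read is the terminating '\0' */
    return i;
}

/* Hypothetical helper: the same loop with strlen() called once.
   This is the form a compiler could produce only if it may assume
   that writing through dst never changes the string at src. */
size_t copy_hoisted(char *dst, const char *src)
{
    size_t n = strlen(src);     /* loop-invariant under that assumption */
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = src[i + 1];
    return i;
}
```

Called with a 16-byte destination and the string "hello", both versions
store "ello" and return 5.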


IMHO, a reasonable criterion for optimizations is that they should
not change the behavior of any program that compiles without error,
but I believe all the standard requires is that it not change any
behavior that is defined by the language. C is so very weak in its
static rules and so full of opportunities to write code that is
undefined but undetected by the language implementation, that it's
hard to imagine there being a lot of optimizations that would
satisfy my stricter criterion.


> and since t is initially assigned to tmp, and it isn't legal for t
> to proceed past tmp + 10, assigning to *t can't be changing s.


There is no way the rules of the language could prevent t from
proceeding past tmp + 10, because the type system has already
forgotten, by the time t = tmp; has been executed, that t points
into array tmp. It's now just a pointer to a character, located
anywhere within an array that is infinite in both directions.
Just call the original foo with a string longer than 10. Something
will get stepped on, and you won't be able to explain it in
C terms. Only a machine-level (and machine-dependent) model
will explain it. (This is also true of my program above.)
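
Since a char * carries no bound, the only portable way to keep such a
loop inside tmp is to pass the capacity along by hand. What follows is a
sketch under that assumption, not anything the language does for you; the
name copy_bounded and its cap parameter are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded variant of the quoted loop.  The pointer dst
   carries no size, so the caller supplies the capacity in cap.
   Like the quoted loop, it writes nothing when src is empty. */
size_t copy_bounded(char *dst, size_t cap, const char *src)
{
    size_t n = strlen(src);
    size_t i;

    for (i = 0; i < n && i < cap; i++)
        dst[i] = src[i + 1];
    if (cap > 0 && i == cap)
        dst[cap - 1] = '\0';    /* output filled up: force termination */
    return i;                   /* number of bytes stored by the loop */
}
```

Inside foo, sizeof tmp is 10 while sizeof t is only the size of a pointer,
so a call such as copy_bounded(tmp, sizeof tmp, s) has to be made while
the array type is still in view.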
--
Rodney M. Bates
[If you're not allowed to change the behavior of any program that compiles
without error, you can't do any optimization at all. You can't even do
safe stuff like dead code elimination, since you never know when a wacky
pointer computed at runtime might call into the dead code. One of the
reasons there are standards for languages like C is to draw a line around
the semantics that compilers provide. -John]


