Re: Undefined Behavior Optimizations in C

anton@mips.complang.tuwien.ac.at (Anton Ertl)
Sun, 22 Jan 2023 09:56:22 GMT

          From comp.compilers

From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Date: Sun, 22 Jan 2023 09:56:22 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
References: 23-01-027 <sympa.1673343321.1624.383@lists.iecc.com> 23-01-031 23-01-041 23-01-062 23-01-065 23-01-067 23-01-069
Keywords: C, optimize
Posted-Date: 22 Jan 2023 12:42:35 EST

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>AMD64 specifies zero-extension for both signed
>and unsigned ints (and has instructions that generate zero-extended
>results).


Looking at <https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf>, I
find no such specification. However, compilers certainly behave in
that way. E.g., for


int add (int a, int b)
{
        return a+b;
}


gcc generates:


      0: 8d 04 37       lea    (%rdi,%rsi,1),%eax
      3: c3             retq


which zero-extends the result. This certainly rules out an ABI that
requires sign-extension for signed integers.
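
The zero-extension here follows from an AMD64 rule: an instruction
that writes a 32-bit register clears the upper 32 bits of the
corresponding 64-bit register. A minimal C sketch of what the LEA
leaves in %rax, assuming registers are modelled as uint64_t
(model_lea32 is a name made up for illustration, not anything gcc
emits):

```c
#include <stdint.h>

/* Hypothetical model of: lea (%rdi,%rsi,1),%eax
   Returns the full 64-bit contents of %rax afterwards. */
uint64_t model_lea32(uint64_t rdi, uint64_t rsi)
{
    /* Only the low 32 bits of the sum are kept; whatever was in the
       upper halves of the inputs drops out. */
    uint32_t low = (uint32_t)(rdi + rsi);

    /* Writing a 32-bit register zero-extends into the 64-bit one. */
    return (uint64_t)low;
}
```

With a = -1 and b = 0 the model yields 0x00000000ffffffff, not
0xffffffffffffffff, i.e., the int result is not sign-extended.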


One interesting case is:


long add (unsigned a, long b)
{
        return a+b;
}


which gcc compiles into


      0: 89 ff          mov    %edi,%edi
      2: 48 8d 04 37    lea    (%rdi,%rsi,1),%rax
      6: c3             retq


What's the point of the MOV instruction here? It zero-extends the
low 32 bits of %rdi (i.e., %edi) to 64 bits. So gcc apparently
assumes that passed operands are garbage-extended on AMD64. Or maybe
gcc is just cautious here. Another test:


unsigned bar(int x);


unsigned long foo(long x)
{
    return bar(x);
}


gcc -O compiles this to:


      0: 48 83 ec 08    sub    $0x8,%rsp
      4: e8 00 00 00 00 callq  9 <foo+0x9>
      9: 89 c0          mov    %eax,%eax
      b: 48 83 c4 08    add    $0x8,%rsp
      f: c3             retq


There is no zero- or sign-extension on passing x to bar(), so the value
is passed garbage-extended. There is a zero-extension for converting
the return value to unsigned long, so gcc assumes that the return value
of bar is not necessarily zero-extended.


Conclusion: In the System V ABI for AMD64, values are passed around
garbage-extended (in the general case).
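
To make "garbage-extended" concrete, here is a rough C model (not
gcc's actual code): a 64-bit register is a uint64_t whose upper 32
bits are arbitrary while it carries a 32-bit value, and whoever needs
a 64-bit value does the extension itself. The names zext32, sext32
and model_add are hypothetical; model_add mirrors the code gcc
generated for long add(unsigned a, long b) above, and sext32 shows
what the signed counterpart would have to do.

```c
#include <stdint.h>

/* Like mov %edi,%edi: keep the low 32 bits, clear the upper 32. */
uint64_t zext32(uint64_t reg)
{
    return (uint64_t)(uint32_t)reg;
}

/* Sign-extension of the low 32 bits: replicate bit 31 upwards.
   Written with arithmetic so it does not depend on
   implementation-defined narrowing conversions. */
int64_t sext32(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;
    return (int64_t)(low ^ 0x80000000u) - (int64_t)0x80000000;
}

/* Model of: long add(unsigned a, long b) { return a+b; }
   rdi carries a (upper half is garbage), rsi carries b. */
uint64_t model_add(uint64_t rdi, uint64_t rsi)
{
    return zext32(rdi) + rsi;
}
```

model_add(0xdeadbeef00000005, 10) gives 15: the garbage in the upper
half of the first register does not reach the result, because the
callee extends the argument itself.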


- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/

