[PATCH] D16907: [X86] Don't zero/sign-extend i1 or i8 return values to 32 bits (PR22532)

Hans Wennborg via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 5 18:17:16 PST 2016

hans added a comment.

In http://reviews.llvm.org/D16907#345419, @jyknight wrote:

> Why do you say GCC extends 16-bit numbers?
> Given:
>   short global1, global2;
>   short bar() {
>     return global1 + global2;
>   }
> $ gcc -march=i386 -O2 -S -o - -m32
> Results in:
>   bar:
>         movw    global2, %ax
>         addw    global1, %ax
>         ret
> If you use -march=i686, then it results in:
>   bar:
>         movzwl  global2, %eax
>         addw    global1, %ax
>         ret

D'oh, I was holding it wrong.

I was using the same test case I used for bool and char, something like "return x == y", and GCC would use "sete" and then extend that to 32-bit, because that's easy and they have to extend the result anyway. Your test case is obviously better.
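
For reference, here is a rough sketch of the two kinds of test case being contrasted. The function names are made up for illustration and are not from the patch or its tests. The comparison-based functions typically get a setcc-style result that a compiler may widen as a side effect, while the addition exercises the genuine 16-bit return path. What a given compiler emits depends on its version and flags.

  ```c
  #include <stdbool.h>

  /* Hypothetical names, for illustration only. */

  /* Comparison result: compilers commonly materialise this with a setcc
     instruction and may widen it, which can hide the narrow-return behaviour. */
  bool eq_bool(int x, int y) { return x == y; }
  short eq_short(short x, short y) { return x == y; }

  /* Arithmetic result: a real 16-bit value, similar in spirit to the
     global1 + global2 case quoted above. */
  short add_short(short x, short y) { return (short)(x + y); }
  ```

Compiling both kinds with -S and comparing the return sequences is the quickest way to see whether the upper bits of eax are being written.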

David and I also observed MSVC not extending 16-bit return values for this code:

  unsigned short f(unsigned short x) {
    return x;
  }

where they generate:

  00000000: 66 8B 44 24 04     mov         ax,word ptr [esp+4]
  00000005: C3                 ret 

i.e. they're leaving the high 16 bits of eax undefined.
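
To make the consequence concrete, here is a small C model of that situation. It is only a sketch: the helper names are hypothetical and it simulates the register with a 32-bit integer rather than reflecting any compiler's actual code. The point is that a caller which needs a 32-bit value has to do the extension itself, since only the low 16 bits are meaningful.

  ```c
  #include <stdint.h>

  /* Hypothetical model: the callee writes only AX, so the upper half of
     EAX keeps whatever stale bits were already there. */
  static uint32_t callee_writes_ax(uint32_t stale_eax, uint16_t value) {
      return (stale_eax & 0xFFFF0000u) | value;
  }

  /* The caller must not trust the upper 16 bits; it zero-extends the
     low half itself (the equivalent of a movzwl on the result). */
  static uint32_t caller_zero_extend(uint32_t eax) {
      return (uint32_t)(uint16_t)eax;
  }
  ```

For a signed short the caller would sign-extend instead, i.e. cast through int16_t rather than uint16_t.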

I'll update the patch to do shorts as well.

(Another interesting note is that MSVC's behaviour contradicts what they say in this document: https://msdn.microsoft.com/en-us/library/984x0h58.aspx "On x86 platforms, all arguments are widened to 32 bits when they are passed. Return values are also *widened to 32 bits and returned in the EAX register*.")

In http://reviews.llvm.org/D16907#345407, @spatel wrote:

> If I'm reading H.J.'s proposal correctly, we should treat shorts the same as char/bool. Should it all be fixed in one shot?

Yes, since James pointed out my doubts about i16 behaviour were unfounded, let's do them all in one shot.

Comment at: test/CodeGen/X86/tail-call-attrs.ll:13-15
@@ -12,5 +12,5 @@
 ; Here, there's more zero extension to be done between the call and the return,
 ; so a tail call is impossible (well, according to current Clang practice
 ; anyway. The AMD64 ABI isn't crystal clear on the matter).
 declare zeroext i32 @give_i32()
spatel wrote:
> This comment doesn't apply anymore?
Right. I'll add a FIXME here. The tail call lowering needs an update.

