[PATCH] D16907: [X86] Don't zero/sign-extend i1 or i8 return values to 32 bits (PR22532)

Zia Ansari via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 7 00:35:54 PST 2016


zansari added a comment.

In http://reviews.llvm.org/D16907#345277, @spatel wrote:

> cc'ing some Intel folks both for clarity on the ABI doc and for the potential perf impact.


Well, there's always the potential for the non-extending move into the partial return register to create a false dependency on any prior write to the whole register. You could argue, however, that this isn't the place to deal with that, since it's a more general issue ... maybe.

For example, take something like this:

  #include <stdio.h>
  
  short foo(void);
  int bar(short, short);
  short x, y = 123;
  int A;
  
  int main(void)
  {
    int i;
    for (i = 0; i <= 0xfffffff; i++) {
      A = bar(x, y);
      x = foo();
    }
    printf("\nA = %d, x = %x.\n", A, x);
    return 0;
  }
  -----------------------------
  extern short x;
  extern short y;
  
  int bar(short a, short b)
  {
    return a / b;
  }
  
  short foo(void)
  {
    return x + y;
  }

foo() will be slowed down a bit by the write to %al, whereas there is no such dependency with a movzbl into %eax. (I tried this quickly on an IVB, and it's ~50% faster with the movz, or with an xor of %eax before the movb.)
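To make the dependency concrete, here is a rough C model of the two register updates. It is only a sketch of the architectural semantics, not a benchmark and not what the patch emits; the function names (write_low8, write_zext8, model_self_check) are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of `movb src, %al`: only the low 8 bits change,
   so the new register value is a function of the OLD %eax as well.
   That input is what shows up as the false dependency. */
uint32_t write_low8(uint32_t old_eax, uint8_t src)
{
    return (old_eax & 0xFFFFFF00u) | src;
}

/* Hypothetical model of `movzbl src, %eax`: all 32 bits are
   overwritten, so the old register value is not an input at all. */
uint32_t write_zext8(uint8_t src)
{
    return (uint32_t)src;
}

/* Zeroing the register first (the xor case) makes the partial write
   produce the same value as the zero-extending move. */
void model_self_check(void)
{
    assert(write_low8(0xDEADBEEFu, 0x12) == 0xDEADBE12u);
    assert(write_low8(0u, 0x12) == write_zext8(0x12));
}
```

The point is only that write_low8 takes the previous register contents as an argument while write_zext8 does not; how much that costs in practice depends on the microarchitecture.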

We could play it safe and only do this when optimizing for size (opt/min-size), or go ahead with it if the impact is low and deal with any potential performance issue in a more general way. Has anyone run perf tests on this to see whether there's any impact?


http://reviews.llvm.org/D16907




