[PATCH] D16907: [X86] Don't zero/sign-extend i1 or i8 return values to 32 bits (PR22532)
Zia Ansari via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 7 00:35:54 PST 2016
zansari added a comment.
In http://reviews.llvm.org/D16907#345277, @spatel wrote:
> cc'ing some Intel folks both for clarity on the ABI doc and for the potential perf impact.
Well, there's always the potential for the non-extending move into the partial return register to create a false dependency on any prior write to the whole register. You could argue, though, that this isn't the place to deal with that, since it's a more general issue ... maybe.
For example, take something like this:
#include <stdio.h>

short foo(void);
int bar(short, short);

short x, y = 123;
int A;

int main(void)
{
    int i;

    for (i = 0; i <= 0xfffffff; i++) {
        A = bar(x, y);
        x = foo();
    }
    printf("\nA = %d, x = %x.\n", A, x);
    return 0;
}
-----------------------------
/* second file */
extern short x;
extern short y;

int bar(short a, short b)
{
    return a / b;
}

short foo(void)
{
    return x + y;
}
foo will be slowed down a bit by the write to %al, whereas there is no such dependency with a movzbl into %eax. (I tried this real quick on an IVB, and it's ~50% faster with the movz, or with an xor of eax before the movb.)
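To spell out where the dependency comes from: a byte write to %al leaves bits 8..31 of %eax alone, so the new register value is a function of the old one, while movzbl overwrites all 32 bits. Below is a minimal sketch that models just those two results in portable C. The helper names write_low8 and write_movzx are made up for illustration, and this only models the architectural result, not how any particular core renames or merges partial registers.

```c
#include <stdint.h>

/* Hypothetical model of "movb src, %al": the upper 24 bits of the
 * old %eax value survive, so the result needs old_eax as an input. */
static uint32_t write_low8(uint32_t old_eax, uint8_t src)
{
    return (old_eax & 0xFFFFFF00u) | src;
}

/* Hypothetical model of "movzbl src, %eax": all 32 bits are replaced,
 * so the result does not depend on the previous register contents. */
static uint32_t write_movzx(uint8_t src)
{
    return src;
}
```

Since write_low8 takes old_eax as an input, whatever last produced %eax has to finish before the merged value is available, even if the caller only ever looks at the low byte. That's the "false" part of the dependency.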
We could play it safe and only do this for opt/min-size, or go ahead with this if the impact is low and deal with any potential performance issue in a more general way. Have we done any perf tests on this to see if there's any impact?
http://reviews.llvm.org/D16907