[PATCH] D34769: [X86] X86::CMOV to Branch heuristic based optimization
Amjad Aboud via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 11 10:19:55 PDT 2017
aaboud added a comment.
In https://reviews.llvm.org/D34769#797714, @spatel wrote:
> I haven't looked at the patch yet, but for reference I filed:
> https://bugs.llvm.org//show_bug.cgi?id=33013 (although the comments veered off to a different topic)
> ...and mentioned:
> https://reviews.llvm.org/rL292154
>
> If the example(s) in the bug report are already here, then great. If not, you might want to consider those cases.
The optimization in this patch is triggered on inner loops only, on the assumption that hotspots are found in these inner loops.
Moreover, if I modify the example so that the function is called from a loop, like this:
  static int foo(float x) {
    if (x < 42.0f)
      return x;
    return 12;
  }

  int bar(float *a, float *b, int n) {
    int sum = 0;
  #pragma clang loop vectorize(disable)
    for (int i = 0; i < n; ++i) {
      float c = a[i] + b[i];
      sum += foo(c);
    }
    return sum;
  }
Then the patch will indeed convert the CMOV into a branch:
  LBB0_4:                                 # %for.body
                                          # =>This Inner Loop Header: Depth=1
          movss   (%esi), %xmm1           # xmm1 = mem[0],zero,zero,zero
          addss   (%edx), %xmm1
          ucomiss %xmm1, %xmm0
          ja      LBB0_5
  # BB#6:                                 # %for.body
                                          # in Loop: Header=BB0_4 Depth=1
          movl    $12, %eax
          jmp     LBB0_7
          .p2align 4, 0x90
  LBB0_5:                                 # in Loop: Header=BB0_4 Depth=1
          cvttss2si %xmm1, %eax
  LBB0_7:                                 # %for.body
                                          # in Loop: Header=BB0_4 Depth=1
          addl    %edi, %eax
          addl    $4, %esi
          addl    $4, %edx
          decl    %ecx
          movl    %eax, %edi
          jne     LBB0_4
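
For readers less used to x86 assembly, the control flow of the loop above corresponds roughly to the C below. This is only an illustrative sketch, not compiler output: the name bar_branchy is hypothetical, and it assumes that %xmm0 holds the constant 42.0f, which would be loaded before the part of the listing shown here.

```c
/* Hypothetical hand-written equivalent of the branchy loop body.
   Assumption: %xmm0 holds 42.0f (loaded outside the listing). */
int bar_branchy(const float *a, const float *b, int n) {
    int sum = 0;
    for (int i = 0; i < n; ++i) {
        float c = a[i] + b[i];   /* movss + addss */
        int v;
        if (42.0f > c) {         /* ucomiss %xmm1, %xmm0 ; ja LBB0_5 */
            v = (int)c;          /* cvttss2si: only executed on this path */
        } else {
            v = 12;              /* movl $12, %eax (also taken for NaN) */
        }
        sum += v;                /* addl %edi, %eax ; movl %eax, %edi */
    }
    return sum;
}
```

With the branch, the float-to-int conversion runs only when its result is actually used, whereas the CMOV form would compute it on every iteration before selecting.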
https://reviews.llvm.org/D34769