[PATCH] D23446: [X86] Enable setcc to srl(ctlz) transformation on btver2 architectures.
pierre gousseau via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 16 07:01:57 PDT 2016
pgousseau added a comment.
In https://reviews.llvm.org/D23446#542525, @RKSimon wrote:
> Any update on the performance investigations?
Hi Simon/Sanjay,
Sorry for the delayed follow-up!
I have run more tests, and the performance regressions I was observing with SPEC's h264 are now within the noise, so I can't tell whether this patch improves or degrades performance in SPEC's h264 benchmark.
I am more confident that the OR case improves performance, because we will be replacing
  48 85 FF       test  rdi,rdi
  0F 94 C0       sete  al
  48 85 F6       test  rsi,rsi
  0F 94 C1       sete  cl
  08 C1          or    cl,al
  0F B6 C1       movzx eax,cl
  C3             ret
by this:
  F3 0F BD CE    lzcnt ecx,esi
  F3 0F BD C7    lzcnt eax,edi
  09 C8          or    eax,ecx
  C1 E8 05       shr   eax,5
  C3             ret
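For reference, here is a minimal C++ sketch of why the two sequences compute the same value, assuming 32-bit operands as in the lzcnt listing. lzcnt returns 32 only for a zero input and 0..31 otherwise, so bit 5 of the OR'ed counts is set exactly when at least one operand is zero. The names lzcnt32, anyZeroSetcc, anyZeroLzcnt and formsAgreeOnSamples are hypothetical helpers for illustration only; they are not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for the LZCNT instruction on a 32-bit operand
// (hypothetical helper). Unlike __builtin_clz, it is defined for zero
// and returns 32 in that case, which is what the transformation relies on.
uint32_t lzcnt32(uint32_t x) {
  uint32_t n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  assert(n <= 32);
  return n;
}

// Shape of the original sequence: two compares against zero, OR'ed together.
uint32_t anyZeroSetcc(uint32_t a, uint32_t b) {
  return static_cast<uint32_t>((a == 0) | (b == 0));
}

// Shape of the replacement: the count is 32 (bit 5 set) only for zero and
// 0..31 (bit 5 clear) otherwise, so bit 5 of the OR is the answer.
uint32_t anyZeroLzcnt(uint32_t a, uint32_t b) {
  return (lzcnt32(a) | lzcnt32(b)) >> 5;
}

// Returns true if both forms agree on a handful of edge-case inputs.
bool formsAgreeOnSamples() {
  const uint32_t samples[] = {0u,  1u,          2u,          31u,
                              32u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t a : samples)
    for (uint32_t b : samples)
      if (anyZeroSetcc(a, b) != anyZeroLzcnt(a, b))
        return false;
  return true;
}
```

This only illustrates the arithmetic identity; it says nothing about latency or throughput on btver2.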
My plan now is to make the patch handle the OR case only. What do you guys think?
Would X86ISelLowering still be the best place if only supporting the OR case?
Thanks!
================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3588
@@ -3586,1 +3587,3 @@
+ else
+ return DAG.getNode(ISD::ZERO_EXTEND, dl, ExtTy, Scc);
}
----------------
RKSimon wrote:
> These can be replaced with DAG.getZExtOrTrunc(Scc, dl, ExtTy);
Will do, thanks!
https://reviews.llvm.org/D23446