[PATCH] D23446: [X86] Enable setcc to srl(ctlz) transformation on btver2 architectures.

pierre gousseau via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 16 07:01:57 PDT 2016


pgousseau added a comment.

In https://reviews.llvm.org/D23446#542525, @RKSimon wrote:

> Any update on the performance investigations?


Hi Simon/Sanjay,

Sorry for the delayed follow-up!
I have run more tests, and the performance regressions I was observing with SPEC's h264 are now within the noise, so I can't tell whether this patch improves or degrades performance on SPEC's h264 benchmark.
I am more confident that the OR case performs better, because we will be replacing

  48 85 FF                test     rdi,rdi
  0F 94 C0                sete     al
  48 85 F6                test     rsi,rsi
  0F 94 C1                sete     cl
  08 C1                   or       cl,al
  0F B6 C1                movzx    eax,cl
  C3                      ret

with this:

  F3 0F BD CE             lzcnt    ecx,esi
  F3 0F BD C7             lzcnt    eax,edi
  09 C8                   or       eax,ecx
  C1 E8 05                shr      eax,5
  C3                      ret
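
As a sanity check on the OR case, here is a minimal C++ sketch of why the two sequences should agree for 32-bit inputs. It is not code from the patch: the helper names (lzcnt32, eitherZeroSetcc, eitherZeroLzcnt, formsAgreeOnSamples) are made up for illustration, and lzcnt32 is a portable loop standing in for the LZCNT instruction, which returns 32 for a zero input.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portable stand-in for 32-bit LZCNT: counts leading zero bits
// and returns 32 when x == 0 (matching the instruction's behaviour).
inline uint32_t lzcnt32(uint32_t x) {
  uint32_t n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
}

// Baseline form: two compares against zero combined with OR
// (the test/sete/or/movzx sequence).
inline uint32_t eitherZeroSetcc(uint32_t a, uint32_t b) {
  return (a == 0) | (b == 0);
}

// lzcnt form: each count is in [0, 32], so bit 5 of a count is set only when
// that input is zero. OR-ing the counts and shifting right by 5 extracts it.
inline uint32_t eitherZeroLzcnt(uint32_t a, uint32_t b) {
  return (lzcnt32(a) | lzcnt32(b)) >> 5;
}

// Compares both forms over a small set of boundary values.
inline bool formsAgreeOnSamples() {
  const uint32_t samples[] = {0u, 1u, 2u, 31u, 32u,
                              0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t a : samples)
    for (uint32_t b : samples)
      if (eitherZeroSetcc(a, b) != eitherZeroLzcnt(a, b))
        return false;
  return true;
}
```

The shift by 5 only works because the operands are 32 bits wide; a 64-bit version would need a shift by 6, since lzcnt of a 64-bit zero is 64.
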

My plan now is to make the patch handle the OR case only. What do you think?
Would X86ISelLowering still be the best place if we only support the OR case?

Thanks!


================
Comment at: lib/CodeGen/SelectionDAG/TargetLowering.cpp:3588
@@ -3586,1 +3587,3 @@
+      else
+        return DAG.getNode(ISD::ZERO_EXTEND, dl, ExtTy, Scc);
     }
----------------
RKSimon wrote:
> These can be replaced with DAG.getZExtOrTrunc(Scc, dl, ExtTy);
Will do, thanks!


https://reviews.llvm.org/D23446




