[PATCH] D22975: Compute the Newton series natively

Sanjay Patel via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 4 12:59:02 PDT 2016


spatel added inline comments.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14669
@@ +14668,3 @@
+      Est = DAG.getNode(VT.isVector() ? ISD::VSELECT : ISD::SELECT, DL, VT,
+                        ZeroCmp, Op, Est);
+      AddToWorklist(Est.getNode());
----------------
Ah, I see the diff now. But this is a target-independent transform, so isn't using 'Zero' in the select the more specific, and therefore better, construct? This suggests that AArch64 is missing a fold that checks whether an operand of a select is zero; x86 must have this fold somewhere, since it is what allows the transform from blendv to andn.
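
To illustrate the fold in question, here is a minimal scalar sketch in C++. It is not LLVM code: every helper name (toBits, fromBits, zeroCmpMask, blend, selectZero, selectOp, andnZero, checkFold) is hypothetical, and the compare result is assumed to be an all-ones or all-zeros mask, as with an x86-style compare. Under that assumption, a select whose true arm is the constant zero reduces to a single and-not of the mask with the estimate.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helpers: move a float's bit pattern to/from an integer.
static uint32_t toBits(float F) {
  uint32_t B;
  std::memcpy(&B, &F, sizeof(B));
  return B;
}
static float fromBits(uint32_t B) {
  float F;
  std::memcpy(&F, &B, sizeof(F));
  return F;
}

// Assumed mask model: all-ones if Op == 0.0, else all-zeros.
static uint32_t zeroCmpMask(float Op) { return Op == 0.0f ? ~0u : 0u; }

// Bitwise blend: take TrueVal where the mask is set, else FalseVal.
static float blend(uint32_t Mask, float TrueVal, float FalseVal) {
  return fromBits((Mask & toBits(TrueVal)) | (~Mask & toBits(FalseVal)));
}

// select(ZeroCmp, Zero, Est): the true arm is the constant +0.0.
static float selectZero(float Op, float Est) {
  return blend(zeroCmpMask(Op), 0.0f, Est);
}

// select(ZeroCmp, Op, Est): the true arm is the operand itself.
static float selectOp(float Op, float Est) {
  return blend(zeroCmpMask(Op), Op, Est);
}

// andn(ZeroCmp, Est): what the select-of-zero form can fold to.
static float andnZero(float Op, float Est) {
  return fromBits(~zeroCmpMask(Op) & toBits(Est));
}

// Checks that the select-of-zero form and the andn form agree bit for bit.
static void checkFold() {
  const float NaN = std::numeric_limits<float>::quiet_NaN();
  const float Ops[] = {0.0f, -0.0f, 1.0f, 4.0f};
  const float Ests[] = {0.0f, 2.0f, NaN};
  for (float Op : Ops)
    for (float Est : Ests)
      assert(toBits(selectZero(Op, Est)) == toBits(andnZero(Op, Est)));
}
```

In this scalar model, selectZero and selectOp differ only for Op = -0.0f, where selectOp keeps the sign bit and selectZero returns +0.0. The andn form is equivalent only to the select with the constant zero arm.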

================
Comment at: llvm/test/CodeGen/X86/sqrt-fastmath.ll:42-45
@@ -41,6 +41,6 @@
 ; ESTIMATE-NEXT:    vmulss %xmm1, %xmm2, %xmm1
 ; ESTIMATE-NEXT:    vxorps %xmm2, %xmm2, %xmm2
-; ESTIMATE-NEXT:    vcmpeqss %xmm2, %xmm0, %xmm0
-; ESTIMATE-NEXT:    vandnps %xmm1, %xmm0, %xmm0
+; ESTIMATE-NEXT:    vcmpeqss %xmm2, %xmm0, %xmm2
+; ESTIMATE-NEXT:    vblendvps %xmm2, %xmm0, %xmm1, %xmm0
 ; ESTIMATE-NEXT:    retq
   %call = tail call float @__sqrtf_finite(float %f) #1
----------------
No worries. Note that I've used a modified version of that script to generate checks for targets besides x86, in case anyone would like to enhance the script and make test generation easier for AArch64. :)


Repository:
  rL LLVM

https://reviews.llvm.org/D22975




