[PATCH] D22975: Compute the Newton series natively
Sanjay Patel via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 4 12:59:02 PDT 2016
spatel added inline comments.
================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14669
@@ +14668,3 @@
+ Est = DAG.getNode(VT.isVector() ? ISD::VSELECT : ISD::SELECT, DL, VT,
+ ZeroCmp, Op, Est);
+ AddToWorklist(Est.getNode());
----------------
Ah, I see the diff now. But this is a target-independent transform, so isn't using 'Zero' in the select the more specific, and therefore better, construct? This suggests that AArch64 is missing a fold that checks whether an operand of a select is zero; x86 presumably has such a fold somewhere, since that is what would allow the transform from blendv to andn.
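For readers following along, here is a minimal scalar sketch of why a zero operand in the select permits the andn form. This is not DAGCombiner or backend code: the helper names (`zeroCmpMask`, `selectZero`, `andnForm`, `selectOp`) are hypothetical, and the all-ones/all-zeros mask is an assumption modeled on what a compare like vcmpeqss produces.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers: reinterpret a float as its 32-bit pattern and back.
static uint32_t bitsOf(float F) {
  uint32_t U;
  std::memcpy(&U, &F, sizeof U);
  return U;
}
static float floatOf(uint32_t U) {
  float F;
  std::memcpy(&F, &U, sizeof F);
  return F;
}

// Assumed compare semantics: all-ones when X == 0.0 (true for -0.0 too),
// all-zeros otherwise.
static uint32_t zeroCmpMask(float X) { return X == 0.0f ? 0xFFFFFFFFu : 0u; }

// select(ZeroCmp, Zero, Est), written as a bitwise blend.
static float selectZero(float X, float Est) {
  uint32_t M = zeroCmpMask(X);
  return floatOf((M & bitsOf(0.0f)) | (~M & bitsOf(Est)));
}

// Because +0.0 is the all-zero bit pattern, the first blend term vanishes
// and only the and-not remains.
static float andnForm(float X, float Est) {
  return floatOf(~zeroCmpMask(X) & bitsOf(Est));
}

// select(ZeroCmp, Op, Est): the operand is not a known constant zero,
// so the full blend is needed.
static float selectOp(float X, float Est) {
  uint32_t M = zeroCmpMask(X);
  return floatOf((M & bitsOf(X)) | (~M & bitsOf(Est)));
}

// Small self-check that the two zero-operand forms agree bit for bit.
static void checkZeroFold(float X, float Est) {
  assert(bitsOf(selectZero(X, Est)) == bitsOf(andnForm(X, Est)));
}
```

Under these assumptions the two zero-operand forms agree for every input, while the Op form differs only for a -0.0 input, where it returns the operand (sign bit included) instead of +0.0.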
================
Comment at: llvm/test/CodeGen/X86/sqrt-fastmath.ll:42-45
@@ -41,6 +41,6 @@
; ESTIMATE-NEXT: vmulss %xmm1, %xmm2, %xmm1
; ESTIMATE-NEXT: vxorps %xmm2, %xmm2, %xmm2
-; ESTIMATE-NEXT: vcmpeqss %xmm2, %xmm0, %xmm0
-; ESTIMATE-NEXT: vandnps %xmm1, %xmm0, %xmm0
+; ESTIMATE-NEXT: vcmpeqss %xmm2, %xmm0, %xmm2
+; ESTIMATE-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
; ESTIMATE-NEXT: retq
%call = tail call float @__sqrtf_finite(float %f) #1
----------------
No worries. Note that I've used a modified version of that script to generate checks for targets besides x86 - in case anyone would like to enhance the script and make test generation easier for AArch64. :)
Repository:
rL LLVM
https://reviews.llvm.org/D22975