[PATCH] D22975: Compute the Newton series natively
Evandro Menezes via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 4 12:27:19 PDT 2016
evandro marked an inline comment as done.
================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14500
@@ -14499,2 +14499,3 @@
-SDValue DAGCombiner::BuildReciprocalEstimate(SDValue Op, SDNodeFlags *Flags) {
+/// Newton iteration for a function: F(X) is X_{i+1} = X_i - F(X_i)/F'(X_i)
+/// For the reciprocal, we need to find the zero of the function:
----------------
t.p.northover wrote:
> OK, I see that change now. But what about the block of 0-handling code moved from buildSqrtEstimate to buildSqrtEstimateImpl? I still don't see how that's an improvement.
For now it does improve grouping, IMO. But I'm mulling whether the handling code could be profitably moved inside `getSqrtEstimate`.
================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14669
@@ +14668,3 @@
+ Est = DAG.getNode(VT.isVector() ? ISD::VSELECT : ISD::SELECT, DL, VT,
+ ZeroCmp, Op, Est);
+ AddToWorklist(Est.getNode());
----------------
@spatel I changed Zero for Op here, since Op is then known to be 0. The motivation was that since AArch64 has an immediate version that implements SETEQ and materializing a 0 can be expensive. This probably caused the side effect of changing `andnps` for `vblendvps` in X86.
================
Comment at: llvm/test/CodeGen/X86/sqrt-fastmath.ll:42-45
@@ -41,6 +41,6 @@
; ESTIMATE-NEXT: vmulss %xmm1, %xmm2, %xmm1
-; ESTIMATE-NEXT: vxorps %xmm2, %xmm2, %xmm2
-; ESTIMATE-NEXT: vcmpeqss %xmm2, %xmm0, %xmm0
-; ESTIMATE-NEXT: vandnps %xmm1, %xmm0, %xmm0
-; ESTIMATE-NEXT: retq
+; ESTIMATE: vxorps %xmm2, %xmm2, %xmm2
+; ESTIMATE: vcmpeqss %xmm2, %xmm0, %xmm2
+; ESTIMATE: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; ESTIMATE: retq
%call = tail call float @__sqrtf_finite(float %f) #1
----------------
spatel wrote:
> vblendvps is a regression vs. the vandnps that we had. Do you know what caused that?
>
> Also, why did the "-NEXT" part of the assertions disappear? For any x86 codegen test changes, please use the script that is noted on the first line of the test file to avoid that problem.
Sorry about that.
Repository:
rL LLVM
https://reviews.llvm.org/D22975
More information about the llvm-commits
mailing list