[PATCH] D22975: Compute the Newton series natively

Thu Aug 4 12:27:19 PDT 2016

evandro marked an inline comment as done.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14500
@@ -14499,2 +14499,3 @@
 
-SDValue DAGCombiner::BuildReciprocalEstimate(SDValue Op, SDNodeFlags *Flags) {
+/// Newton iteration for a function: F(X) is X_{i+1} = X_i - F(X_i)/F'(X_i)
+/// For the reciprocal, we need to find the zero of the function:
----------------
t.p.northover wrote:
> OK, I see that change now. But what about the block of 0-handling code moved from buildSqrtEstimate to buildSqrtEstimateImpl? I still don't see how that's an improvement.
For now it does improve grouping, IMO.  But I'm mulling whether the handling code could be profitably moved inside `getSqrtEstimate`.

================
Comment at: llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:14669
@@ +14668,3 @@
+      Est = DAG.getNode(VT.isVector() ? ISD::VSELECT : ISD::SELECT, DL, VT,
+                        ZeroCmp, Op, Est);
+      AddToWorklist(Est.getNode());
----------------
@spatel I changed Zero for Op here, since Op is then known to be 0.  The motivation was that since AArch64 has an immediate version that implements SETEQ and materializing a 0 can be expensive.  This probably caused the side effect of changing `andnps` for `vblendvps` in X86.

================
Comment at: llvm/test/CodeGen/X86/sqrt-fastmath.ll:42-45
@@ -41,6 +41,6 @@
 ; ESTIMATE-NEXT:    vmulss %xmm1, %xmm2, %xmm1
-; ESTIMATE-NEXT:    vxorps %xmm2, %xmm2, %xmm2
-; ESTIMATE-NEXT:    vcmpeqss %xmm2, %xmm0, %xmm0
-; ESTIMATE-NEXT:    vandnps %xmm1, %xmm0, %xmm0
-; ESTIMATE-NEXT:    retq
+; ESTIMATE:         vxorps %xmm2, %xmm2, %xmm2
+; ESTIMATE:         vcmpeqss %xmm2, %xmm0, %xmm2
+; ESTIMATE:         vblendvps %xmm2, %xmm0, %xmm1, %xmm0
+; ESTIMATE:         retq
   %call = tail call float @__sqrtf_finite(float %f) #1
----------------
spatel wrote:
> vblendvps is a regression vs. the vandnps that we had. Do you know what caused that?
> 
> Also, why did the "-NEXT" part of the assertions disappear? For any x86 codegen test changes, please use the script that is noted on the first line of the test file to avoid that problem.
Sorry about that.


Repository:
  rL LLVM

https://reviews.llvm.org/D22975