[PATCH] D22975: Compute the Newton series natively
Evandro Menezes via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 9 15:30:23 PDT 2016
evandro added inline comments.
================
Comment at: llvm/test/CodeGen/X86/sqrt-fastmath.ll:42-45
@@ -41,6 +41,6 @@
; ESTIMATE-NEXT: vmulss %xmm1, %xmm2, %xmm1
; ESTIMATE-NEXT: vxorps %xmm2, %xmm2, %xmm2
-; ESTIMATE-NEXT: vcmpeqss %xmm2, %xmm0, %xmm0
-; ESTIMATE-NEXT: vandnps %xmm1, %xmm0, %xmm0
+; ESTIMATE-NEXT: vcmpeqss %xmm2, %xmm0, %xmm2
+; ESTIMATE-NEXT: vblendvps %xmm2, %xmm0, %xmm1, %xmm0
; ESTIMATE-NEXT: retq
%call = tail call float @__sqrtf_finite(float %f) #1
----------------
RKSimon wrote:
> spatel wrote:
> > No worries. Note that I've used a modified version of that script to generate checks for targets besides x86 - in case anyone would like to enhance the script and make test generation easier for AArch64. :)
> As Sanjay said, the use of vblendvps over vandnps is a regression that could affect throughput quite badly.
@t.p.northover, is Sanjay onto something here: could AArch64 use a fold instead? Otherwise, I could move the check for 0.0 inside `getSqrtEstimate()`.
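
For context, here is a minimal sketch of why the check for 0.0 matters when sqrt(x) is formed as x * rsqrt(x). It is an illustration under stated assumptions, not the code in this patch: `refineRsqrt` and `sqrtViaEstimate` are hypothetical names, and the hardware estimate is modeled with `1.0f / std::sqrt(x)`. The and-not masking mirrors what the `vcmpeqss` + `vandnps` pair in the old check lines computes.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: one Newton-Raphson step refining an estimate e of 1/sqrt(x).
static float refineRsqrt(float x, float e) {
  return e * (1.5f - 0.5f * x * e * e);
}

// Hypothetical sketch: sqrt(x) computed as x * rsqrt(x), with x == 0.0 masked out.
static float sqrtViaEstimate(float x) {
  // Stand-in for a hardware reciprocal square root estimate (assumption).
  // The estimate of 0.0 is modeled explicitly as +infinity.
  float e = (x == 0.0f) ? std::numeric_limits<float>::infinity()
                        : 1.0f / std::sqrt(x);
  e = refineRsqrt(x, e);  // becomes NaN when x == 0.0, since 0 * inf is NaN
  float r = x * e;        // still NaN when x == 0.0

  // And-not select: clear every bit of the result when x == 0.0,
  // i.e. r = ~mask & r, which yields +0.0 in that case.
  std::uint32_t mask = (x == 0.0f) ? 0xFFFFFFFFu : 0u;
  std::uint32_t bits;
  std::memcpy(&bits, &r, sizeof bits);
  bits &= ~mask;
  std::memcpy(&r, &bits, sizeof r);
  return r;
}
```

Without the mask, an input of 0.0 would produce NaN rather than 0.0. A blend picks between two full registers to reach the same result, which is what the new check lines do with `vblendvps`.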
Repository:
rL LLVM
https://reviews.llvm.org/D22975