[PATCH] D21379: [X86] Heuristic to selectively build Newton-Raphson SQRT estimation

Nikolai Bozhenov via llvm-commits llvm-commits at lists.llvm.org
Mon Jul 25 07:35:25 PDT 2016


n.bozhenov updated this revision to Diff 65344.
n.bozhenov added a comment.

I've uploaded an updated version of the patch. After more careful benchmarking
and analysis, I found a performance problem in a corner case where both SQRT(x)
and RSQRT(x) are required. In that case, the compiler may build a plain SQRTSS
instruction to calculate SQRT(x) and an RSQRTSS followed by refinement to
calculate RSQRT(x). So I've added a check to
`X86TargetLowering::isFsqrtCheap` to avoid building both SQRT and RSQRT
instructions for the same input value.
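
For readers unfamiliar with the refinement mentioned above, here is a minimal
scalar sketch of why a single estimate can serve both results. It is not code
from the patch: `roughRsqrt`, `refineRsqrt`, `sqrtAndRsqrt` and `SqrtPair` are
hypothetical names, and `roughRsqrt` only imitates a low-precision hardware
estimate such as RSQRTSS by perturbing the exact value.

```cpp
#include <cmath>

// Hypothetical stand-in for a low-precision hardware estimate (RSQRTSS-like):
// the exact reciprocal square root with a small relative error injected.
float roughRsqrt(float x) {
  return (1.0f / std::sqrt(x)) * 1.001f;
}

// One Newton-Raphson step for y ~ 1/sqrt(x):
//   y' = y * (1.5 - 0.5 * x * y * y)
float refineRsqrt(float x, float y) {
  return y * (1.5f - 0.5f * x * y * y);
}

// Illustrative result type holding both values.
struct SqrtPair {
  float root;      // approximation of sqrt(x)
  float recipRoot; // approximation of 1/sqrt(x)
};

// When both values are needed, sqrt(x) can be derived from the refined
// estimate as x * (1/sqrt(x)), so only one estimate sequence is required.
// Assumes x > 0; x == 0 and special values need separate handling.
SqrtPair sqrtAndRsqrt(float x) {
  float r = refineRsqrt(x, roughRsqrt(x));
  return {x * r, r};
}
```

Calling `sqrtAndRsqrt(4.0f)` should give values close to 2 and 0.5. The point
is only that the SQRT result comes almost for free once the refined RSQRT is
available, which is why emitting a separate SQRTSS for the same input is
wasteful.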


https://reviews.llvm.org/D21379

Files:
  include/llvm/Target/TargetLowering.h
  lib/CodeGen/SelectionDAG/DAGCombiner.cpp
  lib/CodeGen/TargetLoweringBase.cpp
  lib/Target/AMDGPU/AMDGPUISelLowering.cpp
  lib/Target/AMDGPU/AMDGPUISelLowering.h
  lib/Target/X86/X86.td
  lib/Target/X86/X86ISelLowering.cpp
  lib/Target/X86/X86ISelLowering.h
  lib/Target/X86/X86Subtarget.cpp
  lib/Target/X86/X86Subtarget.h
  test/CodeGen/X86/sqrt-fastmath-tune.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D21379.65344.patch
Type: text/x-patch
Size: 11126 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160725/f86593e5/attachment.bin>

