[PATCH] D153472: AMDGPU: Correctly expand f64 sqrt intrinsic

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 21 14:40:39 PDT 2023


arsenm created this revision.
arsenm added reviewers: AMDGPU, jhuber6, rampitec, Pierre-vh, foad.
Herald added subscribers: StephenFan, kerbowa, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl.
Herald added a project: All.
arsenm requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

rocm-device-libs and llpc both avoided the f64 sqrt
intrinsics in favor of their own expansions. Port the
expansion into the backend. Both of these users should be
updated to call the intrinsic instead.

The library and llpc expansions are slightly different.
llpc uses an ldexp to do the scale; the library uses a multiply.

Use ldexp to do the scale instead of the multiply.
I believe v_ldexp_f64 and v_mul_f64 always take the same number of
cycles, but it's cheaper to materialize the 32-bit integer constant
than the 64-bit double constant.
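
As a rough host-side illustration of the scaling structure described above
(not the code in this patch), the sketch below scales small inputs up with
ldexp, takes the square root, and scales the result back down by half the
exponent. The helper name scaled_sqrt, the 2^-767 threshold, and the
+256 / -128 exponents are assumptions chosen for illustration, and
std::sqrt stands in for the hardware approximation plus refinement steps.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: shows only the scale / unscale structure.
// Threshold and exponents below are illustrative assumptions.
double scaled_sqrt(double x) {
  const double kSmall = std::ldexp(1.0, -767);
  const bool needScale = x < kSmall;

  // Scale up by an even power of two, expressed as a small integer
  // exponent rather than a 64-bit floating-point multiplier.
  const double scaled = std::ldexp(x, needScale ? 256 : 0);

  // Stand-in for the approximate reciprocal square root and the
  // refinement iterations a real expansion would perform.
  double r = std::sqrt(scaled);

  // sqrt halves the exponent, so undo the scale with half of it.
  r = std::ldexp(r, needScale ? -128 : 0);

  // The 0-or-inf argument check: these inputs pass through unchanged.
  const bool zeroOrInf =
      x == 0.0 || x == std::numeric_limits<double>::infinity();
  return zeroOrInf ? x : r;
}
```

Because the scale factor is an even power of two, the unscale step is exact.
An equivalent multiply would need the double constant 2^256 instead of the
integer 256, which is the materialization cost mentioned above.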

The libraries have another fast version of sqrt which will
be handled separately.

I am tempted to do this in an IR expansion instead. In the IR
we could take advantage of computeKnownFPClass to avoid
the 0-or-inf argument check.


https://reviews.llvm.org/D153472

Files:
  llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
  llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
  llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.h
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.h
  llvm/lib/Target/AMDGPU/VOP1Instructions.td
  llvm/test/Analysis/CostModel/AMDGPU/arith-fp.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fsqrt.mir
  llvm/test/CodeGen/AMDGPU/fsqrt.f64.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rcp.ll
  llvm/test/CodeGen/AMDGPU/rsq.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D153472.533408.patch
Type: text/x-patch
Size: 135799 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230621/f1b852f6/attachment-0001.bin>

