[PATCH] D158099: AMDGPU: Fix more unsafe rsq formation

Matt Arsenault via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 16 10:18:13 PDT 2023


arsenm created this revision.
arsenm added reviewers: AMDGPU, foad, rampitec.
Herald added subscribers: StephenFan, kerbowa, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl.
Herald added a project: All.
arsenm requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

Forming rsq based only on contract flags is wrong; it also requires some
level of approximate functions. AMDGPUCodeGenPrepare should already handle
the f32 cases with the appropriate flags, and I don't see how new
cases would arise during legalization (other than those
involving the rcp intrinsic, which instcombine tries to
handle). AMDGPUCodeGenPrepare does still need to learn better handling of
rcp/rsq for f64, though, which we never bothered to handle well.
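
As a rough illustration of the flag requirement described above, here is a
minimal sketch. It is not code from this patch and does not use LLVM's API:
the FastMathFlagsSketch struct and the canFormRsq helper are hypothetical
names introduced only for this example. It assumes that "some level of
approximate functions" can be modeled as a single boolean next to contract.

```cpp
// Hypothetical stand-in for an instruction's fast-math flags.
// This is NOT LLVM's FastMathFlags class; it only models the two
// properties mentioned in the description above.
struct FastMathFlagsSketch {
  bool contract;    // contraction of operations is permitted
  bool approxFunc;  // approximate math functions are permitted (assumed model)
};

// Illustrative check: rewriting 1.0 / sqrt(x) as rsq(x) is treated as
// permitted only when both properties hold. contract alone is not enough.
constexpr bool canFormRsq(FastMathFlagsSketch flags) {
  return flags.contract && flags.approxFunc;
}
```

With this sketch, canFormRsq(FastMathFlagsSketch{true, false}) is false, which
corresponds to the contract-only case the description calls unsafe.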

      

Removes another obstacle to correctly lowering sqrt.


https://reviews.llvm.org/D158099

Files:
  llvm/lib/Target/AMDGPU/AMDGPUPostLegalizerCombiner.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
  llvm/lib/Target/AMDGPU/SIISelLowering.h
  llvm/test/CodeGen/AMDGPU/GlobalISel/combine-rsq.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/combine-rsq.mir
  llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f16.ll
  llvm/test/CodeGen/AMDGPU/fdiv.f16.ll
  llvm/test/CodeGen/AMDGPU/fdiv_flags.f32.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rcp.ll
  llvm/test/CodeGen/AMDGPU/rsq.f32.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D158099.550799.patch
Type: text/x-patch
Size: 38930 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230816/a9ef440e/attachment.bin>

