[PATCH] D158099: AMDGPU: Fix more unsafe rsq formation
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 16 10:18:13 PDT 2023
arsenm created this revision.
arsenm added reviewers: AMDGPU, foad, rampitec.
Herald added subscribers: StephenFan, kerbowa, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl.
Herald added a project: All.
arsenm requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.
Introducing rsq contract flags is wrong, and also requires some level
of approximate functions. AMDGPUCodeGenPrepare already should handle
the f32 cases with appropriate flags, and I don't see how new
situations to handle would arise during legalization (other than cases
involving the rcp intrinsic, which instcombine tries to
handle). AMDGPUCodeGenPrepare does need to learn better handling of
rcp/rsq for f64 though, which we never bothered to handle well.
Removes another obstacle to correctly lowering sqrt.
https://reviews.llvm.org/D158099
Files:
llvm/lib/Target/AMDGPU/AMDGPUPostLegalizerCombiner.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.h
llvm/test/CodeGen/AMDGPU/GlobalISel/combine-rsq.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/combine-rsq.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f16.ll
llvm/test/CodeGen/AMDGPU/fdiv.f16.ll
llvm/test/CodeGen/AMDGPU/fdiv_flags.f32.ll
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.rcp.ll
llvm/test/CodeGen/AMDGPU/rsq.f32.ll
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D158099.550799.patch
Type: text/x-patch
Size: 38930 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20230816/a9ef440e/attachment.bin>
More information about the llvm-commits
mailing list