[llvm] c8eeee2 - AMDGPU: Drop unsafe 1/sqrt -> rsq combine
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Aug 16 05:52:22 PDT 2023
Author: Matt Arsenault
Date: 2023-08-16T08:52:17-04:00
New Revision: c8eeee2be2db2bd7a469065878852a5833add9b9
URL: https://github.com/llvm/llvm-project/commit/c8eeee2be2db2bd7a469065878852a5833add9b9
DIFF: https://github.com/llvm/llvm-project/commit/c8eeee2be2db2bd7a469065878852a5833add9b9.diff
LOG: AMDGPU: Drop unsafe 1/sqrt -> rsq combine
AMDGPUCodeGenPrepare implements a safer version of this that handles
denormals correctly.
https://reviews.llvm.org/D158032
Added:
Modified:
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 08869abb0ea522..72ff2029b81112 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -4532,8 +4532,6 @@ bool AMDGPULegalizerInfo::legalizeFastUnsafeFDIV(MachineInstr &MI,
return true;
}
- // TODO: Match rsq
-
// -1 / x -> RCP( FNEG(x) )
if (CLHS->isExactlyValue(-1.0)) {
auto FNeg = B.buildFNeg(ResTy, RHS, Flags);
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 0008cedb832211..5a8d45269fc620 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -9299,11 +9299,6 @@ SDValue SITargetLowering::lowerFastUnsafeFDIV(SDValue Op,
// XXX - Is UnsafeFPMath sufficient to do this for f64? The maximum ULP
// error seems really high at 2^29 ULP.
-
- // XXX - do we need afn for this or is arcp sufficent?
- if (RHS.getOpcode() == ISD::FSQRT)
- return DAG.getNode(AMDGPUISD::RSQ, SL, VT, RHS.getOperand(0));
-
// 1.0 / x -> rcp(x)
return DAG.getNode(AMDGPUISD::RCP, SL, VT, RHS);
}
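As an illustration of the denormal concern named in the log message, here is a minimal C++ sketch. It is not the AMDGPUCodeGenPrepare code. It assumes a reciprocal-square-root primitive that flushes denormal inputs to zero, modeled by the hypothetical helper `rsq_flushing`. The scaled variant `rsq_scaled` and its constants (2^24 on the input, 2^12 on the output) are likewise illustrative assumptions; they rely on the identity rsq(x) = rsq(x * 2^24) * 2^12.

```cpp
#include <cmath>
#include <limits>

// Hypothetical model of an rsq primitive that treats denormal inputs as zero.
// This is an assumption made for illustration, not a description of any
// specific hardware instruction.
float rsq_flushing(float x) {
  if (std::fpclassify(x) == FP_SUBNORMAL)
    x = std::copysign(0.0f, x); // flush the denormal input to signed zero
  return 1.0f / std::sqrt(x);
}

// Illustrative denormal-safe wrapper: inputs below the smallest normal float
// are scaled up by 2^24 before the rsq, and the result is scaled by 2^12 to
// compensate, since rsq(x * 2^24) == rsq(x) * 2^-12.
float rsq_scaled(float x) {
  const bool need_scale = x < std::numeric_limits<float>::min(); // 2^-126
  const float input_scale = need_scale ? std::ldexp(1.0f, 24) : 1.0f;
  const float output_scale = need_scale ? std::ldexp(1.0f, 12) : 1.0f;
  return rsq_flushing(x * input_scale) * output_scale;
}
```

For a denormal input such as 2^-130, the naive form returns infinity under this model, while the scaled form returns the finite value 2^65. A direct 1/sqrt -> rsq rewrite of the kind removed above would behave like the naive form whenever the underlying instruction flushes denormals.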