[PATCH] D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set
Changpeng Fang via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 21 08:36:07 PST 2020
cfang marked 5 inline comments as done.
cfang added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:612-614
+ float ULP = FPOp->getFPAccuracy();
+ if (ULP < 2.5f)
+   DoFDivFast = false;
----------------
arsenm wrote:
> It would be clearer to do something like
> bool NeedHighAccuracy = !FPMath || FPMath->getFPAccuracy() < 2.5
Is a required accuracy below 2.5 ulp the limiting factor that prevents us from doing 1/x -> rcp(x)?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:619
+ bool NoRCPIntrinsic = !UnsafeDiv &&
+ (Ty->isFloatTy() && (HasFP32Denormals || !FPMath));
+
----------------
arsenm wrote:
> FPMath should be checked once, and in relation to its value only. Checking for the lack of metadata here is imprecise
Do you mean that we should check it here like this:
(Ty->isFloatTy() && (HasFP32Denormals || NeedHighAccuracy));
where NeedHighAccuracy is computed as suggested in your previous comment?
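
To make the proposed condition concrete, here is a minimal standalone sketch of how the two checks discussed above might combine. It does not use the LLVM API: FDivInfo, needHighAccuracy and noRCPIntrinsic are hypothetical names introduced only for illustration, and the optional FPAccuracy field is an assumed stand-in for "!fpmath metadata present, with this ulp value". The 2.5 ulp threshold and the boolean structure are taken from the snippets quoted in this review.

```cpp
#include <optional>

// Hypothetical stand-in for the state the review discusses (not LLVM API).
struct FDivInfo {
  bool IsFloatTy;                  // models Ty->isFloatTy()
  bool HasFP32Denormals;           // models the subtarget's FP32 denormal mode
  bool UnsafeDiv;                  // models the unsafe/fast-math division flag
  std::optional<float> FPAccuracy; // assumed: empty when there is no !fpmath metadata
};

// The accuracy requirement is evaluated once: high accuracy is needed when
// there is no metadata at all, or when the allowed error is below 2.5 ulp.
constexpr bool needHighAccuracy(const FDivInfo &I) {
  return !I.FPAccuracy.has_value() || *I.FPAccuracy < 2.5f;
}

// rcp must not be used for a safe f32 division when denormals are enabled
// or when high accuracy is required.
constexpr bool noRCPIntrinsic(const FDivInfo &I) {
  return !I.UnsafeDiv && I.IsFloatTy &&
         (I.HasFP32Denormals || needHighAccuracy(I));
}
```

Under these assumptions, an f32 division with no metadata and denormals disabled still avoids rcp, while one annotated with 2.5 ulp or more is allowed to use it.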
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D71293/new/
https://reviews.llvm.org/D71293