[PATCH] D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set

Tue Jan 21 08:36:07 PST 2020

cfang marked 5 inline comments as done.
cfang added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:612-614
+  float ULP = FPOp->getFPAccuracy();
+  if (ULP < 2.5f)
+    DoFDivFast = false;
----------------
arsenm wrote:
> It would be clearer to do something like 
> bool NeedHighAccuracy = !FPMath || FPMath->getFPAccuracy() < 2.5
Is a required accuracy below 2.5 ulp the limiting factor that prevents us from doing 1/x -> rcp(x)?
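
To make sure I read the suggestion correctly, here is a minimal standalone sketch of that check. It is not the LLVM API: a nullable pointer stands in for the optional !fpmath metadata, and the name needHighAccuracy and its parameter are hypothetical.

```cpp
// Hypothetical model of the suggested check. A null pointer means "no
// !fpmath metadata attached"; otherwise it points at the permitted error
// in ULPs. Plain C++ only, no LLVM types are used.
bool needHighAccuracy(const float *FPMathAccuracy) {
  // Without metadata no relaxed accuracy was granted, so stay conservative.
  // With metadata, anything tighter than 2.5 ulp still needs the accurate path.
  return !FPMathAccuracy || *FPMathAccuracy < 2.5f;
}
```

With this shape, exactly 2.5 ulp does not count as high accuracy, which matches the strict `<` in the patch.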

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:619
+  bool NoRCPIntrinsic = !UnsafeDiv &&
+                        (Ty->isFloatTy() && (HasFP32Denormals || !FPMath));
+
----------------
arsenm wrote:
> FPMath should be checked once, and in relation to its value only. Checking for the lack of metadata here is imprecise.
Do you mean the check here should look like this:
  (Ty->isFloatTy() && (HasFP32Denormals || NeedHighAccuracy));

where NeedHighAccuracy is computed as suggested in the previous comment?
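
Spelled out as a standalone sketch, the full condition would then be roughly the following. Plain bools stand in for the pass state (IsFloatTy for Ty->isFloatTy()), and the free function noRCPIntrinsic is hypothetical, used only to show the truth table.

```cpp
// Hypothetical free-function version of the condition under discussion.
// Every input is a plain bool, so this models only the boolean logic and
// says nothing about how the pass actually obtains these values.
bool noRCPIntrinsic(bool UnsafeDiv, bool IsFloatTy, bool HasFP32Denormals,
                    bool NeedHighAccuracy) {
  // The rcp intrinsic is ruled out only for f32 divides that are not marked
  // unsafe and that either must handle denormals or need high accuracy.
  return !UnsafeDiv && IsFloatTy && (HasFP32Denormals || NeedHighAccuracy);
}
```

For example, a float divide with denormals disabled and no accuracy requirement still gets the rcp intrinsic, while the same divide with NeedHighAccuracy set does not.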

https://reviews.llvm.org/D71293