[PATCH] D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set

Changpeng Fang via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 10 09:53:19 PST 2020


cfang marked 2 inline comments as done.
cfang added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:568-569
+    // is enabled.
+    if (!HasCorrectlyRoundedDivideSqrt)
+      return false;
+    SafeFast = false;
----------------
arsenm wrote:
> The attribute should not be considered at all. Only the fpmath metadata matters. If -cl-fp32-correctly-rounded-divide-sqrt is specified, the regular fdiv instruction should behave correctly.
Do you mean that in AMDGPUCodeGenPrepare, we should check the fpmath metadata to keep regular fdiv (instead of an intrinsic) when  -cl-fp32-correctly-rounded-divide-sqrt is specified?

The issue is, when -cl-fp32-correctly-rounded-divide-sqrt is specified, a simple v_rcp is generated for a fdiv. Apparently the codegen produces the wrong sequence of code for a "regular" fdiv. 



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71293/new/

https://reviews.llvm.org/D71293





More information about the llvm-commits mailing list