[PATCH] D71293: AMDGPU: Generate the correct sequence of code for FDIV32 when correctly-rounded-divide-sqrt is set
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 10 10:59:55 PST 2020
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp:568-569
+ // is enabled.
+ if (!HasCorrectlyRoundedDivideSqrt)
+ return false;
+ SafeFast = false;
----------------
cfang wrote:
> arsenm wrote:
> > The attribute should not be considered at all. Only the fpmath metadata matters. If -cl-fp32-correctly-rounded-divide-sqrt is specified, the regular fdiv instruction should behave correctly.
> Do you mean that in AMDGPUCodeGenPrepare, we should check the fpmath metadata to keep regular fdiv (instead of an intrinsic) when -cl-fp32-correctly-rounded-divide-sqrt is specified?
>
> The issue is, when -cl-fp32-correctly-rounded-divide-sqrt is specified, a simple v_rcp is generated for a fdiv. Apparently the codegen produces the wrong sequence of code for a "regular" fdiv.
>
Yes. We shouldn't even have an IR attribute for this flag. The interpretation of the flag is entirely represented in the use of the !fpmath metadata.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D71293/new/
https://reviews.llvm.org/D71293
More information about the llvm-commits
mailing list