[llvm] [AMDGPU] Use native sqrt when flushing denorm is allowed (PR #114173)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 30 07:02:36 PDT 2024
ruiling wrote:
> Pretty sure this is wrong and won't pass OpenCL conformance with FTZ on. The sqrt instruction is still 1ulp, not 0.5 ulp. We should already account for denormal mode and !fpmath metadata in AMDGPUCodeGenPrepare
Thanks for the feedback, @arsenm. I had not looked closely at the real issue before. I have now looked at the code in AMDGPUCodeGenPrepare, and we do not set the fast-math flag properly for this case. Still, I am surprised that OpenCL conformance would need a correctly rounded sqrt. I think it requires 3 ulp for sqrt?
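For anyone following the ulp numbers in this thread (0.5 ulp for a correctly rounded result, 1 ulp for the native instruction, 3 ulp as the bound I believe applies), here is a minimal sketch of how such an error can be measured for a single-precision result. It is only an illustration: the helper names `to_f32`, `ulp_f32`, `next_f32_up` and `ulp_error` are made up for this example, the double-precision `math.sqrt` is assumed to be a good enough reference, and subnormals are ignored.

```python
import math
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ulp_f32(x: float) -> float:
    """Size of one binary32 ulp at the magnitude of x (normal range only)."""
    _, e = math.frexp(abs(x))  # abs(x) = m * 2**e with 0.5 <= m < 1
    return math.ldexp(1.0, e - 24)  # 24 significand bits in binary32


def next_f32_up(x: float) -> float:
    """Next representable binary32 value above a positive finite x."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits + 1))[0]


def ulp_error(approx: float, exact: float) -> float:
    """Error of approx against the reference value, in binary32 ulps."""
    return abs(approx - exact) / ulp_f32(exact)


if __name__ == "__main__":
    exact = math.sqrt(2.0)            # binary64 reference value
    rounded = to_f32(exact)           # nearest binary32 value to the reference
    one_off = next_f32_up(rounded)    # one representable step away from it
    print("nearest binary32:", ulp_error(rounded, exact), "ulp")
    print("one step away:   ", ulp_error(one_off, exact), "ulp")
```

The nearest binary32 value is always within 0.5 ulp of the reference, while a neighbouring value is not, which is the gap between "correctly rounded" and the looser bounds being discussed.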
https://github.com/llvm/llvm-project/pull/114173