[llvm] AMDGPU: Replace sqrt OpenCL libcalls with llvm.sqrt (PR #74197)
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Tue Dec 12 14:17:01 PST 2023
rampitec wrote:
I have checked our current llvm.sqrt.f32 lowering. To get the same result as native_sqrt would give, either the call needs to have the afn flag, or fpmath metadata requesting 2 ulp or lower accuracy has to be attached to the call.
On the other hand, the current folding is done if either the 'fast' flag is set on the call or the "unsafe-fp-math" attribute is set on the caller function. So the question is: will one of the conditions from the first list be satisfied whenever any one of the conditions from the second list is met? I.e., does it have a potential for regression?
For instance, I do not see checks for 'call fast float @llvm.sqrt.f32' in fsqrt.f32.ll.
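
To make the question concrete, here is a small Python sketch that enumerates both lists of conditions exactly as stated above and reports the combinations where the fold fires but the lowering conditions are not met. The function names are hypothetical, and the model makes two assumptions: that 'fast' implies afn (it is the union of all fast-math flags), and that the lowering looks only at the call itself. If the backend also consults "unsafe-fp-math" during lowering, the gap reported here would not exist.

```python
from itertools import product

def fold_fires(fast, unsafe_fp_math):
    # Second list: the libcall fold is done if 'fast' is on the call
    # or "unsafe-fp-math" is set on the caller function.
    return fast or unsafe_fp_math

def lowering_matches_native(fast, afn, fpmath_ulp):
    # First list: afn on the call, or !fpmath requesting 2 ulp or
    # lower accuracy (i.e. a larger allowed error).
    # Assumption: 'fast' implies afn.
    has_afn = afn or fast
    relaxed_fpmath = fpmath_ulp is not None and fpmath_ulp >= 2.0
    return has_afn or relaxed_fpmath

def find_gaps():
    # Each tuple is (fast, afn, unsafe_fp_math, fpmath_ulp).
    gaps = []
    for fast, afn, unsafe, ulp in product(
        [False, True], [False, True], [False, True], [None, 1.0, 2.0]
    ):
        if fold_fires(fast, unsafe) and not lowering_matches_native(fast, afn, ulp):
            gaps.append((fast, afn, unsafe, ulp))
    return gaps

if __name__ == "__main__":
    for gap in find_gaps():
        print(gap)
```

Under these assumptions the only uncovered combinations are the ones where the fold is triggered solely by "unsafe-fp-math" on the caller, with neither afn nor relaxed fpmath on the call. The 'fast' case is covered in the model, but that still does not replace a test for it.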
https://github.com/llvm/llvm-project/pull/74197