[llvm] AMDGPU: Replace sqrt OpenCL libcalls with llvm.sqrt (PR #74197)

Thu Jan 4 07:27:23 PST 2024

arsenm wrote:

> Then if we are not relying on the attribute anymore, either afn or !fpmath >= 2.0 shall be set on a call site with the same list > of options. I am not sure if and why a call to @_Z4sqrtf would have any of these.

OpenCL by default / without -cl-fp32-correctly-rounded-divide-sqrt emits !fpmath 2.5 on these. Otherwise it's missing. afn is set by -fapprox-func, and any of the superset/alias flags like -cl-unsafe-math-optimizations/-cl-fast-relaxed-math/-ffast-math

> > nsz doesn't do anything for sqrt lowering.
> 
> I know it does not. Why not remove it?

Because it almost does something, but isn't quite enough to actually simplify the f64 lowering case. This way it's captured if somebody attempts the same thing later.

> 
> I will feel much more confident if this one is fixed from nsz nonsense and extra 'fast' test added, because who knows what fast will mean tomorrow? 

fast can only add more flags. We only need to test the minimum that we need today. Fast just makes the intention of the test less clear 

> Anyway, formally to have no regressions fast must resolve to a native op, so this test is formally needed.

The point of this change is that the libcall is a pointless wrapper. There is no codegen change. Fast or not is irrelevant, we're swapping out any recognized sqrt calls to the raw intrinsic, and excluding the odd case where we should not specially recognize the call.

All of the lowering cases are already tested. Any codegen logic is already captured in the variety of large sqrt lowering tests we already have. It's simply not useful to codegen any of these libcall recognition tests. An end-to-end test, which does include the library linking, would be useful but belongs in the device libs tests.

> Moreover, I wish it to be this change, so that 10 years from now git blame will point to this discussion and linked tests.

That's in all of the sqrt lowering patches 

https://github.com/llvm/llvm-project/pull/74197