[llvm] [AMDGPU] Disable atomic optimization of fadd/fsub with result (PR #96479)

Mon Jun 24 05:06:55 PDT 2024

jayfoad wrote:

> I thought everything worked as long as we used -0 as the fadd identity value for the initial value

As far as I know everything works for getting the correct result in memory. It's only the extra step to work out the per-lane result of the `atomicrmw` instruction that has this problem. And _maybe_ it's only wrong when `%y` is uniform -- I have not thought too much about the "scan" path when `%y` is divergent.
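
As a rough illustration of how the per-lane step can go wrong even when the value in memory is right, here is a small sketch. It assumes (this is not stated above) that the uniform path rebuilds each lane's return value as `old + y * lane_index`; the helper names are hypothetical and this is plain Python arithmetic on doubles, not the optimizer's actual code. It may or may not match the exact failure seen in the tests.

```python
import math


def is_negative_zero(x: float) -> bool:
    """True only for IEEE -0.0 (x == 0.0 alone cannot tell the two zeros apart)."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


def per_lane_result_uniform(old: float, y: float, lane_index: int) -> float:
    """Hypothetical reconstruction of the value an active lane gets back.

    Assumption: the lane's result is the old memory value plus the
    contributions of all lower-numbered active lanes, computed as
    old + y * lane_index.
    """
    return old + y * float(lane_index)


# -0.0 is an identity for fadd: adding it never changes the other operand,
# including the sign of a zero.
assert is_negative_zero(-0.0 + -0.0)
assert not is_negative_zero(0.0 + -0.0)

# +0.0 is not an identity: it turns -0.0 into +0.0.
assert not is_negative_zero(-0.0 + 0.0)

# The first active lane should get back exactly the old value. With the
# assumed formula it instead gets old + y * 0 = old + (+0.0), which loses
# the sign of a negative zero.
old, y = -0.0, 1.0
lane0 = per_lane_result_uniform(old, y, 0)
print(is_negative_zero(old), is_negative_zero(lane0))
```

Under that assumption, using -0 as the identity fixes the reduction that goes to memory, but the `y * 0` term for the first lane is still +0.0 (for positive `y`), so the returned value can differ from the true old value in the sign of zero.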

Anyway, I want to get this correct before worrying too much about reoptimising it. Currently it is causing some Vulkan CTS tests to fail.

https://github.com/llvm/llvm-project/pull/96479