[llvm] [AMDGPU] Fix code sequence for barrier start in GFX10+ CU Mode (PR #160501)
Pierre van Houtryve via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 1 03:06:32 PDT 2025
Pierre-vh wrote:
After thinking about this a bit more, I think you're both right. There's a bug. I think in the absence of waits, we could have the following situation:
```
Thread 0:
store atomic A relaxed
store atomic B release syncscope("workgroup")
```
Then another thread in the workgroup could see the store to B with no guarantee that the store to A has completed as well.
The store to A could be held up in another memory channel, for example.
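For reference, here is a minimal host-side sketch of the guarantee that the release store is supposed to provide, written in portable C++. It is an illustration under assumptions, not code from the PR: `run_message_passing` is a hypothetical helper, two `std::thread`s stand in for two waves of a workgroup, and C++ atomics have no equivalent of `syncscope("workgroup")`, so the scope here is the whole process. A reader that observes B through an acquire load must also observe the earlier store to A, which is exactly what missing waits would break on the hardware side.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper (not from the PR): runs the writer/reader pattern once
// and returns the value of A that the reader saw after observing B == 1.
int run_message_passing() {
    std::atomic<int> a{0};
    std::atomic<int> b{0};
    int seen_a = -1;

    std::thread writer([&] {
        a.store(1, std::memory_order_relaxed);  // store atomic A relaxed
        b.store(1, std::memory_order_release);  // store atomic B release
    });

    std::thread reader([&] {
        // Spin until the release store to B becomes visible.
        while (b.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        // The acquire load synchronizes with the release store, so the
        // earlier store to A is required to be visible at this point.
        seen_a = a.load(std::memory_order_relaxed);
    });

    writer.join();
    reader.join();

    // The reader always assigns seen_a before it exits.
    assert(seen_a != -1);
    return seen_a;
}
```

Under the C++ memory model, `run_message_passing()` always returns 1. The situation described above corresponds to a reader that could see B set while A still reads as 0.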
It would take some unfortunate timing to see that happen without barriers (which is why we never observed it until now), but as shown here, it can be observed when barriers are used.
We can fix it by using the same waits in both WGP and CU mode.
Does everyone agree with that?
https://github.com/llvm/llvm-project/pull/160501