[llvm] [AMDGPU] Fix code sequence for barrier start in GFX10+ CU Mode (PR #160501)

Pierre van Houtryve via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 1 03:06:32 PDT 2025


Pierre-vh wrote:

After thinking about this a bit more, I think you're both right. There's a bug. I think in the absence of waits, we could have the following situation:

```
Thread 0:
  store atomic A relaxed
  store atomic B release syncscope("workgroup")
```

Another thread in the workgroup could then observe the store to B with no guarantee that the store to A has completed as well.
The store to A could be held up in another memory channel, for example.
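
For reference, the guarantee at stake is the usual message-passing pattern. Below is a minimal C++ sketch of it, not the AMDGPU code sequence itself. It assumes the observing thread reads B with an acquire load, which the snippet above does not show. `std::atomic` has no notion of `syncscope("workgroup")`, so plain release/acquire stands in for it, and the function name `observe_a_after_b` is made up for illustration.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: runs the two-thread pattern once and returns the
// value of A that the reader saw after observing the store to B.
// Plain release/acquire stands in for syncscope("workgroup").
int observe_a_after_b() {
    std::atomic<int> A{0};
    std::atomic<int> B{0};

    std::thread writer([&] {
        A.store(1, std::memory_order_relaxed);  // store atomic A relaxed
        B.store(1, std::memory_order_release);  // store atomic B release
    });

    int seen_a = -1;
    std::thread reader([&] {
        // Assumed reader side: spin until the release store to B is visible.
        while (B.load(std::memory_order_acquire) == 0) {
            std::this_thread::yield();
        }
        // The acquire load synchronized with the release store, so the
        // earlier store to A must be visible here.
        seen_a = A.load(std::memory_order_relaxed);
    });

    writer.join();
    reader.join();
    return seen_a;
}
```

Under the C++ memory model this always returns 1. If the release on B does not wait for the earlier store to A to complete, the reader could see a stale A, which is the situation described above.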

It'd require some unfortunate timing to see that happen without barriers (which is why we never observed it until now), but as shown here, it can be observed when barriers are used.

We can fix it by using the same waits in CU mode as in WGP mode.
Does everyone agree with that?

https://github.com/llvm/llvm-project/pull/160501

