[llvm] [AMDGPU] Fix code sequence for barrier start in GFX10+ CU Mode (PR #160501)

Sameer Sahasrabuddhe via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 1 02:00:40 PDT 2025


ssahasra wrote:

> > Do you think we should instead pessimize all workgroup release fences in CU mode so they have a wait on storecnt?
> 
> Is it a pessimization? I don't think so. Isn't the example @perlfu gave offline evidence that if a release fence intends to fence global memory, then a storecnt wait is pretty much unavoidable?

I agree. It's a bug fix, not a pessimization. On the other hand, the programmer may know that a certain part of the program only cares about synchronization within the workgroup. For such a program, opting out of transitivity is an optimization, and we need a way to express that in LLVM IR.

https://github.com/llvm/llvm-project/pull/160501

