mareko added a comment.

We are going to insert amdgcn.s.waitcnt instead of a fence because fences wait for memory stores, which we don't want.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120544/new/

https://reviews.llvm.org/D120544