[PATCH] D120544: [AMDGPU] Omit unnecessary waitcnt before barriers
Austin Kerbow via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon May 2 17:54:33 PDT 2022
kerbowa added a comment.
In D120544#3487040 <https://reviews.llvm.org/D120544#3487040>, @mareko wrote:
> We are going to insert amdgcn.s.waitcnt instead of a fence because fences wait for memory stores, which we don't want.
Without this patch, the compiler will wait for everything before a barrier, so I'm not sure how you would differentiate between loads and stores.
> What does the release fence add? Waiting for only LDS or everything?
It depends on the scope of the fence and the HW. See: https://llvm.org/docs/AMDGPUUsage.html#amdgpu-amdhsa-memory-model
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D120544/new/
https://reviews.llvm.org/D120544