[llvm] AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (PR #82996)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 28 06:23:59 PST 2024
================
@@ -589,6 +593,16 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
   bool setScope(const MachineBasicBlock::iterator MI,
                 AMDGPU::CPol::CPol Value) const;
+  // Stores with system scope (SCOPE_SYS) need to wait for:
+  // - loads or atomics(returning) - wait for {LOAD|SAMPLE|BVH|KM}CNT==0
+  // - non-returning-atomics - wait for STORECNT==0
+  // TODO: SIInsertWaitcnts will not always be able to remove STORECNT waits
+  // since it does not distinguish atomics-with-return from regular stores.
+  // There is no need to wait if memory is cached (mtype != UC).
+  // For example shader-visible memory is cached.
----------------
jayfoad wrote:
I don't understand the statement that "shader-visible memory is cached". Surely we are compiling a shader, so any memory the shader refers to is "shader-visible", so why do we need to worry about uncached memory?
https://github.com/llvm/llvm-project/pull/82996