[llvm] [AMDGPU][SIMemoryLegalizer] Fix order of GL0/1_INV on GFX10/11 (PR #81450)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 12 00:32:35 PST 2024
llvmbot wrote:
@llvm/pr-subscribers-backend-amdgpu
Author: Pierre van Houtryve (Pierre-vh)
<details>
<summary>Changes</summary>
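Fixes SWDEV-443292.
On GFX10/11, the acquire sequence for agent and system scope now emits buffer_gl1_inv before buffer_gl0_inv, so the outer cache is invalidated first. The memory model code sequence tables in AMDGPUUsage.rst are updated to match.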
---
Full diff: https://github.com/llvm/llvm-project/pull/81450.diff
2 Files Affected:
- (modified) llvm/docs/AMDGPUUsage.rst (+16-16)
- (modified) llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp (+4-1)
``````````diff
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index ebc7fda804207a..7c78c0907be5e1 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -12139,8 +12139,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
before invalidating
the caches.
- 3. buffer_gl0_inv;
- buffer_gl1_inv
+ 3. buffer_gl1_inv;
+ buffer_gl0_inv
- Must happen before
any following
@@ -12169,8 +12169,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
before invalidating
the caches.
- 3. buffer_gl0_inv;
- buffer_gl1_inv
+ 3. buffer_gl1_inv;
+ buffer_gl0_inv
- Must happen before
any following
@@ -12276,8 +12276,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
invalidating the
caches.
- 3. buffer_gl0_inv;
- buffer_gl1_inv
+ 3. buffer_gl1_inv;
+ buffer_gl0_inv
- Must happen before
any following
@@ -12307,8 +12307,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
invalidating the
caches.
- 3. buffer_gl0_inv;
- buffer_gl1_inv
+ 3. buffer_gl1_inv;
+ buffer_gl0_inv
- Must happen before
any following
@@ -12503,8 +12503,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
the
fence-paired-atomic.
- 2. buffer_gl0_inv;
- buffer_gl1_inv
+ 2. buffer_gl1_inv;
+ buffer_gl0_inv
- Must happen before any
following global/generic
@@ -13217,8 +13217,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
invalidating the
caches.
- 4. buffer_gl0_inv;
- buffer_gl1_inv
+ 4. buffer_gl1_inv;
+ buffer_gl0_inv
- Must happen before
any following
@@ -13292,8 +13292,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
invalidating the
caches.
- 4. buffer_gl0_inv;
- buffer_gl1_inv
+ 4. buffer_gl1_inv;
+ buffer_gl0_inv
- Must happen before
any following
@@ -13520,8 +13520,8 @@ table :ref:`amdgpu-amdhsa-memory-model-code-sequences-gfx10-gfx11-table`.
requirements of
release.
- 2. buffer_gl0_inv;
- buffer_gl1_inv
+ 2. buffer_gl1_inv;
+ buffer_gl0_inv
- Must happen before
any following
diff --git a/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp b/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
index 84b9330ef9633e..f62e808b33e42b 100644
--- a/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
+++ b/llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
@@ -2030,8 +2030,11 @@ bool SIGfx10CacheControl::insertAcquire(MachineBasicBlock::iterator &MI,
switch (Scope) {
case SIAtomicScope::SYSTEM:
case SIAtomicScope::AGENT:
- BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL0_INV));
+ // The order of invalidates matters here. We must invalidate "outer in"
+ // so L1 -> L0 to avoid L0 pulling in stale data from L1 when it is
+ // invalidated.
BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL1_INV));
+ BuildMI(MBB, MI, DL, TII->get(AMDGPU::BUFFER_GL0_INV));
Changed = true;
break;
case SIAtomicScope::WORKGROUP:
``````````
</details>
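The "outer in" reasoning in the new code comment can be illustrated with a toy model. The sketch below is not the hardware's behavior and not part of the patch: `ToyHierarchy` and `observeAfterAcquire` are hypothetical names, a single memory word stands in for all of memory, and the load issued between the two invalidates is an assumed interleaving (for example from another wave sharing the same caches). It requires C++17 for `std::optional`.

```cpp
#include <optional>

// Hypothetical toy model: one memory word cached at two levels.
// GL1 is the outer cache, GL0 the inner cache.
struct ToyHierarchy {
  int memory = 0;
  std::optional<int> gl1; // outer cache line, empty when invalid
  std::optional<int> gl0; // inner cache line, empty when invalid

  // On a miss, the inner level is filled from the outer level,
  // and the outer level is filled from memory.
  int load() {
    if (!gl0) {
      if (!gl1)
        gl1 = memory;
      gl0 = gl1;
    }
    return *gl0;
  }

  void invalidateGL0() { gl0.reset(); }
  void invalidateGL1() { gl1.reset(); }
};

// Returns the value a load observes after the two invalidates, assuming
// one extra load slips in between them.
int observeAfterAcquire(bool outerFirst) {
  ToyHierarchy h;
  h.load();      // warm both caches with the old value 0
  h.memory = 42; // another agent updates memory behind the caches

  if (outerFirst) {
    h.invalidateGL1();
    h.load();    // interleaved load hits GL0 and does not refill GL1
    h.invalidateGL0();
  } else {
    h.invalidateGL0();
    h.load();    // interleaved load refills GL0 from the stale GL1
    h.invalidateGL1();
  }
  return h.load();
}
```

In this model, `observeAfterAcquire(true)` returns the new value because both levels are empty when the final load runs, while `observeAfterAcquire(false)` returns the old value because GL0 was refilled from GL1 before GL1 was invalidated.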
https://github.com/llvm/llvm-project/pull/81450