[PATCH] D114351: [AMDGPU] Add SIMemoryLegalizer comments to clarify bit usage (NFC)

Mon Nov 22 10:45:49 PST 2021

t-tye added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:843
+  /// bypassed, and the GLC bit is instead used to indicate if they are
+  /// return or no-return.

----------------
Please add back:

  /// There is no bypass control for the L2 cache at the isa level.

The modified comment is only explaining the L1 cache and both caches are involved for system scope.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:865-866
   if (IsVolatile) {
+    // Request L1 MISS_EVICT for load instructions.
+    // Stores will automatically propagate to L2 (write-combine).
     if (Op == SIMemOp::LOAD)
----------------
How about:

    // Request L1 cache policy to be MISS_EVICT for load instructions and MISS_LRU for store instructions. Note that there is no L2 cache bypass policy at the isa level.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:882-883
   if (IsNonTemporal) {
-    // Request L1 MISS_EVICT and L2 STREAM for load and store instructions.
+    // Request L1 MISS_EVICT for load and store instructions. This makes
+    // L1 write-through for store instructions.
     Changed |= enableGLCBit(MI);
----------------
I had not remembered that the GLC and SLC bits are used together to set the L1 cache policy. So how about:

    // Setting GLC and SLC both to 1 sets the L1 cache policy to MISS_EVICT for both loads and stores, and the L2 cache policy to STREAM.

and delete the comment below.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1106
     case SIAtomicScope::AGENT:
+      // Request L1 MISS_EVICT.
       Changed |= enableGLCBit(MI);
----------------
MISS_EVICT is a policy that only applies when both SLC and GLC are set. Here only GLC is being set which specifies MISS_LRU and L2 LRU. How about:

  // Set the L1 cache policy to MISS_LRU. Note that there is no L2 cache bypass policy at the isa level.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1216-1217
   if (IsVolatile) {
+    // Request L1 MISS_EVICT for load instructions.
+    // Stores will automatically propagate to L2 (write-combine).
     if (Op == SIMemOp::LOAD)
----------------
How about:

// Request L1 cache policy to be MISS_EVICT for load instructions and MISS_LRU for store instructions. Note that there is no L2 cache bypass policy at the isa level.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1233-1236
+    // Request L1 MISS_EVICT for load and store instructions. This makes
+    // L1 write-through for store instructions.
     Changed |= enableGLCBit(MI);
+    // Request L2 STREAM.
----------------
How about:

  // Setting GLC and SLC both to 1 sets the L1 cache policy to MISS_EVICT for both loads and stores, and the L2 cache policy to STREAM.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1397
     case SIAtomicScope::AGENT:
+      // Request L0 (GLC) and L1 (DLC) MISS_EVICT.
       Changed |= enableGLCBit(MI);
----------------
How about:

  // Set the L0 and L1 cache policies to MISS_EVICT. Note that there is no L2 cache bypass policy at the isa level.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1446-1448
+    // Request L0 (GLC) and L1 (DLC) MISS_EVICT for load instructions.
+    // Stores will automatically propagate to L2 with default policy for L0
+    // being MISS_LRU (write-combine) and L1 always bypassed.
----------------
How about:

  // Request L0 and L1 cache policy to be MISS_EVICT for load instructions and MISS_LRU for store instructions. Note that there is no L2 cache coherent bypass policy at the isa level.

================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1465-1467
+    // Request L0 and L1 HIT_EVICT for load instructions, and L2 STREAM for
+    // load and store instructions. L0 will still be MISS_LRU for store
+    // instructions unless GLC is set elsewhere.
----------------
It appears that this should be setting GLC=1 for stores so that L0 will be HIT_EVICT instead of MISS_EVICT. This must not be done for loads as that would make Lo MISS_EVICT. How about:

// For loads setting GLC to 1 sets the L0 and L1 cache policy to HIT_EVICT and the L2 cache policy to STREAM. For stores setting GLC and SLC both to 1 sets the L0 and L1 cache policy to MISS_EVICT and the L2 cache policy to STREAM.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114351/new/

https://reviews.llvm.org/D114351