[PATCH] D114351: [AMDGPU] Add SIMemoryLegalizer comments to clarify bit usage (NFC)
Tony Tye via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 22 10:45:49 PST 2021
t-tye added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:843
+ /// bypassed, and the GLC bit is instead used to indicate if they are
+ /// return or no-return.
----------------
Please add back:
/// There is no bypass control for the L2 cache at the isa level.
The modified comment is only explaining the L1 cache and both caches are involved for system scope.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:865-866
if (IsVolatile) {
+ // Request L1 MISS_EVICT for load instructions.
+ // Stores will automatically propagate to L2 (write-combine).
if (Op == SIMemOp::LOAD)
----------------
How about:
// Request L1 cache policy to be MISS_EVICT for load instructions and MISS_LRU for store instructions. Note that there is no L2 cache bypass policy at the isa level.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:882-883
if (IsNonTemporal) {
- // Request L1 MISS_EVICT and L2 STREAM for load and store instructions.
+ // Request L1 MISS_EVICT for load and store instructions. This makes
+ // L1 write-through for store instructions.
Changed |= enableGLCBit(MI);
----------------
I had not remembered that the GLC and SLC bits are used together to set the L1 cache policy. So how about:
// Setting GLC and SLC both to 1 sets the L1 cache policy to MISS_EVICT for both loads and stores, and the L2 cache policy to STREAM.
and delete the comment below.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1106
case SIAtomicScope::AGENT:
+ // Request L1 MISS_EVICT.
Changed |= enableGLCBit(MI);
----------------
MISS_EVICT is a policy that only applies when both SLC and GLC are set. Here only GLC is being set which specifies MISS_LRU and L2 LRU. How about:
// Set the L1 cache policy to MISS_LRU. Note that there is no L2 cache bypass policy at the isa level.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1216-1217
if (IsVolatile) {
+ // Request L1 MISS_EVICT for load instructions.
+ // Stores will automatically propagate to L2 (write-combine).
if (Op == SIMemOp::LOAD)
----------------
How about:
// Request L1 cache policy to be MISS_EVICT for load instructions and MISS_LRU for store instructions. Note that there is no L2 cache bypass policy at the isa level.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1233-1236
+ // Request L1 MISS_EVICT for load and store instructions. This makes
+ // L1 write-through for store instructions.
Changed |= enableGLCBit(MI);
+ // Request L2 STREAM.
----------------
How about:
// Setting GLC and SLC both to 1 sets the L1 cache policy to MISS_EVICT for both loads and stores, and the L2 cache policy to STREAM.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1397
case SIAtomicScope::AGENT:
+ // Request L0 (GLC) and L1 (DLC) MISS_EVICT.
Changed |= enableGLCBit(MI);
----------------
How about:
// Set the L0 and L1 cache policies to MISS_EVICT. Note that there is no L2 cache bypass policy at the isa level.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1446-1448
+ // Request L0 (GLC) and L1 (DLC) MISS_EVICT for load instructions.
+ // Stores will automatically propagate to L2 with default policy for L0
+ // being MISS_LRU (write-combine) and L1 always bypassed.
----------------
How about:
// Request L0 and L1 cache policy to be MISS_EVICT for load instructions and MISS_LRU for store instructions. Note that there is no L2 cache coherent bypass policy at the isa level.
================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1465-1467
+ // Request L0 and L1 HIT_EVICT for load instructions, and L2 STREAM for
+ // load and store instructions. L0 will still be MISS_LRU for store
+ // instructions unless GLC is set elsewhere.
----------------
It appears that this should be setting GLC=1 for stores so that L0 will be HIT_EVICT instead of MISS_EVICT. This must not be done for loads as that would make Lo MISS_EVICT. How about:
// For loads setting GLC to 1 sets the L0 and L1 cache policy to HIT_EVICT and the L2 cache policy to STREAM. For stores setting GLC and SLC both to 1 sets the L0 and L1 cache policy to MISS_EVICT and the L2 cache policy to STREAM.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D114351/new/
https://reviews.llvm.org/D114351
More information about the llvm-commits
mailing list