[PATCH] D114351: [AMDGPU] Add SIMemoryLegalizer comments to clarify bit usage (NFC)

Tony Tye via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 23 00:16:39 PST 2021


t-tye added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:843
+  /// bypassed, and the GLC bit is instead used to indicate if they are
+  /// return or no-return.
 
----------------
critson wrote:
> t-tye wrote:
> > Please add back:
> > 
> >   /// There is no bypass control for the L2 cache at the isa level.
> > 
> > The modified comment is only explaining the L1 cache and both caches are involved for system scope.
> I deleted that text because there is a bypass for L2 stores and atomics on GFX10: SLC=0 DLC=1.
> I can put it back but only contextualised for targets before GFX10?
> (And the same for all the similar references in comments below.)
In GFX10 the bypass is only available for stores and not loads, and it is not coherent, so it cannot be used anyway. That is why I added the word "coherent" in one of the comments below, so we should probably do the same here.


================
Comment at: llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp:1465-1467
+    // Request L0 and L1 HIT_EVICT for load instructions, and L2 STREAM for
+    // load and store instructions. L0 will still be MISS_LRU for store
+    // instructions unless GLC is set elsewhere.
----------------
critson wrote:
> t-tye wrote:
> > It appears that this should be setting GLC=1 for stores so that L0 will be HIT_EVICT instead of MISS_EVICT. This must not be done for loads, as that would make L0 MISS_EVICT. How about:
> > 
> > // For loads setting GLC to 1 sets the L0 and L1 cache policy to HIT_EVICT and the L2 cache policy to STREAM. For stores setting GLC and SLC both to 1 sets the L0 and L1 cache policy to MISS_EVICT and the L2 cache policy to STREAM.
> Do you have MISS_EVICT and HIT_EVICT flipped in your description?
> 
> Do you mean:
> // For loads setting **SLC** to 1 sets the L0 and L1 cache policy to HIT_EVICT and the L2 cache policy to STREAM. For stores setting GLC and SLC both to 1 sets the **L0 cache policy** to MISS_EVICT and the L2 cache policy to STREAM. **L1 is always bypassed for stores.**
> 
> I can add the GLC bit for stores and this ceases to be NFC.
I believe I have it right according to the hardware GFX10 memory model spec.

  // For loads setting SLC to 1 sets the L0 and L1 cache policy to HIT_EVICT and the L2 cache policy to STREAM. For stores setting GLC and SLC both to 1 sets the L0 and L1 cache policy to MISS_EVICT and the L2 cache policy to STREAM.

We have to state the policy for L1 too even though the hardware documentation does not state it. The L1 MUST be evict or a subsequent load could see stale data.

Yes, this change ceases to be NFC and will need thorough testing.
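
To make the proposed comment easier to check against the spec, here is a minimal sketch of the bit-to-policy mapping it describes. This is not code from the patch or from SIMemoryLegalizer.cpp: `MemOp`, `Policy`, `CachePolicies` and `describedPolicies` are hypothetical names used only for illustration. It also assumes GLC=0 for the load case, since setting GLC on loads is said above to change the L0 policy. Bit combinations the comment does not cover return an empty result rather than a guess.

```cpp
#include <optional>

// Hypothetical types for illustration only; these are not LLVM identifiers.
enum class MemOp { Load, Store };
enum class Policy { HitEvict, MissEvict, Stream };

struct CachePolicies {
  Policy L0;
  Policy L1;
  Policy L2;
};

// Returns the cache policies stated in the proposed comment, or std::nullopt
// for any GLC/SLC combination that the comment does not describe.
inline std::optional<CachePolicies> describedPolicies(MemOp Op, bool GLC,
                                                      bool SLC) {
  // Loads with SLC=1 (GLC=0 is an assumption here): L0 and L1 HIT_EVICT,
  // L2 STREAM.
  if (Op == MemOp::Load && !GLC && SLC)
    return CachePolicies{Policy::HitEvict, Policy::HitEvict, Policy::Stream};
  // Stores with GLC=1 and SLC=1: L0 and L1 MISS_EVICT, L2 STREAM.
  if (Op == MemOp::Store && GLC && SLC)
    return CachePolicies{Policy::MissEvict, Policy::MissEvict, Policy::Stream};
  return std::nullopt;
}
```

For example, `describedPolicies(MemOp::Store, true, true)` yields MISS_EVICT for both L0 and L1, which matches the requirement above that L1 be an evict policy for stores.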


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114351/new/

https://reviews.llvm.org/D114351


