[llvm] [AMDGPU] Enable unaligned scratch accesses (PR #110219)

Fabian Ritter via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 30 05:28:09 PDT 2024


================
@@ -1831,26 +1831,17 @@ bool SITargetLowering::allowsMisalignedMemoryAccessesImpl(
            Subtarget->hasUnalignedDSAccessEnabled();
   }
 
-  if (AddrSpace == AMDGPUAS::PRIVATE_ADDRESS) {
-    bool AlignedBy4 = Alignment >= Align(4);
-    if (IsFast)
-      *IsFast = AlignedBy4;
-
-    return AlignedBy4 ||
-           Subtarget->enableFlatScratch() ||
-           Subtarget->hasUnalignedScratchAccess();
-  }
-
   // FIXME: We have to be conservative here and assume that flat operations
   // will access scratch.  If we had access to the IR function, then we
   // could determine if any private memory was used in the function.
-  if (AddrSpace == AMDGPUAS::FLAT_ADDRESS &&
-      !Subtarget->hasUnalignedScratchAccess()) {
+  if (AddrSpace == AMDGPUAS::PRIVATE_ADDRESS ||
+      AddrSpace == AMDGPUAS::FLAT_ADDRESS) {
     bool AlignedBy4 = Alignment >= Align(4);
     if (IsFast)
       *IsFast = AlignedBy4;
 
-    return AlignedBy4;
+    return AlignedBy4 || Subtarget->enableFlatScratch() ||
----------------
ritter-x2a wrote:

Note that current trunk does not do that; see the old [line 1840](https://github.com/llvm/llvm-project/pull/110219/files#diff-e78d2fbd64648d787707fd3d4e7e5b5d2f00fb9c09972937718a11237933c597L1840). Removing `Subtarget->enableFlatScratch()` from the OR would make code generation more pessimistic for targets where unaligned scratch access is not explicitly enabled but flat scratch is, e.g., gfx12. Should we change that anyway?
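
To make the trade-off concrete, here is a minimal standalone sketch of the two candidate predicates. It is not the actual LLVM code: `MockSubtarget` and both free functions are hypothetical stand-ins that only mirror the boolean structure of the hunk above, and the "gfx12-like" configuration simply encodes the case described in this comment (flat scratch enabled, unaligned scratch access not explicitly enabled).

```cpp
// Hypothetical stand-in for the two subtarget queries used in the hunk.
// The real queries live on GCNSubtarget; this struct is illustration only.
struct MockSubtarget {
  bool FlatScratch;      // models Subtarget->enableFlatScratch()
  bool UnalignedScratch; // models Subtarget->hasUnalignedScratchAccess()

  bool enableFlatScratch() const { return FlatScratch; }
  bool hasUnalignedScratchAccess() const { return UnalignedScratch; }
};

// Mirrors the OR as it stands in the old PRIVATE_ADDRESS branch:
// flat scratch alone is enough to allow a misaligned access.
bool allowsWithFlatScratchTerm(const MockSubtarget &ST, unsigned AlignBytes) {
  bool AlignedBy4 = AlignBytes >= 4;
  return AlignedBy4 || ST.enableFlatScratch() ||
         ST.hasUnalignedScratchAccess();
}

// Variant with enableFlatScratch() dropped from the OR (assumed shape of
// the alternative under discussion).
bool allowsWithoutFlatScratchTerm(const MockSubtarget &ST,
                                  unsigned AlignBytes) {
  bool AlignedBy4 = AlignBytes >= 4;
  return AlignedBy4 || ST.hasUnalignedScratchAccess();
}
```

With a gfx12-like `MockSubtarget{/*FlatScratch=*/true, /*UnalignedScratch=*/false}` and a 1-byte-aligned access, the first predicate allows the access and the second rejects it; for accesses aligned to 4 bytes or more the two agree.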

https://github.com/llvm/llvm-project/pull/110219