[llvm] [AMDGPU] Support merging 16-bit and 8-bit TBUFFER load/store instruction (PR #145078)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 20 01:46:45 PDT 2025


================
@@ -1049,24 +1056,51 @@ bool SILoadStoreOptimizer::offsetsCanBeCombined(CombineInfo &CI,
 
     const llvm::AMDGPU::GcnBufferFormatInfo *Info0 =
         llvm::AMDGPU::getGcnBufferFormatInfo(CI.Format, STI);
-    if (!Info0)
-      return false;
     const llvm::AMDGPU::GcnBufferFormatInfo *Info1 =
         llvm::AMDGPU::getGcnBufferFormatInfo(Paired.Format, STI);
-    if (!Info1)
-      return false;
 
     if (Info0->BitsPerComp != Info1->BitsPerComp ||
         Info0->NumFormat != Info1->NumFormat)
       return false;
 
-    // TODO: Should be possible to support more formats, but if format loads
-    // are not dword-aligned, the merged load might not be valid.
-    if (Info0->BitsPerComp != 32)
+    // For 8-bit or 16-bit formats there is no 3-component variant.
+    // If NumCombinedComponents is 3, try the 4-component format and use XYZ.
+    // Example:
+    //   tbuffer_load_format_x + tbuffer_load_format_x + tbuffer_load_format_x
+    //   ==> tbuffer_load_format_xyz with format:[BUF_FMT_16_16_16_16_SNORM]
+    unsigned NumCombinedComponents = CI.Width + Paired.Width;
+    unsigned CombinedBufferFormat =
+        getBufferFormatWithCompCount(CI.Format, NumCombinedComponents, STI);
+    if (CombinedBufferFormat == 0 && NumCombinedComponents == 3 &&
+        CI.EltSize <= 2) {
+      unsigned TryFormat = getBufferFormatWithCompCount(CI.Format, 4, STI);
+      if (!TryFormat)
+        return false;
+      CombinedBufferFormat = TryFormat;
+      NumCombinedComponents = 4;
+    }
----------------
jayfoad wrote:

I meant:
```suggestion
    if (NumCombinedComponents == 3 && CI.EltSize <= 2)
      NumCombinedComponents = 4;
    unsigned CombinedBufferFormat =
        getBufferFormatWithCompCount(CI.Format, NumCombinedComponents, STI);
```
This is shorter, calls `getBufferFormatWithCompCount` only once, and uses exactly the same logic as `mergeTBufferLoadPair` and `mergeTBufferStorePair`.
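
To illustrate why the two orderings should agree, here is a minimal standalone sketch, not the LLVM code itself. It assumes that `EltSize` is the component size in bytes, that 8-bit and 16-bit formats exist only with 1, 2, or 4 components, and that a zero result is rejected by a later check in `offsetsCanBeCombined`. `lookupFormat`, `Combined`, `originalLogic`, `suggestedLogic`, and `versionsAgree` are hypothetical names, and the format IDs are made up.

```cpp
#include <cassert>

// Hypothetical stand-in for getBufferFormatWithCompCount. Returns a nonzero
// made-up format ID if a format with this component size and count exists in
// the assumed table, and 0 otherwise.
unsigned lookupFormat(unsigned EltSizeBytes, unsigned NumComps) {
  assert(NumComps >= 1 && "a merged access has at least one component");
  if (NumComps > 4)
    return 0;
  if (EltSizeBytes != 1 && EltSizeBytes != 2 && EltSizeBytes != 4)
    return 0;
  // Assumption from the patch comment: no 3-component 8-bit or 16-bit format.
  if (NumComps == 3 && EltSizeBytes <= 2)
    return 0;
  return EltSizeBytes * 10 + NumComps; // arbitrary nonzero ID
}

// Result of the format selection. Format == 0 means "cannot combine".
struct Combined {
  unsigned Format;
  unsigned NumComps;
};

// Mirrors the logic in the patch: look up first, then retry with 4 components.
Combined originalLogic(unsigned EltSize, unsigned Width0, unsigned Width1) {
  unsigned NumComps = Width0 + Width1;
  unsigned Format = lookupFormat(EltSize, NumComps);
  if (Format == 0 && NumComps == 3 && EltSize <= 2) {
    unsigned TryFormat = lookupFormat(EltSize, 4);
    if (!TryFormat)
      return {0, 0};
    Format = TryFormat;
    NumComps = 4;
  }
  if (!Format) // assumed later rejection of a zero format
    return {0, 0};
  return {Format, NumComps};
}

// Mirrors the suggestion: widen 3 to 4 up front, then look up once.
Combined suggestedLogic(unsigned EltSize, unsigned Width0, unsigned Width1) {
  unsigned NumComps = Width0 + Width1;
  if (NumComps == 3 && EltSize <= 2)
    NumComps = 4;
  unsigned Format = lookupFormat(EltSize, NumComps);
  if (!Format)
    return {0, 0};
  return {Format, NumComps};
}

// Compares both versions over every width pair for 1-, 2-, and 4-byte elements.
bool versionsAgree() {
  const unsigned EltSizes[] = {1, 2, 4};
  for (unsigned EltSize : EltSizes)
    for (unsigned W0 = 1; W0 <= 3; ++W0)
      for (unsigned W1 = 1; W1 <= 3; ++W1) {
        Combined A = originalLogic(EltSize, W0, W1);
        Combined B = suggestedLogic(EltSize, W0, W1);
        if (A.Format != B.Format || A.NumComps != B.NumComps)
          return false;
      }
  return true;
}
```

Under these assumptions the first lookup in the patch can never succeed for a 3-component 8-bit or 16-bit request, which is why hoisting the widening above the single lookup does not change the result.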

https://github.com/llvm/llvm-project/pull/145078

