[llvm] [AMDGPU] Support merging 16-bit and 8-bit TBUFFER load/store instruction (PR #145078)

Wed Jul 23 23:13:44 PDT 2025

================
@@ -839,8 +839,16 @@ void SILoadStoreOptimizer::CombineInfo::setMI(MachineBasicBlock::iterator MI,
     Offset = I->getOperand(OffsetIdx).getImm();
   }
 
-  if (InstClass == TBUFFER_LOAD || InstClass == TBUFFER_STORE)
+  if (InstClass == TBUFFER_LOAD || InstClass == TBUFFER_STORE) {
     Format = LSO.TII->getNamedOperand(*I, AMDGPU::OpName::format)->getImm();
+    const AMDGPU::GcnBufferFormatInfo *Info =
+        AMDGPU::getGcnBufferFormatInfo(Format, *LSO.STM);
+
+    // Use 2-byte element size if the tbuffer format is 16-bit.
+    // Use 1-byte element size if the tbuffer format is 8-bit.
+    if (Info)
----------------
harrisonGPU wrote:

Hi Jay, I tried changing `setMI` to return a `bool`, but that causes some `DS_Load` tests to fail. Also, if we want to check `getGcnBufferFormatInfo` before calling `setMI`, we would first need to check whether the instruction is a tbuffer load/store; otherwise the check could also affect `DS_Load` behavior.
From a correctness standpoint, I believe the current approach is still safe. Even if `Info` is null and the instruction is added to the mergeable list, we still check whether the format info is valid before merging. If it is null, the instruction won't be merged.
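
To illustrate the argument, here is a minimal standalone sketch of the two-stage check. It does not use the real LLVM types: `FormatInfo`, `lookupFormat`, `Candidate`, `collect`, and `canMerge` are hypothetical stand-ins, the format table is made up, and the 4-byte default element size is an assumption for illustration only.

```cpp
#include <cassert>

// Hypothetical stand-in for AMDGPU::GcnBufferFormatInfo.
struct FormatInfo {
  unsigned BitsPerComp;
};

// Hypothetical lookup: returns nullptr for formats it does not know,
// mirroring the case where getGcnBufferFormatInfo yields a null Info.
const FormatInfo *lookupFormat(unsigned Format) {
  static const FormatInfo Table[] = {{8}, {16}, {32}};
  return Format < 3 ? &Table[Format] : nullptr;
}

// Hypothetical, reduced version of the per-instruction merge record.
struct Candidate {
  unsigned Format;
  unsigned EltSize; // element size in bytes
};

// Stage 1 (collection): a null Info does not reject the candidate here.
// The element size simply keeps its default (assumed to be 4 bytes).
Candidate collect(unsigned Format) {
  Candidate C{Format, 4};
  if (const FormatInfo *Info = lookupFormat(Format))
    C.EltSize = Info->BitsPerComp / 8; // 16-bit -> 2 bytes, 8-bit -> 1 byte
  return C;
}

// Stage 2 (merge check): the format info is validated again, so a candidate
// whose Info was null during collection is never merged.
bool canMerge(const Candidate &A, const Candidate &B) {
  const FormatInfo *InfoA = lookupFormat(A.Format);
  const FormatInfo *InfoB = lookupFormat(B.Format);
  if (!InfoA || !InfoB)
    return false;
  return InfoA->BitsPerComp == InfoB->BitsPerComp;
}
```

With this shape, a candidate with an unknown format is still collected, but `canMerge` rejects it, which is the property the comment above relies on.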

https://github.com/llvm/llvm-project/pull/145078