[llvm] [AMDGPU] CodeGen for GFX12 8/16-bit SMEM loads (PR #77633)

Wed Jan 10 19:48:12 PST 2024

================
@@ -5894,6 +5894,56 @@ void SITargetLowering::ReplaceNodeResults(SDNode *N,
       }
       return;
     }
+    case Intrinsic::amdgcn_s_buffer_load: {
+      // Lower llvm.amdgcn.s.buffer.load.(i8, u8) intrinsics. First, we generate
+      // s_buffer_load_u8 for signed and unsigned load instructions. Next, DAG
+      // combiner tries to merge the s_buffer_load_u8 with a sext instruction
+      // (performSignExtendInRegCombine()) and it replaces s_buffer_load_u8 with
+      // s_buffer_load_i8.
+      assert(Subtarget->hasScalarSubwordLoads() &&
----------------
arsenm wrote:

Instead of asserting, this should just let the intrinsic pass through so that it fails to select.
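
A rough sketch of the shape this could take, as a toy model rather than the real SelectionDAG code: `ToySubtarget` and `replaceSBufferLoadResults` are hypothetical stand-ins, and only `hasScalarSubwordLoads()` is a name taken from the patch. The idea is that an unsupported subtarget returns early and leaves the results empty, so the original intrinsic node survives and is reported later as a selection failure.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the subtarget query used in the patch.
struct ToySubtarget {
  bool SubwordLoads;
  bool hasScalarSubwordLoads() const { return SubwordLoads; }
};

// Toy model of the ReplaceNodeResults case: append replacement results only
// when the subtarget supports sub-word scalar loads. Otherwise leave Results
// empty so the caller keeps the original node.
void replaceSBufferLoadResults(const ToySubtarget &ST,
                               std::vector<std::string> &Results) {
  if (!ST.hasScalarSubwordLoads())
    return; // No assert: the untouched intrinsic later fails to select.
  Results.push_back("s_buffer_load_u8");
}
```

With this shape, an unsupported target produces a normal "cannot select" diagnostic in all build modes instead of an assertion failure that only fires in assert-enabled builds.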

https://github.com/llvm/llvm-project/pull/77633