[PATCH] D144313: [AMDGPU] Improve the lowering of raw_buffer_load_{i8,i16} and struct_buffer_load_{i8,i16} intrinsics

Mon Feb 20 02:14:16 PST 2023

foad added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.cpp:399
+      SubwordBufferLoad->getOpcode() == AMDGPU::G_AMDGPU_BUFFER_LOAD_USHORT) {
+    // Generate a signed subword buffer load instruction using the arguments of
+    // the existing one.
----------------
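You need to check that the result of SubwordBufferLoad has no other uses. Otherwise the combine would either duplicate the load or change the value seen by the other users.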

Instead of creating a new instruction, you can modify it in place, using SubwordBufferLoad->setDesc to switch the opcode and SubwordBufferLoad->getOperand(0).setReg to redirect the result register.
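
To illustrate the shape of the suggestion, here is a minimal, self-contained sketch. It is a toy model, not LLVM code: Inst, Opcode, countUses, findDef and foldSextIntoLoad are all hypothetical stand-ins. In the real combine, the use check would presumably be something like MachineRegisterInfo::hasOneNonDBGUse, the opcode switch would be setDesc, and the result rewrite would be getOperand(0).setReg.

```cpp
#include <cstdint>
#include <vector>

// Toy stand-ins for MIR concepts. None of these types or names are LLVM's.
enum class Opcode { LoadUByte, LoadUShort, LoadSByte, LoadSShort, SextInReg, Other };

struct Inst {
  Opcode Op;
  int Def;               // virtual register defined by this instruction
  std::vector<int> Uses; // virtual registers read by this instruction
  std::int64_t Imm;      // for SextInReg: bit width to sign-extend from
  bool Erased;           // true once the instruction has been deleted
};

// Count the operands of live instructions that read Reg.
int countUses(const std::vector<Inst> &Insts, int Reg) {
  int N = 0;
  for (const Inst &I : Insts) {
    if (I.Erased)
      continue;
    for (int U : I.Uses)
      if (U == Reg)
        ++N;
  }
  return N;
}

// Return the live instruction that defines Reg, or nullptr if there is none.
Inst *findDef(std::vector<Inst> &Insts, int Reg) {
  for (Inst &I : Insts)
    if (!I.Erased && I.Def == Reg)
      return &I;
  return nullptr;
}

// Try to fold `Sext = sext_inreg Load, Width` into the load itself.
// Returns true if the instructions were rewritten.
bool foldSextIntoLoad(std::vector<Inst> &Insts, Inst &Sext) {
  if (Sext.Erased || Sext.Op != Opcode::SextInReg || Sext.Uses.size() != 1)
    return false;
  Inst *Load = findDef(Insts, Sext.Uses[0]);
  if (!Load)
    return false;

  // Only fold when the extension width matches the loaded width.
  Opcode SignedOp;
  if (Load->Op == Opcode::LoadUByte && Sext.Imm == 8)
    SignedOp = Opcode::LoadSByte;
  else if (Load->Op == Opcode::LoadUShort && Sext.Imm == 16)
    SignedOp = Opcode::LoadSShort;
  else
    return false;

  // The load's result must feed only this sext. Any other user still
  // expects the zero-extended value and would be broken by the rewrite.
  if (countUses(Insts, Load->Def) != 1)
    return false;

  Load->Op = SignedOp;   // plays the role of setDesc
  Load->Def = Sext.Def;  // plays the role of getOperand(0).setReg
  Sext.Erased = true;    // the sext is now redundant
  return true;
}
```

With the one-use check in place, no new instruction is built and the operands of the original load are reused as they are.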

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPostLegalizerCombiner.cpp:375-376
     return Helper.tryCombineShiftToUnmerge(MI, 32);
+  case TargetOpcode::G_SEXT_INREG:
+    return Helper.applyCombineSignExtendInReg(MI, B);
   }
----------------
Instead of calling applyCombineSignExtendInReg directly here, please declare the new combine as a rule in AMDGPUCombine.td so that it is called from the autogenerated tryCombineAll.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D144313