[PATCH] D144313: [AMDGPU] Improve the lowering of raw_buffer_load_{i8,i16} and struct_buffer_load_{i8,i16} intrinsics
Jay Foad via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 20 02:14:16 PST 2023
foad added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUCombinerHelper.cpp:399
+ SubwordBufferLoad->getOpcode() == AMDGPU::G_AMDGPU_BUFFER_LOAD_USHORT) {
+ // Generate a signed subword buffer load instruction using the arguments of
+ // the existing one.
----------------
You need to check that SubwordBufferLoad has no other uses.
Instead of creating a new instruction, you can modify the existing one in place using SubwordBufferLoad->setDesc and SubwordBufferLoad->getOperand(0).setReg.
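To illustrate the suggestion, here is a minimal, self-contained sketch of the intended logic. It is a toy model, not the LLVM API: `Opc`, `Inst`, `countUses` and `foldSextIntoLoad` are hypothetical names invented for this example. In the real combine, the opcode change would go through SubwordBufferLoad->setDesc and the result register change through SubwordBufferLoad->getOperand(0).setReg, and the use check would presumably be something like MachineRegisterInfo::hasOneNonDBGUse.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for machine instructions; none of these are LLVM types.
enum class Opc { LoadUByte, LoadUShort, LoadSByte, LoadSShort, SextInReg, Other };

struct Inst {
  Opc Opcode;
  int Def;               // virtual register defined by this instruction
  std::vector<int> Uses; // virtual registers read by this instruction
  int Width;             // for SextInReg: number of low bits to extend from
};

// Count how many operands across all instructions read Reg.
static std::size_t countUses(const std::vector<Inst> &Insts, int Reg) {
  std::size_t N = 0;
  for (const Inst &I : Insts)
    for (int U : I.Uses)
      if (U == Reg)
        ++N;
  return N;
}

// Try to fold the sign extension at Insts[SextIdx] into the unsigned
// sub-word load that defines its source. Returns true if the fold happened.
static bool foldSextIntoLoad(std::vector<Inst> &Insts, std::size_t SextIdx) {
  const Inst Sext = Insts[SextIdx]; // copy, since Insts is mutated below
  if (Sext.Opcode != Opc::SextInReg || Sext.Uses.size() != 1)
    return false;
  const int Src = Sext.Uses[0];

  for (Inst &Load : Insts) {
    if (Load.Def != Src)
      continue;
    const bool IsByte = Load.Opcode == Opc::LoadUByte && Sext.Width == 8;
    const bool IsShort = Load.Opcode == Opc::LoadUShort && Sext.Width == 16;
    if (!IsByte && !IsShort)
      return false;
    // The load result must feed only the sign extension; any other user
    // still expects the zero-extended value.
    if (countUses(Insts, Src) != 1)
      return false;
    // Rewrite the load in place: switch to the signed opcode (setDesc in
    // LLVM) and make it define the extension's result register (setReg).
    Load.Opcode = IsByte ? Opc::LoadSByte : Opc::LoadSShort;
    Load.Def = Sext.Def;
    // The sign extension is now redundant.
    Insts.erase(Insts.begin() + static_cast<std::ptrdiff_t>(SextIdx));
    return true;
  }
  return false;
}
```

With a single user, the load becomes the signed variant and takes over the extension's destination register. With a second user of the load result, the function leaves everything untouched.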
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUPostLegalizerCombiner.cpp:375-376
return Helper.tryCombineShiftToUnmerge(MI, 32);
+ case TargetOpcode::G_SEXT_INREG:
+ return Helper.applyCombineSignExtendInReg(MI, B);
}
----------------
Instead of calling applyCombineSignExtendInReg here, please declare the new combine in AMDGPUCombine.td so it will be called from the autogenerated tryCombineAll.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144313/new/
https://reviews.llvm.org/D144313