[llvm] [AMDGPU][True16][CodeGen] build_vector pattern in true16 (PR #118904)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 6 06:55:04 PST 2024
================
@@ -700,9 +700,22 @@ bool AMDGPUInstructionSelector::selectG_BUILD_VECTOR(MachineInstr &MI) const {
return true;
// TODO: This should probably be a combine somewhere
- // (build_vector $src0, undef) -> copy $src0
MachineInstr *Src1Def = getDefIgnoringCopies(Src1, *MRI);
if (Src1Def->getOpcode() == AMDGPU::G_IMPLICIT_DEF) {
+ if (Subtarget->useRealTrue16Insts() && IsVector) {
+ // (vecTy (DivergentBinFrag<build_vector> Ty:$src0, (Ty undef))),
+ // -> (vecTy (INSERT_SUBREG (IMPLICIT_DEF), VGPR_16:$src0, lo16))
+ Register Undef = MRI->createVirtualRegister(&AMDGPU::VGPR_32RegClass);
+ BuildMI(*BB, &MI, DL, TII.get(AMDGPU::IMPLICIT_DEF), Undef);
+ BuildMI(*BB, &MI, DL, TII.get(TargetOpcode::INSERT_SUBREG), Dst)
+ .addReg(Undef)
+ .addReg(Src0)
+ .addImm(AMDGPU::lo16);
----------------
arsenm wrote:
The TableGen pattern uses REG_SEQUENCE, but this manual selection uses INSERT_SUBREG. The two should be consistent.
Also, the pattern should already work. Why do you need the manual selection?
https://github.com/llvm/llvm-project/pull/118904