[llvm] [AMDGPU] Generate COPY for each use-constraint instead of constraining the register class (PR #182104)
Chinmay Deshpande via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 18 11:04:10 PST 2026
================
@@ -8354,10 +8334,33 @@ void SIInstrInfo::moveToVALUImpl(SIInstrWorklist &Worklist,
llvm_unreachable("failed to constrain register");
Inst.eraseFromParent();
- // Legalize t16 operand since replaceReg is called after addUsersToVALU
- for (MachineOperand &MO :
+
+ const TargetRegisterClass *NewDstRegRC = MRI.getRegClass(NewDstReg);
+ for (MachineOperand &UseMO :
make_early_inc_range(MRI.use_operands(NewDstReg))) {
- legalizeOperandsVALUt16(*MO.getParent(), MRI);
+ MachineInstr &UseMI = *UseMO.getParent();
+
+ // Legalize t16 operands since replaceReg is called after
+ // addUsersToVALU.
+ legalizeOperandsVALUt16(UseMI, MRI);
+
+ // If a user operand requires a narrower register class than
+ // NewDstReg (e.g., VGPR_32_Lo256 for WMMA scale operands), emit
+ // a COPY to a new register with the correct class.
+ unsigned OpIdx = UseMI.getOperandNo(&UseMO);
+ const TargetRegisterClass *OpRC =
+ getRegClass(UseMI.getDesc(), OpIdx);
----------------
chinmaydd wrote:
```suggestion
const TargetRegisterClass *OpRC = getRegClass(UseMI.getDesc(), OpIdx);
```
https://github.com/llvm/llvm-project/pull/182104