[llvm] [AMDGPU] SIPeepholeSDWA: Handle V_CNDMASK_B32_e64 (PR #137930)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri May 2 07:41:37 PDT 2025
================
@@ -1061,6 +1063,62 @@ void SIPeepholeSDWA::pseudoOpConvertToVOP2(MachineInstr &MI,
   MISucc.substituteRegister(CarryIn->getReg(), TRI->getVCC(), 0, *TRI);
 }
+/// Try to convert an \p MI in VOP3 which takes an src2 carry-in
+/// operand into the corresponding VOP2 form which expects the
+/// argument in VCC. To this end, either try to change the definition
+/// of the carry-in operand to write to VCC or add an instruction that
+/// copies from the carry-in to VCC. The conversion will only be
+/// applied if \p MI can be shrunk to VOP2 and if VCC can be proven to
+/// be dead before \p MI.
+void SIPeepholeSDWA::convertVcndmaskToVOP2(MachineInstr &MI,
+                                           const GCNSubtarget &ST) const {
+  assert(MI.getOpcode() == AMDGPU::V_CNDMASK_B32_e64);
+
+  LLVM_DEBUG(dbgs() << "Attempting VOP2 conversion: " << MI);
+  if (!TII->canShrink(MI, *MRI)) {
+    LLVM_DEBUG(dbgs() << "Cannot shrink instruction\n");
+    return;
+  }
+
+  const MachineOperand &CarryIn =
+      *TII->getNamedOperand(MI, AMDGPU::OpName::src2);
+  Register CarryReg = CarryIn.getReg();
+  MachineInstr *CarryDef = MRI->getVRegDef(CarryReg);
+  if (!CarryDef) {
+    LLVM_DEBUG(dbgs() << "Missing carry-in operand definition\n");
+    return;
+  }
+
+  // Make sure VCC or its subregs are dead before MI.
+  MCRegister Vcc = TRI->getVCC();
+  MachineBasicBlock &MBB = *MI.getParent();
+  MachineBasicBlock::LivenessQueryResult Liveness =
+      MBB.computeRegisterLiveness(TRI, Vcc, MI);
+  if (Liveness != MachineBasicBlock::LQR_Dead) {
+    LLVM_DEBUG(dbgs() << "VCC not known to be dead before instruction\n");
+    return;
+  }
+
+  // Change destination of compare instruction to VCC
+  // or copy to VCC if carry-in is not a compare inst.
+  if (CarryDef->isCompare() && TII->isVOP3(*CarryDef) &&
+      MRI->hasOneUse(CarryIn.getReg()))
----------------
arsenm wrote:
This will miss the v_cmp_class case; it's not a simple compare. This also isn't shrinking the compare to the VOP2 form. Does it work well enough to just always do the copy to VCC and let the other passes take care of this?
https://github.com/llvm/llvm-project/pull/137930