[llvm] [AMDGPU] SIPeepholeSDWA: Add REG_SEQUENCE support (PR #133087)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Fri Mar 28 08:36:00 PDT 2025
================
@@ -382,6 +390,97 @@ uint64_t SDWASrcOperand::getSrcMods(const SIInstrInfo *TII,
return Mods;
}
+// The following functions are helpers for dealing with REG_SEQUENCE
+// instructions. Those instructions are used to represent copies to
+// subregisters in SSA form.
+//
+// This pass should be able to peek through REG_SEQUENCE
+// instructions. An access to a subregister of a register defined
+// by a REG_SEQUENCE should be handled as if the register
+// that is being copied to the subregister was accessed.
+// Consider the following example:
+// %1:vgpr_32 = V_ADD_U32_e64 %0, 10, 0
+// %2:vgpr_32 = V_ADD_U32_e64 %0, 20, 0
+// %3:sreg_32 = S_MOV_B32 255
+// %4:vgpr_32 = V_AND_B32_e64 %2, %3
+// %5:vgpr_32, %6:sreg_64_xexec = V_ADD_CO_U32_e64 %1, %4, 0
+//
+// The V_ADD_CO_U32_e64 instruction will be combined with the
+// V_AND_B32_e64 into an SDWA instruction.
+//
+// If one or more of the operands of V_ADD_CO_U32_e64 are accessed
+// through the subregisters of a REG_SEQUENCE as in the following
+// variation of the previous example, the optimization should still be
+// able to proceed in the same way:
+//
+// [...]
+// %4:vreg_64 = REG_SEQUENCE %1, %subreg.sub0, %3, %subreg.sub1
+// %5:sreg_32 = S_MOV_B32 255
+// %6:vgpr_32 = V_AND_B32_e64 %2, %5
+// %7:vreg_64 = REG_SEQUENCE %6, %subreg.sub0, %3, %subreg.sub1
+// %8:vgpr_32, %9:sreg_64_xexec = V_ADD_CO_U32_e64 %4.sub0, %7.sub0, 0
+//
+// To this end, the SDWASrcOperand implementation uses the following
+// functions to find out the register that is used as the source of
+// the subregister value and it uses this register directly instead of
+// the REG_SEQUENCE subregister.
+
+/// Return the subregister index of the REG_SEQUENCE \p RegSeq
+/// which is copied from \p Reg, i.e. the operand following
+/// \p Reg in the operands of \p RegSeq, or nullopt if
+/// \p Reg is not an operand of \p RegSeq.
+static std::optional<unsigned> regSequenceFindSubreg(const MachineInstr &RegSeq,
+ Register Reg) {
+ if (!RegSeq.isRegSequence())
+ return {};
+
+ auto *End = RegSeq.operands_end();
+ // Operand pair at indices (i+1, i+2) is (register, subregister)
+ for (auto *It = RegSeq.operands_begin() + 1; It != End; It += 2) {
+ if (isSameReg(*It, Reg))
+ return (It + 1)->getImm();
+ }
+
+ return {};
+}
+
+/// Return the single use of \p RegSeq which accesses the subregister
+/// that copies from \p Reg. Returns nullptr if \p Reg is not used by
+/// exactly one operand of \p RegSeq.
+static MachineInstr *regSequenceFindSingleSubregUse(MachineInstr &RegSeq,
+ Register Reg,
+ MachineRegisterInfo *MRI) {
+ Register SeqReg = RegSeq.getOperand(0).getReg();
+ unsigned SubReg = *regSequenceFindSubreg(RegSeq, Reg);
+
+ MachineInstr *SingleUse = nullptr;
+ for (MachineInstr &UseMI : MRI->use_nodbg_instructions(SeqReg))
+ for (auto &Op : UseMI.operands())
+ if (Op.isReg() && Op.getReg() == SeqReg && Op.getSubReg() == SubReg) {
+ if (SingleUse)
+ return nullptr;
+ SingleUse = &UseMI;
+ }
----------------
arsenm wrote:
MachineRegisterInfo should have getOneNonDBGUse and getOneNonDBGUser; not sure how we don't have them. I think I've seen that floating around in an open PR recently.
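
For illustration only, here is a minimal standalone sketch of what a "single non-debug user" query could look like. `Instr`, `Use`, and `getOneNonDebugUser` are hypothetical stand-ins written with plain STL types. They are not the LLVM classes or an existing MachineRegisterInfo API, and the sketch does not filter by subregister the way the hand-written loop in the patch does.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an instruction; NOT llvm::MachineInstr.
struct Instr {
  int Id;
  bool IsDebug; // true for debug-only instructions
};

// Hypothetical stand-in for one use of a register by an instruction.
struct Use {
  Instr *User;
};

// Return the unique non-debug instruction among the uses, or nullptr if
// there is none or more than one. Several uses by the same instruction
// still count as a single user.
Instr *getOneNonDebugUser(const std::vector<Use> &Uses) {
  Instr *Single = nullptr;
  for (const Use &U : Uses) {
    if (U.User->IsDebug)
      continue; // debug uses never block the query
    if (Single && Single != U.User)
      return nullptr; // a second, distinct user was found
    Single = U.User;
  }
  return Single;
}
```

A helper of this shape would cover the "exactly one user" part of the loop; the subregister comparison in the patch would still need its own check.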
https://github.com/llvm/llvm-project/pull/133087