[llvm] [AMDGPU] SIPeepholeSDWA: Add REG_SEQUENCE support (PR #133087)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Fri Mar 28 08:36:00 PDT 2025


================
@@ -382,6 +390,97 @@ uint64_t SDWASrcOperand::getSrcMods(const SIInstrInfo *TII,
   return Mods;
 }
 
+// The following functions are helpers for dealing with REG_SEQUENCE
+// instructions. Those instructions are used to represent copies to
+// subregisters in SSA form.
+//
+// This pass should be able to peek through REG_SEQUENCE
+// instructions. An access to a subregister of a register defined
+// by a REG_SEQUENCE should be handled as if the register
+// that is being copied to the subregister was accessed.
+// Consider the following example:
+//    %1:vgpr_32 = V_ADD_U32_e64 %0, 10, 0
+//    %2:vgpr_32 = V_ADD_U32_e64 %0, 20, 0
+//    %3:sreg_32 = S_MOV_B32 255
+//    %4:vgpr_32 = V_AND_B32_e64 %2, %3
+//    %5:vgpr_32, %6:sreg_64_xexec = V_ADD_CO_U32_e64 %1, %4, 0
+//
+// The V_ADD_CO_U32_e64 instruction will be combined with the
+// V_AND_B32_e64 into an SDWA instruction.
+//
+// If one or more of the operands of V_ADD_CO_U32_e64 are accessed
+// through the subregisters of a REG_SEQUENCE as in the following
+// variation of the previous example, the optimization should still be
+// able to proceed in the same way:
+//
+//    [...]
+//    %4:vreg_64 = REG_SEQUENCE %1, %subreg.sub0, %3, %subreg.sub1
+//    %5:sreg_32 = S_MOV_B32 255
+//    %6:vgpr_32 = V_AND_B32_e64 %2, %5
+//    %7:vreg_64 = REG_SEQUENCE %6, %subreg.sub0, %3, %subreg.sub1
+//    %8:vgpr_32, %9:sreg_64_xexec = V_ADD_CO_U32_e64 %4.sub0, %7.sub0, 0
+//
+// To this end, the SDWASrcOperand implementation uses the following
+// functions to find out the register that is used as the source of
+// the subregister value and it uses this register directly instead of
+// the REG_SEQUENCE subregister.
+
+/// Return the subregister of the REG_SEQUENCE \p RegSeq
+/// which is copied from \p Reg, i.e. the operand following
+/// \p Reg in the operands of \p RegSeq, or nullopt if
+/// \p Reg is not an operand of \p RegSeq.
+static std::optional<unsigned> regSequenceFindSubreg(const MachineInstr &RegSeq,
+                                                     Register Reg) {
+  if (!RegSeq.isRegSequence())
+    return {};
+
+  auto *End = RegSeq.operands_end();
+  // Operand pair at indices (i+1, i+2) is (register, subregister)
+  for (auto *It = RegSeq.operands_begin() + 1; It != End; It += 2) {
+    if (isSameReg(*It, Reg))
+      return (It + 1)->getImm();
+  }
+
+  return {};
+}
+
+/// Return the single use of \p RegSeq which accesses the subregister
+/// that copies from \p Reg. Returns nullptr if \p Reg is not used by
+/// exactly one operand of \p RegSeq.
+static MachineInstr *regSequenceFindSingleSubregUse(MachineInstr &RegSeq,
+                                                    Register Reg,
+                                                    MachineRegisterInfo *MRI) {
+  Register SeqReg = RegSeq.getOperand(0).getReg();
+  unsigned SubReg = *regSequenceFindSubreg(RegSeq, Reg);
+
+  MachineInstr *SingleUse = nullptr;
+  for (MachineInstr &UseMI : MRI->use_nodbg_instructions(SeqReg))
+    for (auto &Op : UseMI.operands())
+      if (Op.isReg() && Op.getReg() == SeqReg && Op.getSubReg() == SubReg) {
+        if (SingleUse)
+          return nullptr;
+        SingleUse = &UseMI;
+      }
----------------
arsenm wrote:

MachineRegisterInfo should have getOneNonDBGUse and getOneNonDBGUser; not sure how we don't have them. I think I've seen that floating around in an open PR recently.
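
For illustration only, the "single non-debug use or nullptr" pattern that the quoted loop implements by hand could be sketched as below. This is a standalone toy model, not LLVM code: `ToyOperand`, `ToyInstr`, and `getOneNonDbgUse` are hypothetical names made up for this sketch, and the real helper on MachineRegisterInfo (if and when it exists) may well have a different signature and would not take a subregister index.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for MachineOperand / MachineInstr. These are NOT the LLVM
// classes; they only carry the fields this sketch needs.
struct ToyOperand {
  unsigned Reg;    // virtual register number
  unsigned SubReg; // subregister index, 0 meaning "no subregister"
  bool IsDebug;    // true if the operand belongs to a debug instruction
};

struct ToyInstr {
  std::vector<ToyOperand> Operands;
};

// Hypothetical helper: return the only non-debug operand that reads
// (Reg, SubReg), or nullptr if there are zero or more than one.
const ToyOperand *getOneNonDbgUse(const std::vector<ToyInstr> &Instrs,
                                  unsigned Reg, unsigned SubReg) {
  const ToyOperand *Single = nullptr;
  for (const ToyInstr &MI : Instrs) {
    for (const ToyOperand &Op : MI.Operands) {
      if (Op.IsDebug || Op.Reg != Reg || Op.SubReg != SubReg)
        continue;
      if (Single)
        return nullptr; // second matching use found, so not unique
      Single = &Op;
    }
  }
  return Single;
}
```

With a helper of this shape, the caller only has to check the result for nullptr instead of tracking a `SingleUse` pointer across a nested loop.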

https://github.com/llvm/llvm-project/pull/133087

