[llvm-branch-commits] [llvm] [InlineSpiller][AMDGPU] Implement subreg reload during RA spill (PR #175002)

via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Thu Jan 8 07:46:02 PST 2026


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-backend-msp430

Author: Christudasan Devadasan (cdevadas)

<details>
<summary>Changes</summary>

Currently, when a virtual register is only partially used, the
entire tuple is restored from the spill slot, even if only a
subset of its sub-registers is needed. This patch introduces
support for partial reloads by analyzing actual register usage
and restoring only the required sub-registers, which improves
register allocation efficiency, particularly for tuple virtual
registers. For AMDGPU, this change brings considerable
improvements in workloads that involve matrix operations,
large vectors, and complex control flow.

---

Patch is 906.89 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/175002.diff


73 Files Affected:

- (modified) llvm/include/llvm/CodeGen/TargetInstrInfo.h (+1) 
- (modified) llvm/include/llvm/CodeGen/TargetRegisterInfo.h (+10) 
- (modified) llvm/lib/CodeGen/InlineSpiller.cpp (+51-6) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/AArch64/AArch64InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.cpp (+20-6) 
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.h (+1) 
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp (+14) 
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.h (+5) 
- (modified) llvm/lib/Target/ARC/ARCInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/ARC/ARCInstrInfo.h (+1) 
- (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/ARM/ARMBaseInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb1InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb1InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/ARM/Thumb2InstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/AVR/AVRInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/AVR/AVRInstrInfo.h (+1) 
- (modified) llvm/lib/Target/BPF/BPFInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/BPF/BPFInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/CSKY/CSKYInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/CSKY/CSKYInstrInfo.h (+1) 
- (modified) llvm/lib/Target/Hexagon/HexagonInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Hexagon/HexagonInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/Lanai/LanaiInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Lanai/LanaiInstrInfo.h (+1) 
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/LoongArch/LoongArchInstrInfo.h (+1) 
- (modified) llvm/lib/Target/M68k/M68kInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/M68k/M68kInstrInfo.h (+1) 
- (modified) llvm/lib/Target/MSP430/MSP430InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/MSP430/MSP430InstrInfo.h (+1) 
- (modified) llvm/lib/Target/Mips/MipsInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/NVPTX/NVPTXInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/PowerPC/PPCInstrInfo.h (+1-2) 
- (modified) llvm/lib/Target/RISCV/RISCVFrameLowering.cpp (+3-3) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/RISCV/RISCVInstrInfo.h (+1) 
- (modified) llvm/lib/Target/Sparc/SparcInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Sparc/SparcInstrInfo.h (+1-1) 
- (modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/SystemZ/SystemZInstrInfo.h (+1-2) 
- (modified) llvm/lib/Target/VE/VEInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/VE/VEInstrInfo.h (+1-2) 
- (modified) llvm/lib/Target/X86/X86InstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/X86/X86InstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/XCore/XCoreInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/XCore/XCoreInstrInfo.h (+2-3) 
- (modified) llvm/lib/Target/Xtensa/XtensaInstrInfo.cpp (+1-1) 
- (modified) llvm/lib/Target/Xtensa/XtensaInstrInfo.h (+1-1) 
- (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll (+3278-3956) 
- (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll (+26-52) 
- (modified) llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll (+24-26) 
- (modified) llvm/test/CodeGen/AMDGPU/dummy-regalloc-priority-advisor.mir (+2-2) 
- (modified) llvm/test/CodeGen/AMDGPU/gfx-callable-return-types.ll (+14-8) 
- (modified) llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll (+35-56) 
- (modified) llvm/test/CodeGen/AMDGPU/inflated-reg-class-snippet-copy-use-after-free.mir (+1) 
- (modified) llvm/test/CodeGen/AMDGPU/infloop-subrange-spill-inspect-subrange.mir (+3-2) 
- (modified) llvm/test/CodeGen/AMDGPU/infloop-subrange-spill.mir (+1-1) 
- (modified) llvm/test/CodeGen/AMDGPU/load-global-i16.ll (+1-5) 
- (modified) llvm/test/CodeGen/AMDGPU/load-global-i8.ll (+6-11) 
- (modified) llvm/test/CodeGen/AMDGPU/ra-inserted-scalar-instructions.mir (+40-40) 
- (modified) llvm/test/CodeGen/AMDGPU/ran-out-of-sgprs-allocation-failure.mir (+82-103) 
- (modified) llvm/test/CodeGen/AMDGPU/regpressure-mitigation-with-subreg-reload.mir (+42-8) 
- (added) llvm/test/CodeGen/AMDGPU/skip-partial-reload-for-16bit-regaccess.mir (+91) 
- (modified) llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll (+14-20) 
- (modified) llvm/test/CodeGen/AMDGPU/splitkit-copy-bundle.mir (+8-6) 
- (modified) llvm/test/CodeGen/AMDGPU/splitkit-copy-live-lanes.mir (+36-36) 
- (modified) llvm/test/CodeGen/AMDGPU/splitkit-nolivesubranges.mir (+2-2) 
- (modified) llvm/test/CodeGen/AMDGPU/swdev502267-use-after-free-last-chance-recoloring-alloc-succeeds.mir (+9-13) 
- (removed) llvm/test/CodeGen/AMDGPU/use-after-free-after-cleanup-failed-vreg.ll (-16) 


``````````diff
diff --git a/llvm/include/llvm/CodeGen/TargetInstrInfo.h b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
index 11adc190b2a62..045372fad4567 100644
--- a/llvm/include/llvm/CodeGen/TargetInstrInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetInstrInfo.h
@@ -1215,6 +1215,7 @@ class LLVM_ABI TargetInstrInfo : public MCInstrInfo {
   virtual void loadRegFromStackSlot(
       MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, Register DestReg,
       int FrameIndex, const TargetRegisterClass *RC, Register VReg,
+      unsigned SubReg = 0,
       MachineInstr::MIFlag Flags = MachineInstr::NoFlags) const {
     llvm_unreachable("Target didn't implement "
                      "TargetInstrInfo::loadRegFromStackSlot!");
diff --git a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
index 5c35cd338feb6..281053cd65922 100644
--- a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
@@ -1220,6 +1220,11 @@ class LLVM_ABI TargetRegisterInfo : public MCRegisterInfo {
     return true;
   }
 
+  /// Return true to enable subreg reload of register tuples during RA. This
+  /// improves register allocation for functions that use sub-registers of a
+  /// tuple rather than the entire tuple.
+  virtual bool shouldEnableSubRegReload(unsigned SubReg) const { return false; }
+
   /// When prioritizing live ranges in register allocation, if this hook returns
   /// true then the AllocationPriority of the register class will be treated as
   /// more important than whether the range is local to a basic block or global.
@@ -1243,6 +1248,11 @@ class LLVM_ABI TargetRegisterInfo : public MCRegisterInfo {
   bool checkAllSuperRegsMarked(const BitVector &RegisterSet,
       ArrayRef<MCPhysReg> Exceptions = ArrayRef<MCPhysReg>()) const;
 
+  virtual const TargetRegisterClass *
+  getConstrainedRegClass(const TargetRegisterClass *RC) const {
+    return RC;
+  }
+
   virtual const TargetRegisterClass *
   getConstrainedRegClassForOperand(const MachineOperand &MO,
                                    const MachineRegisterInfo &MRI) const {
diff --git a/llvm/lib/CodeGen/InlineSpiller.cpp b/llvm/lib/CodeGen/InlineSpiller.cpp
index 68370303a3aef..c567b88f66a7c 100644
--- a/llvm/lib/CodeGen/InlineSpiller.cpp
+++ b/llvm/lib/CodeGen/InlineSpiller.cpp
@@ -217,7 +217,8 @@ class InlineSpiller : public Spiller {
   bool coalesceStackAccess(MachineInstr *MI, Register Reg);
   bool foldMemoryOperand(ArrayRef<std::pair<MachineInstr *, unsigned>>,
                          MachineInstr *LoadMI = nullptr);
-  void insertReload(Register VReg, SlotIndex, MachineBasicBlock::iterator MI);
+  void insertReload(Register VReg, unsigned SubReg, SlotIndex,
+                    MachineBasicBlock::iterator MI);
   void insertSpill(Register VReg, bool isKill, MachineBasicBlock::iterator MI);
 
   void spillAroundUses(Register Reg);
@@ -1112,14 +1113,14 @@ foldMemoryOperand(ArrayRef<std::pair<MachineInstr *, unsigned>> Ops,
   return true;
 }
 
-void InlineSpiller::insertReload(Register NewVReg,
+void InlineSpiller::insertReload(Register NewVReg, unsigned SubReg,
                                  SlotIndex Idx,
                                  MachineBasicBlock::iterator MI) {
   MachineBasicBlock &MBB = *MI->getParent();
 
   MachineInstrSpan MIS(MI, &MBB);
   TII.loadRegFromStackSlot(MBB, MI, NewVReg, StackSlot,
-                           MRI.getRegClass(NewVReg), Register());
+                           MRI.getRegClass(NewVReg), Register(), SubReg);
 
   LIS.InsertMachineInstrRangeInMaps(MIS.begin(), MI);
 
@@ -1248,10 +1249,51 @@ void InlineSpiller::spillAroundUses(Register Reg) {
 
     // Create a new virtual register for spill/fill.
     // FIXME: Infer regclass from instruction alone.
-    Register NewVReg = Edit->createFrom(Reg);
+
+    unsigned SubReg = 0;
+    LaneBitmask CoveringLanes = LaneBitmask::getNone();
+    // If subreg liveness is enabled, identify the subreg use(s) to try a
+    // subreg reload. Skip if the instruction also defines the register.
+    // For copy bundles, accumulate the covering lane masks.
+    if (MRI.subRegLivenessEnabled() && !RI.Writes) {
+      for (auto [MI, OpIdx] : Ops) {
+        const MachineOperand &MO = MI->getOperand(OpIdx);
+        assert(MO.isReg() && MO.getReg() == Reg);
+        if (MO.isUse()) {
+          SubReg = MO.getSubReg();
+          if (SubReg)
+            CoveringLanes |= TRI.getSubRegIndexLaneMask(SubReg);
+        }
+      }
+    }
+
+    if (MI.isBundled() && CoveringLanes.any()) {
+      CoveringLanes = LaneBitmask(bit_ceil(CoveringLanes.getAsInteger()) - 1);
+      // Obtain the covering subregister index, including any missing indices
+      // within the identified small range. Although this may be suboptimal due
+      // to gaps in the subregisters that are not part of the copy bundle, it is
+      // beneficial when components outside this range of the original tuple can
+      // be completely skipped during the reload.
+      SubReg = TRI.getSubRegIdxFromLaneMask(CoveringLanes);
+    }
+
+    // If the target doesn't support subreg reload, fall back to restoring the
+    // full tuple.
+    if (SubReg && !TRI.shouldEnableSubRegReload(SubReg))
+      SubReg = 0;
+
+    const TargetRegisterClass *OrigRC = MRI.getRegClass(Reg);
+    const TargetRegisterClass *NewRC =
+        SubReg ? TRI.getSubRegisterClass(OrigRC, SubReg) : nullptr;
+
+    // Check if the target needs to constrain the RC further.
+    if (NewRC)
+      NewRC = TRI.getConstrainedRegClass(NewRC);
+
+    Register NewVReg = Edit->createFrom(Reg, NewRC);
 
     if (RI.Reads)
-      insertReload(NewVReg, Idx, &MI);
+      insertReload(NewVReg, SubReg, Idx, &MI);
 
     // Rewrite instruction operands.
     bool hasLiveDef = false;
@@ -1259,7 +1301,10 @@ void InlineSpiller::spillAroundUses(Register Reg) {
       MachineOperand &MO = OpPair.first->getOperand(OpPair.second);
       MO.setReg(NewVReg);
       if (MO.isUse()) {
-        if (!OpPair.first->isRegTiedToDefOperand(OpPair.second))
+        if (SubReg && !MI.isBundled())
+          MO.setSubReg(0);
+        if (!OpPair.first->isRegTiedToDefOperand(OpPair.second) ||
+            (SubReg && !MI.isBundled()))
           MO.setIsKill();
       } else {
         if (!MO.isDead())
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 7e1dd8f16b337..0d35fb439fd7b 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -6117,7 +6117,7 @@ void AArch64InstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
                                             MachineBasicBlock::iterator MBBI,
                                             Register DestReg, int FI,
                                             const TargetRegisterClass *RC,
-                                            Register VReg,
+                                            Register VReg, unsigned SubReg,
                                             MachineInstr::MIFlag Flags) const {
   MachineFunction &MF = *MBB.getParent();
   MachineFrameInfo &MFI = MF.getFrameInfo();
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.h b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
index 30943533f3667..2ccde3e661de5 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.h
@@ -368,7 +368,7 @@ class AArch64InstrInfo final : public AArch64GenInstrInfo {
   void loadRegFromStackSlot(
       MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
       Register DestReg, int FrameIndex, const TargetRegisterClass *RC,
-      Register VReg,
+      Register VReg, unsigned SubReg = 0,
       MachineInstr::MIFlag Flags = MachineInstr::NoFlags) const override;
 
   // This tells target independent code that it is okay to pass instructions
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
index 387edd84cfbed..9d14f8db6c93a 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
@@ -1878,7 +1878,7 @@ void SIInstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
                                        MachineBasicBlock::iterator MI,
                                        Register DestReg, int FrameIndex,
                                        const TargetRegisterClass *RC,
-                                       Register VReg,
+                                       Register VReg, unsigned SubReg,
                                        MachineInstr::MIFlag Flags) const {
   MachineFunction *MF = MBB.getParent();
   SIMachineFunctionInfo *MFI = MF->getInfo<SIMachineFunctionInfo>();
@@ -1886,12 +1886,23 @@ void SIInstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
   const DebugLoc &DL = MBB.findDebugLoc(MI);
   unsigned SpillSize = RI.getSpillSize(*RC);
 
+  unsigned SubRegIdx = 0;
+  if (SubReg) {
+    uint64_t Mask = RI.getSubRegIndexLaneMask(SubReg).getAsInteger();
+    assert(llvm::popcount(Mask) % 2 == 0 &&
+           "expected only 32-bit subreg access");
+
+    // For subreg reload, identify the start offset. Each 32-bit register
+    // consists of two regunits and therefore two bits in the lane mask.
+    SubRegIdx = llvm::countr_zero(Mask) / 2;
+  }
+
   MachinePointerInfo PtrInfo
     = MachinePointerInfo::getFixedStack(*MF, FrameIndex);
 
-  MachineMemOperand *MMO = MF->getMachineMemOperand(
-      PtrInfo, MachineMemOperand::MOLoad, FrameInfo.getObjectSize(FrameIndex),
-      FrameInfo.getObjectAlign(FrameIndex));
+  MachineMemOperand *MMO =
+      MF->getMachineMemOperand(PtrInfo, MachineMemOperand::MOLoad, SpillSize,
+                               FrameInfo.getObjectAlign(FrameIndex));
 
   if (RI.isSGPRClass(RC)) {
     MFI->setHasSpilledSGPRs();
@@ -1911,19 +1922,22 @@ void SIInstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
       FrameInfo.setStackID(FrameIndex, TargetStackID::SGPRSpill);
     BuildMI(MBB, MI, DL, OpDesc, DestReg)
         .addFrameIndex(FrameIndex) // addr
-        .addImm(0)                 // offset
+        .addImm(SubRegIdx)         // offset
         .addMemOperand(MMO)
         .addReg(MFI->getStackPtrOffsetReg(), RegState::Implicit);
 
     return;
   }
 
+  // Convert the subreg index to stack offset.
+  SubRegIdx *= 4;
+
   unsigned Opcode = getVectorRegSpillRestoreOpcode(VReg ? VReg : DestReg, RC,
                                                    SpillSize, *MFI);
   BuildMI(MBB, MI, DL, get(Opcode), DestReg)
       .addFrameIndex(FrameIndex)           // vaddr
       .addReg(MFI->getStackPtrOffsetReg()) // scratch_offset
-      .addImm(0)                           // offset
+      .addImm(SubRegIdx)                   // offset
       .addMemOperand(MMO);
 }
 
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.h b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
index 9373cdb199e29..5134ae780d13b 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.h
@@ -315,6 +315,7 @@ class SIInstrInfo final : public AMDGPUGenInstrInfo {
   void loadRegFromStackSlot(
       MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, Register DestReg,
       int FrameIndex, const TargetRegisterClass *RC, Register VReg,
+      unsigned SubReg = 0,
       MachineInstr::MIFlag Flags = MachineInstr::NoFlags) const override;
 
   bool expandPostRAPseudo(MachineInstr &MI) const override;
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
index f21ad7b59cb23..10c2190b0aa5c 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
@@ -3782,6 +3782,15 @@ bool SIRegisterInfo::isAGPR(const MachineRegisterInfo &MRI,
   return RC && isAGPRClass(RC);
 }
 
+bool SIRegisterInfo::shouldEnableSubRegReload(unsigned SubReg) const {
+  // Disable lo16 and hi16 (16-bit) accesses as they are subreg views of the
+  // same 32-bit register and don't represent independent storage. If the number
+  // of bits set in the lane mask is odd, it indicates the presence of a 16-bit
+  // access, as each 32-bit register consists of two regunits and thus occupies
+  // two bits in the lane mask.
+  return getSubRegIndexLaneMask(SubReg).getNumLanes() % 2 == 0;
+}
+
 unsigned SIRegisterInfo::getRegPressureLimit(const TargetRegisterClass *RC,
                                              MachineFunction &MF) const {
   unsigned MinOcc = ST.getOccupancyWithWorkGroupSizes(MF).first;
@@ -3915,6 +3924,11 @@ SIRegisterInfo::getRegClassForSizeOnBank(unsigned Size,
   }
 }
 
+const TargetRegisterClass *
+SIRegisterInfo::getConstrainedRegClass(const TargetRegisterClass *RC) const {
+  return getProperlyAlignedRC(RC);
+}
+
 const TargetRegisterClass *
 SIRegisterInfo::getConstrainedRegClassForOperand(const MachineOperand &MO,
                                          const MachineRegisterInfo &MRI) const {
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
index 1b12eea6dfcc3..daeda09efa202 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.h
@@ -348,6 +348,8 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo {
   ArrayRef<int16_t> getRegSplitParts(const TargetRegisterClass *RC,
                                      unsigned EltSize) const;
 
+  bool shouldEnableSubRegReload(unsigned SubReg) const override;
+
   unsigned getRegPressureLimit(const TargetRegisterClass *RC,
                                MachineFunction &MF) const override;
 
@@ -371,6 +373,9 @@ class SIRegisterInfo final : public AMDGPUGenRegisterInfo {
     return getRegClassForSizeOnBank(Ty.getSizeInBits(), Bank);
   }
 
+  const TargetRegisterClass *
+  getConstrainedRegClass(const TargetRegisterClass *RC) const override;
+
   const TargetRegisterClass *
   getConstrainedRegClassForOperand(const MachineOperand &MO,
                                  const MachineRegisterInfo &MRI) const override;
diff --git a/llvm/lib/Target/ARC/ARCInstrInfo.cpp b/llvm/lib/Target/ARC/ARCInstrInfo.cpp
index e17ecbf87faae..f43dc550ec5d5 100644
--- a/llvm/lib/Target/ARC/ARCInstrInfo.cpp
+++ b/llvm/lib/Target/ARC/ARCInstrInfo.cpp
@@ -323,7 +323,7 @@ void ARCInstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
                                         MachineBasicBlock::iterator I,
                                         Register DestReg, int FrameIndex,
                                         const TargetRegisterClass *RC,
-                                        Register VReg,
+                                        Register VReg, unsigned SubReg,
                                         MachineInstr::MIFlag Flags) const {
   DebugLoc DL = MBB.findDebugLoc(I);
   MachineFunction &MF = *MBB.getParent();
diff --git a/llvm/lib/Target/ARC/ARCInstrInfo.h b/llvm/lib/Target/ARC/ARCInstrInfo.h
index ebeaf877f8436..03f6977c8f94a 100644
--- a/llvm/lib/Target/ARC/ARCInstrInfo.h
+++ b/llvm/lib/Target/ARC/ARCInstrInfo.h
@@ -76,6 +76,7 @@ class ARCInstrInfo : public ARCGenInstrInfo {
   void loadRegFromStackSlot(
       MachineBasicBlock &MBB, MachineBasicBlock::iterator MI, Register DestReg,
       int FrameIndex, const TargetRegisterClass *RC, Register VReg,
+      unsigned SubReg = 0,
       MachineInstr::MIFlag Flags = MachineInstr::NoFlags) const override;
 
   bool
diff --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
index 02887ce93c525..402a4e30fe3ca 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
@@ -1212,7 +1212,7 @@ void ARMBaseInstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
                                             MachineBasicBlock::iterator I,
                                             Register DestReg, int FI,
                                             const TargetRegisterClass *RC,
-                                            Register VReg,
+                                            Register VReg, unsigned SubReg,
                                             MachineInstr::MIFlag Flags) const {
   DebugLoc DL;
   if (I != MBB.end()) DL = I->getDebugLoc();
diff --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
index 04e2ab055cf1a..ab94a113233dc 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.h
@@ -222,7 +222,7 @@ class ARMBaseInstrInfo : public ARMGenInstrInfo {
   void loadRegFromStackSlot(
       MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
       Register DestReg, int FrameIndex, const TargetRegisterClass *RC,
-      Register VReg,
+      Register VReg, unsigned SubReg = 0,
       MachineInstr::MIFlag Flags = MachineInstr::NoFlags) const override;
 
   bool expandPostRAPseudo(MachineInstr &MI) const override;
diff --git a/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp b/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
index 01f588f0cdc38..c2345f588b937 100644
--- a/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
+++ b/llvm/lib/Target/ARM/Thumb1InstrInfo.cpp
@@ -145,7 +145,7 @@ void Thumb1InstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
                                            MachineBasicBlock::iterator I,
                                            Register DestReg, int FI,
                                            const TargetRegisterClass *RC,
-                                           Register VReg,
+                                           Register VReg, unsigned SubReg,
                                            MachineInstr::MIFlag Flags) const {
   assert((RC->hasSuperClassEq(&ARM::tGPRRegClass) ||
           (DestReg.isPhysical() && isARMLowRegister(DestReg))) &&
diff --git a/llvm/lib/Target/ARM/Thumb1InstrInfo.h b/llvm/lib/Target/ARM/Thumb1InstrInfo.h
index 289a30a4ca1e4..53add79a508ff 100644
--- a/llvm/lib/Target/ARM/Thumb1InstrInfo.h
+++ b/llvm/lib/Target/ARM/Thumb1InstrInfo.h
@@ -49,7 +49,7 @@ class Thumb1InstrInfo : public ARMBaseInstrInfo {
   void loadRegFromStackSlot(
       MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
       Register DestReg, int FrameIndex, const TargetRegisterClass *RC,
-      Register VReg,
+      Register VReg, unsigned SubReg = 0,
       MachineInstr::MIFlag Flags = MachineInstr::NoFlags) const override;
 
   bool canCopyGluedNodeDuringSchedule(SDNode *N) const override;
diff --git a/llvm/lib/Target/ARM/Thumb2InstrInfo.cpp b/llvm/lib/Target/ARM/Thumb2InstrInfo.cpp
index efb92c9bcac18..76da9692fd9f6 100644
--- a/llvm/lib/Target/ARM/Thumb2InstrInfo.cpp
+++ b/llvm/lib/Target/ARM/Thumb2InstrInfo.cpp
@@ -210,7 +210,7 @@ void Thumb2InstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
                                            MachineBasicBlock::iterator I,
                                            Register DestReg, int FI,
                                            const TargetRegisterClass *RC,
-                                           Register VReg,
+                                           Register VReg, unsigned SubReg,
                                            MachineInstr::MIFlag Flags) const {
   MachineFunction &MF = *MBB.getParent();
   MachineFrameInfo &MFI = MF.getFrameInfo();
diff --git a/llvm/lib/Target/ARM/Thumb2InstrInfo.h b/llvm/lib/Target/ARM/Thumb2InstrInfo.h
index 1e11cb37efc05..c6d38766f6d73 100644
--- a/llvm/lib/Target/ARM/Thumb2InstrInfo.h
+++ b/llvm/lib/Target/ARM/Thumb2InstrInfo.h
@@ -50,7 +50,7 @@ class Thumb2InstrInfo : public ARMBaseInstrInfo {
   void loadRegFromStackSlot(
       MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
       Register DestReg, int FrameIndex, const TargetRegisterClass *RC,
-      Register VReg,
+      Register VReg, unsigned SubReg = 0,
       MachineInstr::MIFlag Flags = MachineInstr::NoFlags) const override;
 
   /// getRegisterInfo - TargetInstrInfo is a superset of MRegister info.  As
diff --git a/llvm/lib/Target/AVR/AVRInstrInfo.cpp b/llvm/lib/Target/AVR/AVRInstrInfo.cpp
index 6c37ba1411dde..3d9fa1854719f 100644
--- a/llvm/lib/Target/AVR/AVRInstrInfo.cpp
+++ b/llvm/lib/Target/AVR/AVRInstrInfo.cpp
@@ -160,7 +160,7 @@ void AVRInstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
                                         MachineBasicBlock::iterator MI,
                                         Register DestReg, int FrameIndex,
                                         const TargetRegisterClass *RC,
-                           ...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/175002


More information about the llvm-branch-commits mailing list