[llvm-branch-commits] [llvm] [InlineSpiller][AMDGPU] Implement subreg reload during RA spill (PR #175002)
Christudasan Devadasan via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Sat Jan 24 09:45:48 PST 2026
================
@@ -1248,18 +1249,62 @@ void InlineSpiller::spillAroundUses(Register Reg) {
// Create a new virtual register for spill/fill.
// FIXME: Infer regclass from instruction alone.
- Register NewVReg = Edit->createFrom(Reg);
+
+ unsigned SubReg = 0;
+ LaneBitmask CoveringLanes = LaneBitmask::getNone();
+ // If the subreg liveness is enabled, identify the subreg use(s) to try
+ // subreg reload. Skip if the instruction also defines the register.
+ // For copy bundles, get the covering lane masks.
+ if (MRI.subRegLivenessEnabled() && !RI.Writes) {
+ for (auto [MI, OpIdx] : Ops) {
+ const MachineOperand &MO = MI->getOperand(OpIdx);
+ assert(MO.isReg() && MO.getReg() == Reg);
+ if (MO.isUse()) {
+ SubReg = MO.getSubReg();
+ if (SubReg)
+ CoveringLanes |= TRI.getSubRegIndexLaneMask(SubReg);
+ }
+ }
+ }
+
+ if (MI.isBundled() && CoveringLanes.any()) {
+ CoveringLanes = LaneBitmask(bit_ceil(CoveringLanes.getAsInteger()) - 1);
+ // Obtain the covering subregister index, including any missing indices
+ // within the identified small range. Although this may be suboptimal due
+ // to gaps in the subregisters that are not part of the copy bundle, it is
+ // beneficial when components outside this range of the original tuple can
+ // be completely skipped from the reload.
+ SubReg = TRI.getSubRegIdxFromLaneMask(CoveringLanes);
+ }
+
+ // If the target doesn't support subreg reload, fallback to restoring the
+ // full tuple.
+ if (SubReg && !TRI.shouldEnableSubRegReload(SubReg))
----------------
cdevadas wrote:
Yes, the targets can choose to ignore the subreg field passed to them in their `loadRegFromStackSlot`. However, that info (whether the target implemented the subreg reload or not) must be reported back to this callsite: when the target truly implements the partial reload by constructing a concrete register class for the subreg access, this optimization removes the subreg field from the use instruction. See the transition explained here:
%tuple:VReg_128 = ... ; 128-bit tuple (4x32-bit)
SPILL_V128 %tuple to stack
...
; Later, only sub1 (the second 32-bit component) is needed.
; Current implementation - restore the full tuple:
%reload:VReg_128 = RESTORE_V128, ofst:0
%val = USE %reload.sub1
; With subreg reload implemented:
%reload:VGPR_32 = RESTORE_V32, ofst:4
%val = USE %reload ; the subreg is dropped.
The subreg fields are dropped in the InlineSpiller at https://github.com/llvm/llvm-project/pull/175002/files#diff-855df7e3f96ef7f3f499fdafba308dde780d710f717f19158a2f39059c8a6f5dR1305.
Always passing the subreg info to `loadRegFromStackSlot` and letting the targets decide whether to implement it would require some changes to how InlineSpiller and this target hook interact.
https://github.com/llvm/llvm-project/pull/175002