[llvm-branch-commits] [llvm] [InlineSpiller][AMDGPU] Implement subreg reload during RA spill (PR #175002)
Christudasan Devadasan via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Sat Jan 24 09:45:47 PST 2026
================
@@ -1248,18 +1249,62 @@ void InlineSpiller::spillAroundUses(Register Reg) {
// Create a new virtual register for spill/fill.
// FIXME: Infer regclass from instruction alone.
- Register NewVReg = Edit->createFrom(Reg);
+
+ unsigned SubReg = 0;
+ LaneBitmask CoveringLanes = LaneBitmask::getNone();
+ // If the subreg liveness is enabled, identify the subreg use(s) to try
+ // subreg reload. Skip if the instruction also defines the register.
+ // For copy bundles, get the covering lane masks.
+ if (MRI.subRegLivenessEnabled() && !RI.Writes) {
+ for (auto [MI, OpIdx] : Ops) {
+ const MachineOperand &MO = MI->getOperand(OpIdx);
+ assert(MO.isReg() && MO.getReg() == Reg);
+ if (MO.isUse()) {
+ SubReg = MO.getSubReg();
+ if (SubReg)
+ CoveringLanes |= TRI.getSubRegIndexLaneMask(SubReg);
+ }
+ }
+ }
+
+ if (MI.isBundled() && CoveringLanes.any()) {
+ CoveringLanes = LaneBitmask(bit_ceil(CoveringLanes.getAsInteger()) - 1);
----------------
cdevadas wrote:
cdevadas wrote:
Admittedly, some of the logic I used in this patch for the lane mask manipulation is somewhat hacky. The code here is needed to correctly handle copy bundles where the individual copies may target non-contiguous subregisters of a tuple. For instance, consider a bundle containing two copies: one covering sub0_sub1 and another covering sub3 of a 256-bit tuple. With the bit_ceil-based compaction, the resulting lane mask becomes the contiguous sub0_sub1_sub2_sub3 by filling in sub2, which is a valid subreg index for this tuple, and the `getSubRegIdxFromLaneMask` helper I added returns the correct SubRegIdx.
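To make the compaction concrete, here is a minimal standalone sketch of the bit_ceil trick, using plain integers in place of `LaneBitmask`; the lane values and names are illustrative assumptions, not the actual lane encodings or LLVM helpers used in the patch:

```cpp
#include <bit>
#include <cstdint>
#include <cstdio>

int main() {
  // Hypothetical per-lane bits for a 256-bit tuple: bit i stands for sub<i>.
  constexpr uint32_t Sub0Sub1 = 0b0011; // first copy covers sub0_sub1
  constexpr uint32_t Sub3     = 0b1000; // second copy covers sub3
  uint32_t CoveringLanes = Sub0Sub1 | Sub3; // 0b1011, non-contiguous

  // bit_ceil rounds up to the next power of two (0b10000 here); subtracting
  // one fills the sub2 hole, giving the contiguous 0b1111, i.e.
  // sub0_sub1_sub2_sub3.
  uint32_t Compacted = std::bit_ceil(CoveringLanes) - 1;

  std::printf("covering 0x%x -> compacted 0x%x\n", CoveringLanes, Compacted);
  return 0;
}
```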
Originally, I tried to use `getCoveringSubRegIndexes`. However, I found it isn't fully reliable for my use case. When the covering mask already represents a contiguous lane range (say sub0_sub1_sub2), the function fails to produce the correct index. The root cause is the check at this line https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetRegisterInfo.cpp#L558
Given that behavior, the compaction plus lane-mask-to-SubRegIdx approach was an option that consistently returns the correct index for these irregular bundles. I also understand that the way the lane mask is interpreted here is subtle, and I should find a better alternative to correctly gather the required info.
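For reference, a lane-mask-to-subreg-index lookup in the spirit of that helper could simply scan the target's subreg indices for an exact lane-mask match. This is only an illustrative sketch and may not correspond to the actual `getSubRegIdxFromLaneMask` added by the patch:

```cpp
#include "llvm/CodeGen/TargetRegisterInfo.h"
#include "llvm/MC/LaneBitmask.h"

// Illustrative sketch only: return the subreg index whose lane mask exactly
// matches the (compacted) covering mask, or 0 if no single index matches.
static unsigned subRegIdxFromLaneMaskSketch(const llvm::TargetRegisterInfo &TRI,
                                            llvm::LaneBitmask Mask) {
  for (unsigned Idx = 1, E = TRI.getNumSubRegIndices(); Idx != E; ++Idx)
    if (TRI.getSubRegIndexLaneMask(Idx) == Mask)
      return Idx;
  return 0;
}
```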
https://github.com/llvm/llvm-project/pull/175002