[llvm] [TRI] Remove reserved registers in getRegPressureSetLimit (PR #118787)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 5 03:00:24 PST 2024
llvmbot wrote:
@llvm/pr-subscribers-backend-powerpc
Author: Pengcheng Wang (wangpc-pp)
<details>
<summary>Changes</summary>
There are two `getRegPressureSetLimit` functions:
1. `RegisterClassInfo::getRegPressureSetLimit`.
2. `TargetRegisterInfo::getRegPressureSetLimit`.
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper around
`TargetRegisterInfo::getRegPressureSetLimit` with some logic to
adjust the limit by removing reserved registers.
It seems that `TargetRegisterInfo::getRegPressureSetLimit` shouldn't be
used directly, as its comment says: "This limit must be adjusted
dynamically for reserved registers".
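
The adjustment for reserved registers can be sketched as follows. This is a simplified, self-contained model rather than the LLVM code: `ToyRegClass` and `adjustLimit` are hypothetical names, and the step that selects the largest register class counting against the pressure set is omitted.

```cpp
#include <cassert>
#include <vector>

// Simplified stand-in for a register class; not an LLVM type.
struct ToyRegClass {
  std::vector<unsigned> Regs; // physical register ids in the class
  unsigned RegWeight;         // pressure units contributed by one register
};

// Adjust a raw pressure set limit by subtracting the weight of the reserved
// registers. Reserved[R] is true when register R is reserved.
unsigned adjustLimit(unsigned RawLimit, const ToyRegClass &RC,
                     const std::vector<bool> &Reserved) {
  unsigned NReserved = 0;
  for (unsigned Reg : RC.Regs)
    if (Reg < Reserved.size() && Reserved[Reg])
      ++NReserved;

  // If every register is reserved, fall back to the raw limit so that
  // callers never see zero.
  if (NReserved == RC.Regs.size())
    return RawLimit;
  return RawLimit - RC.RegWeight * NReserved;
}
```

For example, with a raw limit of 8, a per-register weight of 1, and 2 of 8 registers reserved, `adjustLimit` returns 6; if all 8 are reserved, it returns the raw limit of 8.
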
However, some passes use it directly, for example
`MachineLICM`, `MachineSink`, and `MachinePipeliner`, among others. In these
passes, the register pressure set limits are not adjusted for reserved
registers, which means that the limits are larger than the actual limits.
Having two `getRegPressureSetLimit` functions is messy and easily confuses
users. So this patch moves the logic that adjusts these limits for
reserved registers from `RegisterClassInfo::getRegPressureSetLimit`
to `TargetRegisterInfo::getRegPressureSetLimit`. This makes the former
a thin cached wrapper around the latter.
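
The "thin cached wrapper" pattern can be sketched as below. This is an illustrative model only: `ToyLimitCache` and its callback are hypothetical stand-ins, not LLVM types, and the real wrapper calls into `TargetRegisterInfo` rather than a `std::function`.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical cache modelled on the wrapper: a zero entry means
// "not computed yet", so the underlying computation should never return zero.
class ToyLimitCache {
  mutable std::vector<unsigned> PSetLimits;
  std::function<unsigned(unsigned)> Compute;

public:
  ToyLimitCache(unsigned NumPSets, std::function<unsigned(unsigned)> Fn)
      : PSetLimits(NumPSets, 0), Compute(std::move(Fn)) {}

  // Compute the limit on first use, then serve it from the cache.
  unsigned getLimit(unsigned Idx) const {
    if (!PSetLimits[Idx])
      PSetLimits[Idx] = Compute(Idx);
    return PSetLimits[Idx];
  }
};
```

Because zero doubles as the "not computed" marker, a computation that returned zero would be re-run on every query. This matches the patch's comment about avoiding a zero return when all registers of a class are reserved.
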
This change also helps to reduce the number of spills/reloads.
Here are the RISC-V statistics for spills/reloads on llvm-test-suite
with `-O3 -march=rva23u64`:
```
Metric: regalloc.NumSpills,regalloc.NumReloads

Program                                      regalloc.NumSpills            regalloc.NumReloads
                                               baseline     after      diff  baseline     after      diff
External/S...T2017speed/602.gcc_s/602.gcc_s    11811.00  11349.00   -462.00  26812.00  25793.00  -1019.00
External/S...NT2017rate/502.gcc_r/502.gcc_r    11811.00  11349.00   -462.00  26812.00  25793.00  -1019.00
External/S...te/526.blender_r/526.blender_r    13513.00  13251.00   -262.00  27462.00  27195.00   -267.00
SingleSour...nchmarks/Adobe-C++/loop_unroll     1533.00   1413.00   -120.00   2943.00   2633.00   -310.00
External/S...00.perlbench_s/600.perlbench_s     4398.00   4280.00   -118.00   9745.00   9466.00   -279.00
External/S...00.perlbench_r/500.perlbench_r     4398.00   4280.00   -118.00   9745.00   9466.00   -279.00
External/S...rate/510.parest_r/510.parest_r    43985.00  43888.00    -97.00  87407.00  87330.00    -77.00
MultiSourc...sumer-typeset/consumer-typeset     1222.00   1129.00    -93.00   3048.00   2887.00   -161.00
External/S...ed/638.imagick_s/638.imagick_s     4155.00   4064.00    -91.00  10556.00  10463.00    -93.00
External/S...te/538.imagick_r/538.imagick_r     4155.00   4064.00    -91.00  10556.00  10463.00    -93.00
External/S...rate/511.povray_r/511.povray_r     1734.00   1657.00    -77.00   3410.00   3290.00   -120.00
MultiSourc...e/Applications/ClamAV/clamscan     2120.00   2049.00    -71.00   5041.00   4994.00    -47.00
External/S...23.xalancbmk_s/623.xalancbmk_s     1664.00   1608.00    -56.00   2758.00   2663.00    -95.00
External/S...23.xalancbmk_r/523.xalancbmk_r     1664.00   1608.00    -56.00   2758.00   2663.00    -95.00
MultiSource/Applications/SPASS/SPASS            1442.00   1388.00    -54.00   2954.00   2849.00   -105.00

      regalloc.NumSpills                    regalloc.NumReloads
run         baseline       after        diff    baseline       after        diff
mean       86.864054   85.415094   -1.448960  173.354136  170.657475   -2.696661
```
---
Patch is 163.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118787.diff
26 Files Affected:
- (modified) llvm/include/llvm/CodeGen/RegisterClassInfo.h (+1-6)
- (modified) llvm/include/llvm/CodeGen/TargetRegisterInfo.h (+7-2)
- (modified) llvm/lib/CodeGen/MachinePipeliner.cpp (-41)
- (modified) llvm/lib/CodeGen/RegisterClassInfo.cpp (-37)
- (modified) llvm/lib/CodeGen/TargetRegisterInfo.cpp (+44)
- (modified) llvm/test/CodeGen/LoongArch/jr-without-ra.ll (+56-56)
- (modified) llvm/test/CodeGen/NVPTX/misched_func_call.ll (+3-4)
- (modified) llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir (-1)
- (modified) llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir (-1)
- (modified) llvm/test/CodeGen/PowerPC/compute-regpressure.ll (+2-2)
- (modified) llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll (+3-2)
- (modified) llvm/test/CodeGen/Thumb2/mve-blockplacement.ll (+61-63)
- (modified) llvm/test/CodeGen/Thumb2/mve-gather-increment.ll (+383-405)
- (modified) llvm/test/CodeGen/Thumb2/mve-gather-scatter-optimisation.ll (+70-70)
- (modified) llvm/test/CodeGen/Thumb2/mve-pipelineloops.ll (+32-43)
- (modified) llvm/test/CodeGen/X86/avx512-regcall-Mask.ll (+2-2)
- (modified) llvm/test/CodeGen/X86/avx512-regcall-NoMask.ll (+4-4)
- (modified) llvm/test/CodeGen/X86/sse-regcall.ll (+4-4)
- (modified) llvm/test/CodeGen/X86/sse-regcall4.ll (+4-4)
- (modified) llvm/test/CodeGen/X86/subvectorwise-store-of-vector-splat.ll (+169-166)
- (modified) llvm/test/CodeGen/X86/unfold-masked-merge-vector-variablemask.ll (+294-262)
- (modified) llvm/test/CodeGen/X86/x86-64-flags-intrinsics.ll (+8-8)
- (modified) llvm/test/TableGen/bare-minimum-psets.td (+1-1)
- (modified) llvm/test/TableGen/inhibit-pset.td (+1-1)
- (modified) llvm/unittests/CodeGen/MFCommon.inc (+2-2)
- (modified) llvm/utils/TableGen/RegisterInfoEmitter.cpp (+4-3)
``````````diff
diff --git a/llvm/include/llvm/CodeGen/RegisterClassInfo.h b/llvm/include/llvm/CodeGen/RegisterClassInfo.h
index 800bebea0dddb0..417a1e40d02b95 100644
--- a/llvm/include/llvm/CodeGen/RegisterClassInfo.h
+++ b/llvm/include/llvm/CodeGen/RegisterClassInfo.h
@@ -141,16 +141,11 @@ class RegisterClassInfo {
}
/// Get the register unit limit for the given pressure set index.
- ///
- /// RegisterClassInfo adjusts this limit for reserved registers.
unsigned getRegPressureSetLimit(unsigned Idx) const {
if (!PSetLimits[Idx])
- PSetLimits[Idx] = computePSetLimit(Idx);
+ PSetLimits[Idx] = TRI->getRegPressureSetLimit(*MF, Idx);
return PSetLimits[Idx];
}
-
-protected:
- unsigned computePSetLimit(unsigned Idx) const;
};
} // end namespace llvm
diff --git a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
index 292fa3c94969be..f7cd7cfe1aa15b 100644
--- a/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
+++ b/llvm/include/llvm/CodeGen/TargetRegisterInfo.h
@@ -913,9 +913,14 @@ class TargetRegisterInfo : public MCRegisterInfo {
virtual const char *getRegPressureSetName(unsigned Idx) const = 0;
/// Get the register unit pressure limit for this dimension.
- /// This limit must be adjusted dynamically for reserved registers.
+ /// TargetRegisterInfo adjusts this limit for reserved registers.
virtual unsigned getRegPressureSetLimit(const MachineFunction &MF,
- unsigned Idx) const = 0;
+ unsigned Idx) const;
+
+ /// Get the raw register unit pressure limit for this dimension.
+ /// This limit must be adjusted dynamically for reserved registers.
+ virtual unsigned getRawRegPressureSetLimit(const MachineFunction &MF,
+ unsigned Idx) const = 0;
/// Get the dimensions of register pressure impacted by this register class.
/// Returns a -1 terminated array of pressure set IDs.
diff --git a/llvm/lib/CodeGen/MachinePipeliner.cpp b/llvm/lib/CodeGen/MachinePipeliner.cpp
index 7a10bd39e2695d..3ee0ba1fea5079 100644
--- a/llvm/lib/CodeGen/MachinePipeliner.cpp
+++ b/llvm/lib/CodeGen/MachinePipeliner.cpp
@@ -1327,47 +1327,6 @@ class HighRegisterPressureDetector {
void computePressureSetLimit(const RegisterClassInfo &RCI) {
for (unsigned PSet = 0; PSet < PSetNum; PSet++)
PressureSetLimit[PSet] = TRI->getRegPressureSetLimit(MF, PSet);
-
- // We assume fixed registers, such as stack pointer, are already in use.
- // Therefore subtracting the weight of the fixed registers from the limit of
- // each pressure set in advance.
- SmallDenseSet<Register, 8> FixedRegs;
- for (const TargetRegisterClass *TRC : TRI->regclasses()) {
- for (const MCPhysReg Reg : *TRC)
- if (isFixedRegister(Reg))
- FixedRegs.insert(Reg);
- }
-
- LLVM_DEBUG({
- for (auto Reg : FixedRegs) {
- dbgs() << printReg(Reg, TRI, 0, &MRI) << ": [";
- for (MCRegUnit Unit : TRI->regunits(Reg)) {
- const int *Sets = TRI->getRegUnitPressureSets(Unit);
- for (; *Sets != -1; Sets++) {
- dbgs() << TRI->getRegPressureSetName(*Sets) << ", ";
- }
- }
- dbgs() << "]\n";
- }
- });
-
- for (auto Reg : FixedRegs) {
- LLVM_DEBUG(dbgs() << "fixed register: " << printReg(Reg, TRI, 0, &MRI)
- << "\n");
- for (MCRegUnit Unit : TRI->regunits(Reg)) {
- auto PSetIter = MRI.getPressureSets(Unit);
- unsigned Weight = PSetIter.getWeight();
- for (; PSetIter.isValid(); ++PSetIter) {
- unsigned &Limit = PressureSetLimit[*PSetIter];
- assert(
- Limit >= Weight &&
- "register pressure limit must be greater than or equal weight");
- Limit -= Weight;
- LLVM_DEBUG(dbgs() << "PSet=" << *PSetIter << " Limit=" << Limit
- << " (decreased by " << Weight << ")\n");
- }
- }
- }
}
// There are two patterns of last-use.
diff --git a/llvm/lib/CodeGen/RegisterClassInfo.cpp b/llvm/lib/CodeGen/RegisterClassInfo.cpp
index 9312bc03bc522a..976d41a54da56f 100644
--- a/llvm/lib/CodeGen/RegisterClassInfo.cpp
+++ b/llvm/lib/CodeGen/RegisterClassInfo.cpp
@@ -195,40 +195,3 @@ void RegisterClassInfo::compute(const TargetRegisterClass *RC) const {
// RCI is now up-to-date.
RCI.Tag = Tag;
}
-
-/// This is not accurate because two overlapping register sets may have some
-/// nonoverlapping reserved registers. However, computing the allocation order
-/// for all register classes would be too expensive.
-unsigned RegisterClassInfo::computePSetLimit(unsigned Idx) const {
- const TargetRegisterClass *RC = nullptr;
- unsigned NumRCUnits = 0;
- for (const TargetRegisterClass *C : TRI->regclasses()) {
- const int *PSetID = TRI->getRegClassPressureSets(C);
- for (; *PSetID != -1; ++PSetID) {
- if ((unsigned)*PSetID == Idx)
- break;
- }
- if (*PSetID == -1)
- continue;
-
- // Found a register class that counts against this pressure set.
- // For efficiency, only compute the set order for the largest set.
- unsigned NUnits = TRI->getRegClassWeight(C).WeightLimit;
- if (!RC || NUnits > NumRCUnits) {
- RC = C;
- NumRCUnits = NUnits;
- }
- }
- assert(RC && "Failed to find register class");
- compute(RC);
- unsigned NAllocatableRegs = getNumAllocatableRegs(RC);
- unsigned RegPressureSetLimit = TRI->getRegPressureSetLimit(*MF, Idx);
- // If all the regs are reserved, return raw RegPressureSetLimit.
- // One example is VRSAVERC in PowerPC.
- // Avoid returning zero, getRegPressureSetLimit(Idx) assumes computePSetLimit
- // return non-zero value.
- if (NAllocatableRegs == 0)
- return RegPressureSetLimit;
- unsigned NReserved = RC->getNumRegs() - NAllocatableRegs;
- return RegPressureSetLimit - TRI->getRegClassWeight(RC).RegWeight * NReserved;
-}
diff --git a/llvm/lib/CodeGen/TargetRegisterInfo.cpp b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
index 032f1a33e75c43..4cede283a7232c 100644
--- a/llvm/lib/CodeGen/TargetRegisterInfo.cpp
+++ b/llvm/lib/CodeGen/TargetRegisterInfo.cpp
@@ -674,6 +674,50 @@ TargetRegisterInfo::prependOffsetExpression(const DIExpression *Expr,
PrependFlags & DIExpression::EntryValue);
}
+unsigned TargetRegisterInfo::getRegPressureSetLimit(const MachineFunction &MF,
+ unsigned Idx) const {
+ const TargetRegisterClass *RC = nullptr;
+ unsigned NumRCUnits = 0;
+ for (const TargetRegisterClass *C : regclasses()) {
+ const int *PSetID = getRegClassPressureSets(C);
+ for (; *PSetID != -1; ++PSetID) {
+ if ((unsigned)*PSetID == Idx)
+ break;
+ }
+ if (*PSetID == -1)
+ continue;
+
+ // Found a register class that counts against this pressure set.
+ // For efficiency, only compute the set order for the largest set.
+ unsigned NUnits = getRegClassWeight(C).WeightLimit;
+ if (!RC || NUnits > NumRCUnits) {
+ RC = C;
+ NumRCUnits = NUnits;
+ }
+ }
+ assert(RC && "Failed to find register class");
+
+ unsigned NReserved = 0;
+ const BitVector Reserved = MF.getRegInfo().getReservedRegs();
+ for (unsigned PhysReg : RC->getRawAllocationOrder(MF))
+ if (Reserved.test(PhysReg))
+ NReserved++;
+
+ unsigned NAllocatableRegs = RC->getNumRegs() - NReserved;
+ unsigned RegPressureSetLimit = getRawRegPressureSetLimit(MF, Idx);
+ // If all the regs are reserved, return raw RegPressureSetLimit.
+ // One example is VRSAVERC in PowerPC.
+ // Avoid returning zero, RegisterClassInfo::getRegPressureSetLimit(Idx)
+ // assumes this returns non-zero value.
+ if (NAllocatableRegs == 0) {
+ LLVM_DEBUG({
+ dbgs() << "All registers of " << getRegClassName(RC) << " are reserved!";
+ });
+ return RegPressureSetLimit;
+ }
+ return RegPressureSetLimit - getRegClassWeight(RC).RegWeight * NReserved;
+}
+
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
LLVM_DUMP_METHOD
void TargetRegisterInfo::dumpReg(Register Reg, unsigned SubRegIndex,
diff --git a/llvm/test/CodeGen/LoongArch/jr-without-ra.ll b/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
index d1c4459aaa6ee0..2bd89dacb2b37a 100644
--- a/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
+++ b/llvm/test/CodeGen/LoongArch/jr-without-ra.ll
@@ -20,101 +20,101 @@ define void @jr_without_ra(ptr %rtwdev, ptr %chan, ptr %h2c, i8 %.pre, i1 %cmp.i
; CHECK-NEXT: st.d $s6, $sp, 24 # 8-byte Folded Spill
; CHECK-NEXT: st.d $s7, $sp, 16 # 8-byte Folded Spill
; CHECK-NEXT: st.d $s8, $sp, 8 # 8-byte Folded Spill
-; CHECK-NEXT: move $s7, $zero
-; CHECK-NEXT: move $s0, $zero
+; CHECK-NEXT: move $s6, $zero
+; CHECK-NEXT: move $s1, $zero
; CHECK-NEXT: ld.d $t0, $sp, 184
-; CHECK-NEXT: ld.d $s2, $sp, 176
-; CHECK-NEXT: ld.d $s1, $sp, 168
-; CHECK-NEXT: ld.d $t1, $sp, 160
-; CHECK-NEXT: ld.d $t2, $sp, 152
-; CHECK-NEXT: ld.d $t3, $sp, 144
-; CHECK-NEXT: ld.d $t4, $sp, 136
-; CHECK-NEXT: ld.d $t5, $sp, 128
-; CHECK-NEXT: ld.d $t6, $sp, 120
-; CHECK-NEXT: ld.d $t7, $sp, 112
-; CHECK-NEXT: ld.d $t8, $sp, 104
-; CHECK-NEXT: ld.d $fp, $sp, 96
+; CHECK-NEXT: ld.d $t1, $sp, 176
+; CHECK-NEXT: ld.d $s2, $sp, 168
+; CHECK-NEXT: ld.d $t2, $sp, 160
+; CHECK-NEXT: ld.d $t3, $sp, 152
+; CHECK-NEXT: ld.d $t4, $sp, 144
+; CHECK-NEXT: ld.d $t5, $sp, 136
+; CHECK-NEXT: ld.d $t6, $sp, 128
+; CHECK-NEXT: ld.d $t7, $sp, 120
+; CHECK-NEXT: ld.d $t8, $sp, 112
+; CHECK-NEXT: ld.d $fp, $sp, 104
+; CHECK-NEXT: ld.d $s0, $sp, 96
; CHECK-NEXT: andi $a4, $a4, 1
-; CHECK-NEXT: alsl.d $a6, $a6, $s1, 4
-; CHECK-NEXT: pcalau12i $s1, %pc_hi20(.LJTI0_0)
-; CHECK-NEXT: addi.d $s1, $s1, %pc_lo12(.LJTI0_0)
-; CHECK-NEXT: slli.d $s3, $s2, 2
-; CHECK-NEXT: alsl.d $s2, $s2, $s3, 1
-; CHECK-NEXT: add.d $s2, $t5, $s2
-; CHECK-NEXT: addi.w $s4, $zero, -41
+; CHECK-NEXT: alsl.d $a6, $a6, $s2, 4
+; CHECK-NEXT: pcalau12i $s2, %pc_hi20(.LJTI0_0)
+; CHECK-NEXT: addi.d $s2, $s2, %pc_lo12(.LJTI0_0)
; CHECK-NEXT: ori $s3, $zero, 1
-; CHECK-NEXT: slli.d $s4, $s4, 3
-; CHECK-NEXT: ori $s6, $zero, 3
-; CHECK-NEXT: lu32i.d $s6, 262144
+; CHECK-NEXT: ori $s4, $zero, 50
+; CHECK-NEXT: ori $s5, $zero, 3
+; CHECK-NEXT: lu32i.d $s5, 262144
; CHECK-NEXT: b .LBB0_4
; CHECK-NEXT: .p2align 4, , 16
; CHECK-NEXT: .LBB0_1: # %sw.bb27.i.i
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: ori $s8, $zero, 1
+; CHECK-NEXT: ori $s7, $zero, 1
; CHECK-NEXT: .LBB0_2: # %if.else.i106
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: alsl.d $s5, $s0, $s0, 3
-; CHECK-NEXT: alsl.d $s0, $s5, $s0, 1
-; CHECK-NEXT: add.d $s0, $t0, $s0
-; CHECK-NEXT: ldx.bu $s8, $s0, $s8
+; CHECK-NEXT: alsl.d $s8, $s1, $s1, 3
+; CHECK-NEXT: alsl.d $s1, $s8, $s1, 1
+; CHECK-NEXT: add.d $s1, $t0, $s1
+; CHECK-NEXT: ldx.bu $s7, $s1, $s7
; CHECK-NEXT: .LBB0_3: # %phy_tssi_get_ofdm_de.exit
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: st.b $zero, $t5, 0
-; CHECK-NEXT: st.b $s7, $t3, 0
-; CHECK-NEXT: st.b $zero, $t8, 0
-; CHECK-NEXT: st.b $zero, $t1, 0
-; CHECK-NEXT: st.b $zero, $a1, 0
+; CHECK-NEXT: st.b $zero, $t6, 0
+; CHECK-NEXT: st.b $s6, $t4, 0
+; CHECK-NEXT: st.b $zero, $fp, 0
; CHECK-NEXT: st.b $zero, $t2, 0
-; CHECK-NEXT: st.b $s8, $a5, 0
-; CHECK-NEXT: ori $s0, $zero, 1
-; CHECK-NEXT: move $s7, $a3
+; CHECK-NEXT: st.b $zero, $a1, 0
+; CHECK-NEXT: st.b $zero, $t3, 0
+; CHECK-NEXT: st.b $s7, $a5, 0
+; CHECK-NEXT: ori $s1, $zero, 1
+; CHECK-NEXT: move $s6, $a3
; CHECK-NEXT: .LBB0_4: # %for.body
; CHECK-NEXT: # =>This Inner Loop Header: Depth=1
; CHECK-NEXT: beqz $a4, .LBB0_9
; CHECK-NEXT: # %bb.5: # %calc_6g.i
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: move $s7, $zero
+; CHECK-NEXT: move $s6, $zero
; CHECK-NEXT: bnez $zero, .LBB0_8
; CHECK-NEXT: # %bb.6: # %calc_6g.i
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: slli.d $s8, $zero, 3
-; CHECK-NEXT: ldx.d $s8, $s8, $s1
-; CHECK-NEXT: jr $s8
+; CHECK-NEXT: slli.d $s7, $zero, 3
+; CHECK-NEXT: ldx.d $s7, $s7, $s2
+; CHECK-NEXT: jr $s7
; CHECK-NEXT: .LBB0_7: # %sw.bb12.i.i
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: ori $s7, $zero, 1
+; CHECK-NEXT: ori $s6, $zero, 1
; CHECK-NEXT: .LBB0_8: # %if.else58.i
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: ldx.bu $s7, $a6, $s7
+; CHECK-NEXT: ldx.bu $s6, $a6, $s6
; CHECK-NEXT: b .LBB0_11
; CHECK-NEXT: .p2align 4, , 16
; CHECK-NEXT: .LBB0_9: # %if.end.i
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: andi $s7, $s7, 255
-; CHECK-NEXT: ori $s5, $zero, 50
-; CHECK-NEXT: bltu $s5, $s7, .LBB0_15
+; CHECK-NEXT: andi $s6, $s6, 255
+; CHECK-NEXT: bltu $s4, $s6, .LBB0_15
; CHECK-NEXT: # %bb.10: # %if.end.i
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: sll.d $s7, $s3, $s7
-; CHECK-NEXT: and $s8, $s7, $s6
-; CHECK-NEXT: move $s7, $fp
-; CHECK-NEXT: beqz $s8, .LBB0_15
+; CHECK-NEXT: sll.d $s6, $s3, $s6
+; CHECK-NEXT: and $s7, $s6, $s5
+; CHECK-NEXT: move $s6, $s0
+; CHECK-NEXT: beqz $s7, .LBB0_15
; CHECK-NEXT: .LBB0_11: # %phy_tssi_get_ofdm_trim_de.exit
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
-; CHECK-NEXT: move $s8, $zero
-; CHECK-NEXT: st.b $zero, $t7, 0
-; CHECK-NEXT: ldx.b $ra, $s2, $t4
+; CHECK-NEXT: move $s7, $zero
+; CHECK-NEXT: st.b $zero, $t8, 0
+; CHECK-NEXT: slli.d $s8, $t1, 2
+; CHECK-NEXT: alsl.d $s8, $t1, $s8, 1
+; CHECK-NEXT: add.d $s8, $t6, $s8
+; CHECK-NEXT: ldx.b $s8, $s8, $t5
; CHECK-NEXT: st.b $zero, $a2, 0
; CHECK-NEXT: st.b $zero, $a7, 0
-; CHECK-NEXT: st.b $zero, $t6, 0
-; CHECK-NEXT: st.b $ra, $a0, 0
+; CHECK-NEXT: st.b $zero, $t7, 0
+; CHECK-NEXT: st.b $s8, $a0, 0
; CHECK-NEXT: bnez $s3, .LBB0_13
; CHECK-NEXT: # %bb.12: # %phy_tssi_get_ofdm_trim_de.exit
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
+; CHECK-NEXT: addi.w $s8, $zero, -41
+; CHECK-NEXT: slli.d $s8, $s8, 3
; CHECK-NEXT: pcalau12i $ra, %pc_hi20(.LJTI0_1)
; CHECK-NEXT: addi.d $ra, $ra, %pc_lo12(.LJTI0_1)
-; CHECK-NEXT: ldx.d $s5, $s4, $ra
-; CHECK-NEXT: jr $s5
+; CHECK-NEXT: ldx.d $s8, $s8, $ra
+; CHECK-NEXT: jr $s8
; CHECK-NEXT: .LBB0_13: # %phy_tssi_get_ofdm_trim_de.exit
; CHECK-NEXT: # in Loop: Header=BB0_4 Depth=1
; CHECK-NEXT: bnez $s3, .LBB0_1
diff --git a/llvm/test/CodeGen/NVPTX/misched_func_call.ll b/llvm/test/CodeGen/NVPTX/misched_func_call.ll
index e036753ce90306..ee6b5869111c6f 100644
--- a/llvm/test/CodeGen/NVPTX/misched_func_call.ll
+++ b/llvm/test/CodeGen/NVPTX/misched_func_call.ll
@@ -17,7 +17,6 @@ define ptx_kernel void @wombat(i32 %arg, i32 %arg1, i32 %arg2) {
; CHECK-NEXT: ld.param.u32 %r2, [wombat_param_0];
; CHECK-NEXT: mov.b32 %r10, 0;
; CHECK-NEXT: mov.u64 %rd1, 0;
-; CHECK-NEXT: mov.b32 %r6, 1;
; CHECK-NEXT: $L__BB0_1: // %bb3
; CHECK-NEXT: // =>This Inner Loop Header: Depth=1
; CHECK-NEXT: { // callseq 0, 0
@@ -29,16 +28,16 @@ define ptx_kernel void @wombat(i32 %arg, i32 %arg1, i32 %arg2) {
; CHECK-NEXT: (
; CHECK-NEXT: param0
; CHECK-NEXT: );
+; CHECK-NEXT: ld.param.f64 %fd1, [retval0];
+; CHECK-NEXT: } // callseq 0
; CHECK-NEXT: mul.lo.s32 %r7, %r10, %r3;
; CHECK-NEXT: or.b32 %r8, %r4, %r7;
; CHECK-NEXT: mul.lo.s32 %r9, %r2, %r8;
; CHECK-NEXT: cvt.rn.f64.s32 %fd3, %r9;
-; CHECK-NEXT: ld.param.f64 %fd1, [retval0];
-; CHECK-NEXT: } // callseq 0
; CHECK-NEXT: cvt.rn.f64.u32 %fd4, %r10;
; CHECK-NEXT: add.rn.f64 %fd5, %fd4, %fd3;
; CHECK-NEXT: st.global.f64 [%rd1], %fd5;
-; CHECK-NEXT: mov.u32 %r10, %r6;
+; CHECK-NEXT: mov.b32 %r10, 1;
; CHECK-NEXT: bra.uni $L__BB0_1;
bb:
br label %bb3
diff --git a/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir b/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
index fba410dc0dafce..7c8a5848b402f4 100644
--- a/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
+++ b/llvm/test/CodeGen/PowerPC/aix-csr-alloc.mir
@@ -17,5 +17,4 @@ body: |
...
# CHECK-DAG: AllocationOrder(GPRC) = [ $r3 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $r12 $r0 $r31 $r30 $r29 $r28 $r27 $r26 $r25 $r24 $r23 $r22 $r21 $r20 $r19 $r18 $r17 $r16 $r15 $r14 $r13 ]
-# CHECK-DAG: AllocationOrder(F4RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
# CHECK-DAG: AllocationOrder(GPRC_and_GPRC_NOR0) = [ $r3 $r4 $r5 $r6 $r7 $r8 $r9 $r10 $r11 $r12 $r31 $r30 $r29 $r28 $r27 $r26 $r25 $r24 $r23 $r22 $r21 $r20 $r19 $r18 $r17 $r16 $r15 $r14 $r13 ]
diff --git a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
index 584b6b0ad46dd9..3617b95b2a6af7 100644
--- a/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
+++ b/llvm/test/CodeGen/PowerPC/aix64-csr-alloc.mir
@@ -16,6 +16,5 @@ body: |
$f1 = COPY %2
BLR8 implicit $lr8, implicit undef $rm, implicit $x3, implicit $f1
...
-# CHECK-DAG: AllocationOrder(VFRC) = [ $vf2 $vf3 $vf4 $vf5 $vf0 $vf1 $vf6 $vf7 $vf8 $vf9 $vf10 $vf11 $vf12 $vf13 $vf14 $vf15 $vf16 $vf17 $vf18 $vf19 $vf31 $vf30 $vf29 $vf28 $vf27 $vf26 $vf25 $vf24 $vf23 $vf22 $vf21 $vf20 ]
# CHECK-DAG: AllocationOrder(G8RC_and_G8RC_NOX0) = [ $x3 $x4 $x5 $x6 $x7 $x8 $x9 $x10 $x11 $x12 $x2 $x31 $x30 $x29 $x28 $x27 $x26 $x25 $x24 $x23 $x22 $x21 $x20 $x19 $x18 $x17 $x16 $x15 $x14 ]
# CHECK-DAG: AllocationOrder(F8RC) = [ $f0 $f1 $f2 $f3 $f4 $f5 $f6 $f7 $f8 $f9 $f10 $f11 $f12 $f13 $f31 $f30 $f29 $f28 $f27 $f26 $f25 $f24 $f23 $f22 $f21 $f20 $f19 $f18 $f17 $f16 $f15 $f14 ]
diff --git a/llvm/test/CodeGen/PowerPC/compute-regpressure.ll b/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
index 9a1b057c2e38d4..9d893b8dbebee2 100644
--- a/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
+++ b/llvm/test/CodeGen/PowerPC/compute-regpressure.ll
@@ -1,7 +1,7 @@
; REQUIRES: asserts
-; RUN: llc -debug-only=regalloc < %s 2>&1 |FileCheck %s --check-prefix=DEBUG
+; RUN: llc -debug-only=target-reg-info < %s 2>&1 |FileCheck %s --check-prefix=DEBUG
-; DEBUG-COUNT-1: AllocationOrder(VRSAVERC) = [ ]
+; DEBUG-COUNT-1: All registers of VRSAVERC are reserved!
target triple = "powerpc64le-unknown-linux-gnu"
diff --git a/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll b/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
index c35f05be304cce..ec2448cb3965f3 100644
--- a/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
+++ b/llvm/test/CodeGen/RISCV/rvv/vxrm-insert-out-of-loop.ll
@@ -489,8 +489,9 @@ define void @test1(ptr nocapture noundef writeonly %dst, i32 noundef signext %i_
; RV64-NEXT: j .LBB0_11
; RV64-NEXT: .LBB0_8: # %vector.ph
; RV64-NEXT: # in Loop: Header=BB0_6 Depth=1
-; RV64-NEXT: slli t6, t0, 28
-; RV64-NEXT: sub t6, t6, t1
+; RV64-NEXT: slli t6, t0, 1
+; RV64-NEXT: slli s0, t0, 28
+; RV64-NEXT: sub t6, s0, t6
; RV64-NEXT: and t6, t6, a6
; RV64-NEXT: csrwi vxrm, 0
; RV64-NEXT: mv s0, a2
diff --git a/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll b/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
index 7087041e8dace6..6d082802f9cd75 100644
--- a/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
+++ b/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
@@ -353,8 +353,8 @@ define i32 @d(i64 %e, i32 %f, i64 %g, i32 %h) {
; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, r9, r10, r11, lr}
; CHECK-NEXT: .pad #4
; CHECK-NEXT: sub sp, #4
-; CHECK-NEXT: .vsave {d8, d9, d10, d11, d12, d13, d14, d15}
-; CHECK-NEXT: vpush {d8, d9, d10, d11, d12, d13, d14, d15}
+; CHECK-NEXT: .vsave {d8, d9, d10, d11, d12, d13}
+; CHECK-NEXT: vpush {d8, d9, d10, d11, d12, d13}
; CHECK-NEXT: .pad #16
; CHECK-NEXT: sub sp, #16
; CHECK-NEXT: mov lr, r0
@@ -364,50 +364,48 @@ define i32 @d(i64 %e, i32 %f, i64 %g, i32 %h) {
; CHECK-NEXT: @ %bb.1: @ %for.cond2.preheader.lr.ph
; CHECK-NEXT: movs r0, #1
; CHECK-NEXT: cmp r2, #1
-; CHECK-NEXT: csel r7, r2, r0, lt
+; CHECK-NEXT: csel r3, r2, r0, lt
; CHECK-NEXT: mov r12, r1
-; CHECK-NEXT: mov r1, r7
-; CHECK-NEXT: cmp r7, #3
+; CHECK-NEXT: m...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/118787