[llvm] [AMDGPU] Fix GCNRewritePartialRegUses pass: vector regclass is selected instead of scalar. (PR #69957)
via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 23 11:18:56 PDT 2023
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-amdgpu
Author: Valery Pykhtin (vpykhtin)
<details>
<summary>Changes</summary>
For the following testcase:
undef %1.sub1:**sgpr_96** = COPY undef %0:sgpr_32
%3:vgpr_32 = V_LSHL_ADD_U32_e64 %1.sub1:sgpr_96, 2, undef %2:vgpr_32, implicit $exec
GCNRewritePartialRegUses produced:
%4:**vgpr_32** = COPY undef %1:sgpr_32
dead %2:vgpr_32 = V_LSHL_ADD_U32_e64 %4, 2, undef %3:vgpr_32, implicit $exec
Register class for %4 is incorrect: there should be _sgpr_32_ instead of _vgpr_32_ because the original %1 had scalar regclass. This happens because %4 participate only in two instructions from which:
- def in COPY has no reglass information from the instruction description.
- use in V_LSHL_ADD_U32_e64 has _VS_32_ class from the instruction description.
Thus GCNRewritePartialRegUses used only _VS_32_ class and selected vector subclass as the largest - what it missed here is that it should take into account the regclass for _%1.sub1:sgpr_96_. This patch fixes that, debug output after the fix:
```
Try to rewrite partial reg %0:SGPR_96
sub1:SGPR_32 ; <- regclass for %1.sub1:sgpr_96
sub1:SGPR_32 & VS_32 = SGPR_32 ; <- common class for %1.sub1:sgpr_96 and VS_32 from V_LSHL_ADD_U32_e64 opnd
Shift 32, reg align 32
sub1:SGPR_32 -> whole reg, num regclasses 1
Success %0:SGPR_96 -> %4:SGPR_32
```
And here starts another story. All of this deduction of final regclass using regclasses from instruction operands description is really not needed because _TRI->getSubRegisterClass_ for _%1.sub1:sgpr_96_ is already returning the right class - _SGPR_32_ as we see in the debug output above but that haven't been always like that. I did a bisect search and found that _TRI->getSubRegisterClass_ behavior has been fixed by the https://github.com/llvm/llvm-project/pull/67245, thanks to @<!-- -->kosarev. Before his fix getSubRegisterClass returned strange classes like:
`SGPR_128:sub0_sub1 -> SReg_1_with_sub0_and_SReg_1_with_lo16_in_SGPR_LO16`
which isn't even allocatable.
I'm going to cleanup the implementation but I would like to do this in a separate commit. I've tested my patch before the https://github.com/llvm/llvm-project/pull/67245 and it also works.
Note that GCNRewritePartialRegUses pass isn't enabled by default yet.
---
Patch is 321.63 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/69957.diff
4 Files Affected:
- (modified) llvm/lib/Target/AMDGPU/GCNRewritePartialRegUses.cpp (+30-29)
- (modified) llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-dbg.mir (+15-16)
- (modified) llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-gen.mir (+1806-1806)
- (modified) llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses.mir (+35-25)
``````````diff
diff --git a/llvm/lib/Target/AMDGPU/GCNRewritePartialRegUses.cpp b/llvm/lib/Target/AMDGPU/GCNRewritePartialRegUses.cpp
index 99db7e4af9fd1c9..c94e5c16fb2cf2e 100644
--- a/llvm/lib/Target/AMDGPU/GCNRewritePartialRegUses.cpp
+++ b/llvm/lib/Target/AMDGPU/GCNRewritePartialRegUses.cpp
@@ -101,17 +101,16 @@ class GCNRewritePartialRegUses : public MachineFunctionPass {
/// find new regclass such that:
/// 1. It has subregs obtained by shifting each OldSubReg by RShift number
/// of bits to the right. Every "shifted" subreg should have the same
- /// SubRegRC. SubRegRC can be null, in this case it initialized using
- /// getSubRegisterClass. If CoverSubregIdx is not zero it's a subreg that
- /// "covers" all other subregs in pairs. Basically such subreg becomes a
- /// whole register.
+ /// SubRegRC. If CoverSubregIdx is not zero it's a subreg that "covers"
+ /// all other subregs in pairs. Basically such subreg becomes a whole
+ /// register.
/// 2. Resulting register class contains registers of minimal size but not
/// less than RegNumBits.
///
/// SubRegs is map of OldSubReg -> [SubRegRC, NewSubReg] and is used as in/out
/// parameter:
/// OldSubReg - input parameter,
- /// SubRegRC - in/out, should be changed for unknown regclass,
+ /// SubRegRC - input parameter (cannot be null),
/// NewSubReg - output, contains shifted subregs on return.
const TargetRegisterClass *
getRegClassWithShiftedSubregs(const TargetRegisterClass *RC, unsigned RShift,
@@ -228,19 +227,7 @@ GCNRewritePartialRegUses::getRegClassWithShiftedSubregs(
BitVector ClassMask(getAllocatableAndAlignedRegClassMask(RCAlign));
for (auto &[OldSubReg, SRI] : SubRegs) {
auto &[SubRegRC, NewSubReg] = SRI;
-
- // Register class may be unknown, for example:
- // undef %0.sub4:sgpr_1024 = S_MOV_B32 01
- // %0.sub5:sgpr_1024 = S_MOV_B32 02
- // %1:vreg_64 = COPY %0.sub4_sub5
- // Register classes for subregs 'sub4' and 'sub5' are known from the
- // description of destination operand of S_MOV_B32 instruction but the
- // class for the subreg 'sub4_sub5' isn't specified by the COPY instruction.
- if (!SubRegRC)
- SubRegRC = TRI->getSubRegisterClass(RC, OldSubReg);
-
- if (!SubRegRC)
- return nullptr;
+ assert(SubRegRC);
LLVM_DEBUG(dbgs() << " " << TRI->getSubRegIndexName(OldSubReg) << ':'
<< TRI->getRegClassName(SubRegRC)
@@ -248,6 +235,8 @@ GCNRewritePartialRegUses::getRegClassWithShiftedSubregs(
<< " -> ");
if (OldSubReg == CoverSubregIdx) {
+ // Covering subreg will become a full register, RC should be allocatable.
+ assert(SubRegRC->isAllocatable());
NewSubReg = AMDGPU::NoSubRegister;
LLVM_DEBUG(dbgs() << "whole reg");
} else {
@@ -425,7 +414,7 @@ bool GCNRewritePartialRegUses::rewriteReg(Register Reg) const {
return false;
for (MachineOperand &MO : Range) {
- if (MO.getSubReg() == AMDGPU::NoSubRegister) // Whole reg used, quit.
+ if (MO.getSubReg() == AMDGPU::NoSubRegister) // Whole reg used, quit. [1]
return false;
}
@@ -433,21 +422,33 @@ bool GCNRewritePartialRegUses::rewriteReg(Register Reg) const {
LLVM_DEBUG(dbgs() << "Try to rewrite partial reg " << printReg(Reg, TRI)
<< ':' << TRI->getRegClassName(RC) << '\n');
- // Collect used subregs and constrained reg classes infered from instruction
+ // Collect used subregs and their reg classes infered from instruction
// operands.
SubRegMap SubRegs;
for (MachineOperand &MO : MRI->reg_nodbg_operands(Reg)) {
- assert(MO.getSubReg() != AMDGPU::NoSubRegister);
- auto *OpDescRC = getOperandRegClass(MO);
- const auto [I, Inserted] = SubRegs.try_emplace(MO.getSubReg(), OpDescRC);
- if (!Inserted && OpDescRC) {
- SubRegInfo &SRI = I->second;
- SRI.RC = SRI.RC ? TRI->getCommonSubClass(SRI.RC, OpDescRC) : OpDescRC;
- if (!SRI.RC) {
- LLVM_DEBUG(dbgs() << " Couldn't find common target regclass\n");
- return false;
+ const unsigned SubReg = MO.getSubReg();
+ assert(SubReg != AMDGPU::NoSubRegister); // Due to [1].
+ LLVM_DEBUG(dbgs() << " " << TRI->getSubRegIndexName(SubReg) << ':');
+
+ const auto [I, Inserted] = SubRegs.try_emplace(SubReg);
+ const TargetRegisterClass *&SubRegRC = I->second.RC;
+
+ if (Inserted)
+ SubRegRC = TRI->getSubRegisterClass(RC, SubReg);
+
+ if (SubRegRC) {
+ if (const TargetRegisterClass *OpDescRC = getOperandRegClass(MO)) {
+ LLVM_DEBUG(dbgs() << TRI->getRegClassName(SubRegRC) << " & "
+ << TRI->getRegClassName(OpDescRC) << " = ");
+ SubRegRC = TRI->getCommonSubClass(SubRegRC, OpDescRC);
}
}
+
+ if (!SubRegRC) {
+ LLVM_DEBUG(dbgs() << "couldn't find target regclass\n");
+ return false;
+ } else
+ LLVM_DEBUG(dbgs() << TRI->getRegClassName(SubRegRC) << '\n');
}
auto *NewRC = getMinSizeReg(RC, SubRegs);
diff --git a/llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-dbg.mir b/llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-dbg.mir
index 4e1f912a9d6f98e..85d0c054754d03d 100644
--- a/llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-dbg.mir
+++ b/llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-dbg.mir
@@ -7,7 +7,6 @@
unreachable, !dbg !11
}
- ; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
declare void @llvm.dbg.value(metadata, metadata, metadata) #0
attributes #0 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
@@ -36,21 +35,21 @@ name: test_vreg_96_w64
body: |
bb.0:
; CHECK-LABEL: name: test_vreg_96_w64
- ; CHECK: undef %3.sub0:vreg_64 = V_MOV_B32_e32 0, implicit $exec, debug-location !11
- ; CHECK-NEXT: DBG_VALUE %3.sub0, $noreg, !9, !DIExpression(), debug-location !11
- ; CHECK-NEXT: %3.sub1:vreg_64 = V_MOV_B32_e32 1, implicit $exec, debug-location !DILocation(line: 2, column: 1, scope: !5)
- ; CHECK-NEXT: DBG_VALUE %3.sub1, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 2, column: 1, scope: !5)
- ; CHECK-NEXT: S_NOP 0, implicit %3, debug-location !DILocation(line: 3, column: 1, scope: !5)
- ; CHECK-NEXT: undef %4.sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec, debug-location !DILocation(line: 4, column: 1, scope: !5)
- ; CHECK-NEXT: DBG_VALUE %4.sub0, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 4, column: 1, scope: !5)
- ; CHECK-NEXT: %4.sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec, debug-location !DILocation(line: 5, column: 1, scope: !5)
- ; CHECK-NEXT: DBG_VALUE %4.sub1, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 5, column: 1, scope: !5)
- ; CHECK-NEXT: S_NOP 0, implicit %4, debug-location !DILocation(line: 6, column: 1, scope: !5)
- ; CHECK-NEXT: undef %5.sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec, debug-location !DILocation(line: 4, column: 1, scope: !5)
- ; CHECK-NEXT: DBG_VALUE %5, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 4, column: 1, scope: !5)
- ; CHECK-NEXT: %5.sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec, debug-location !DILocation(line: 5, column: 1, scope: !5)
- ; CHECK-NEXT: DBG_VALUE %5, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 5, column: 1, scope: !5)
- ; CHECK-NEXT: S_NOP 0, implicit %5, debug-location !DILocation(line: 6, column: 1, scope: !5)
+ ; CHECK: undef [[V_MOV_B32_e32_:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 0, implicit $exec, debug-location !11
+ ; CHECK-NEXT: DBG_VALUE [[V_MOV_B32_e32_]].sub0, $noreg, !9, !DIExpression(), debug-location !11
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 1, implicit $exec, debug-location !DILocation(line: 2, column: 1, scope: !5)
+ ; CHECK-NEXT: DBG_VALUE [[V_MOV_B32_e32_]].sub1, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 2, column: 1, scope: !5)
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_]], debug-location !DILocation(line: 3, column: 1, scope: !5)
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_1:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec, debug-location !DILocation(line: 4, column: 1, scope: !5)
+ ; CHECK-NEXT: DBG_VALUE [[V_MOV_B32_e32_1]].sub0, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 4, column: 1, scope: !5)
+ ; CHECK-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec, debug-location !DILocation(line: 5, column: 1, scope: !5)
+ ; CHECK-NEXT: DBG_VALUE [[V_MOV_B32_e32_1]].sub1, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 5, column: 1, scope: !5)
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_1]], debug-location !DILocation(line: 6, column: 1, scope: !5)
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_2:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec, debug-location !DILocation(line: 4, column: 1, scope: !5)
+ ; CHECK-NEXT: DBG_VALUE [[V_MOV_B32_e32_2]], $noreg, !9, !DIExpression(), debug-location !DILocation(line: 4, column: 1, scope: !5)
+ ; CHECK-NEXT: [[V_MOV_B32_e32_2:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec, debug-location !DILocation(line: 5, column: 1, scope: !5)
+ ; CHECK-NEXT: DBG_VALUE [[V_MOV_B32_e32_2]], $noreg, !9, !DIExpression(), debug-location !DILocation(line: 5, column: 1, scope: !5)
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_2]], debug-location !DILocation(line: 6, column: 1, scope: !5)
undef %0.sub0:vreg_96 = V_MOV_B32_e32 0, implicit $exec, debug-location !11
DBG_VALUE %0.sub0, $noreg, !9, !DIExpression(), debug-location !11
%0.sub1:vreg_96 = V_MOV_B32_e32 1, implicit $exec, debug-location !DILocation(line: 2, column: 1, scope: !5)
diff --git a/llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-gen.mir b/llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-gen.mir
index d51e63f92e69141..037f39df8c3e06e 100644
--- a/llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-gen.mir
+++ b/llvm/test/CodeGen/AMDGPU/rewrite-partial-reg-uses-gen.mir
@@ -6,26 +6,26 @@ tracksRegLiveness: true
body: |
bb.0:
; CHECK-LABEL: name: test_subregs_composition_vreg_1024
- ; CHECK: undef %5.sub0:vreg_96 = V_MOV_B32_e32 1, implicit $exec
- ; CHECK-NEXT: %5.sub1:vreg_96 = V_MOV_B32_e32 2, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %5.sub0_sub1
- ; CHECK-NEXT: S_NOP 0, implicit %5.sub1_sub2
- ; CHECK-NEXT: undef %6.sub0:vreg_128 = V_MOV_B32_e32 11, implicit $exec
- ; CHECK-NEXT: %6.sub1:vreg_128 = V_MOV_B32_e32 12, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %6.sub0_sub1_sub2
- ; CHECK-NEXT: S_NOP 0, implicit %6.sub1_sub2_sub3
- ; CHECK-NEXT: undef %7.sub0:vreg_160 = V_MOV_B32_e32 21, implicit $exec
- ; CHECK-NEXT: %7.sub1:vreg_160 = V_MOV_B32_e32 22, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %7.sub0_sub1_sub2_sub3
- ; CHECK-NEXT: S_NOP 0, implicit %7.sub1_sub2_sub3_sub4
- ; CHECK-NEXT: undef %8.sub0:vreg_192 = V_MOV_B32_e32 31, implicit $exec
- ; CHECK-NEXT: %8.sub1:vreg_192 = V_MOV_B32_e32 32, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %8.sub0_sub1_sub2_sub3_sub4
- ; CHECK-NEXT: S_NOP 0, implicit %8.sub1_sub2_sub3_sub4_sub5
- ; CHECK-NEXT: undef %9.sub0:vreg_256 = V_MOV_B32_e32 41, implicit $exec
- ; CHECK-NEXT: %9.sub2:vreg_256 = V_MOV_B32_e32 43, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %9.sub0_sub1_sub2_sub3_sub4_sub5
- ; CHECK-NEXT: S_NOP 0, implicit %9.sub2_sub3_sub4_sub5_sub6_sub7
+ ; CHECK: undef [[V_MOV_B32_e32_:%[0-9]+]].sub0:vreg_96 = V_MOV_B32_e32 1, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub1:vreg_96 = V_MOV_B32_e32 2, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_]].sub0_sub1
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_]].sub1_sub2
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_1:%[0-9]+]].sub0:vreg_128 = V_MOV_B32_e32 11, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]].sub1:vreg_128 = V_MOV_B32_e32 12, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_1]].sub0_sub1_sub2
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_1]].sub1_sub2_sub3
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_2:%[0-9]+]].sub0:vreg_160 = V_MOV_B32_e32 21, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_2:%[0-9]+]].sub1:vreg_160 = V_MOV_B32_e32 22, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_2]].sub0_sub1_sub2_sub3
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_2]].sub1_sub2_sub3_sub4
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_3:%[0-9]+]].sub0:vreg_192 = V_MOV_B32_e32 31, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_3:%[0-9]+]].sub1:vreg_192 = V_MOV_B32_e32 32, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_3]].sub0_sub1_sub2_sub3_sub4
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_3]].sub1_sub2_sub3_sub4_sub5
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_4:%[0-9]+]].sub0:vreg_256 = V_MOV_B32_e32 41, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_4:%[0-9]+]].sub2:vreg_256 = V_MOV_B32_e32 43, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_4]].sub0_sub1_sub2_sub3_sub4_sub5
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_4]].sub2_sub3_sub4_sub5_sub6_sub7
undef %0.sub1:vreg_1024 = V_MOV_B32_e32 01, implicit $exec
%0.sub2:vreg_1024 = V_MOV_B32_e32 02, implicit $exec
S_NOP 0, implicit %0.sub1_sub2
@@ -97,12 +97,12 @@ tracksRegLiveness: true
body: |
bb.0:
; CHECK-LABEL: name: test_vreg_96_w64
- ; CHECK: undef %2.sub0:vreg_64 = V_MOV_B32_e32 0, implicit $exec
- ; CHECK-NEXT: %2.sub1:vreg_64 = V_MOV_B32_e32 1, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %2
- ; CHECK-NEXT: undef %3.sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec
- ; CHECK-NEXT: %3.sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %3
+ ; CHECK: undef [[V_MOV_B32_e32_:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 0, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 1, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_]]
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_1:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_1]]
undef %0.sub0:vreg_96 = V_MOV_B32_e32 00, implicit $exec
%0.sub1:vreg_96 = V_MOV_B32_e32 01, implicit $exec
S_NOP 0, implicit %0.sub0_sub1
@@ -140,15 +140,15 @@ tracksRegLiveness: true
body: |
bb.0:
; CHECK-LABEL: name: test_vreg_128_w64
- ; CHECK: undef %3.sub0:vreg_64 = V_MOV_B32_e32 0, implicit $exec
- ; CHECK-NEXT: %3.sub1:vreg_64 = V_MOV_B32_e32 1, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %3
- ; CHECK-NEXT: undef %4.sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec
- ; CHECK-NEXT: %4.sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %4
- ; CHECK-NEXT: undef %5.sub0:vreg_64 = V_MOV_B32_e32 22, implicit $exec
- ; CHECK-NEXT: %5.sub1:vreg_64 = V_MOV_B32_e32 23, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %5
+ ; CHECK: undef [[V_MOV_B32_e32_:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 0, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 1, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_]]
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_1:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_1]]
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_2:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 22, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_2:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 23, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_2]]
undef %0.sub0:vreg_128 = V_MOV_B32_e32 00, implicit $exec
%0.sub1:vreg_128 = V_MOV_B32_e32 01, implicit $exec
S_NOP 0, implicit %0.sub0_sub1
@@ -168,14 +168,14 @@ tracksRegLiveness: true
body: |
bb.0:
; CHECK-LABEL: name: test_vreg_128_w96
- ; CHECK: undef %2.sub0:vreg_96 = V_MOV_B32_e32 0, implicit $exec
- ; CHECK-NEXT: %2.sub1:vreg_96 = V_MOV_B32_e32 1, implicit $exec
- ; CHECK-NEXT: %2.sub2:vreg_96 = V_MOV_B32_e32 2, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %2
- ; CHECK-NEXT: undef %3.sub0:vreg_96 = V_MOV_B32_e32 11, implicit $exec
- ; CHECK-NEXT: %3.sub1:vreg_96 = V_MOV_B32_e32 12, implicit $exec
- ; CHECK-NEXT: %3.sub2:vreg_96 = V_MOV_B32_e32 13, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %3
+ ; CHECK: undef [[V_MOV_B32_e32_:%[0-9]+]].sub0:vreg_96 = V_MOV_B32_e32 0, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub1:vreg_96 = V_MOV_B32_e32 1, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub2:vreg_96 = V_MOV_B32_e32 2, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_]]
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_1:%[0-9]+]].sub0:vreg_96 = V_MOV_B32_e32 11, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]].sub1:vreg_96 = V_MOV_B32_e32 12, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]].sub2:vreg_96 = V_MOV_B32_e32 13, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_1]]
undef %0.sub0:vreg_128 = V_MOV_B32_e32 00, implicit $exec
%0.sub1:vreg_128 = V_MOV_B32_e32 01, implicit $exec
%0.sub2:vreg_128 = V_MOV_B32_e32 02, implicit $exec
@@ -215,15 +215,15 @@ tracksRegLiveness: true
body: |
bb.0:
; CHECK-LABEL: name: test_vreg_160_w64
- ; CHECK: undef %3.sub0:vreg_64 = V_MOV_B32_e32 0, implicit $exec
- ; CHECK-NEXT: %3.sub1:vreg_64 = V_MOV_B32_e32 1, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %3
- ; CHECK-NEXT: undef %4.sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec
- ; CHECK-NEXT: %4.sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %4
- ; CHECK-NEXT: undef %5.sub0:vreg_64 = V_MOV_B32_e32 23, implicit $exec
- ; CHECK-NEXT: %5.sub1:vreg_64 = V_MOV_B32_e32 24, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %5
+ ; CHECK: undef [[V_MOV_B32_e32_:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 0, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 1, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_]]
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_1:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 11, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 12, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_1]]
+ ; CHECK-NEXT: undef [[V_MOV_B32_e32_2:%[0-9]+]].sub0:vreg_64 = V_MOV_B32_e32 23, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_2:%[0-9]+]].sub1:vreg_64 = V_MOV_B32_e32 24, implicit $exec
+ ; CHECK-NEXT: S_NOP 0, implicit [[V_MOV_B32_e32_2]]
undef %0.sub0:vreg_160 = V_MOV_B32_e32 00, implicit $exec
%0.sub1:vreg_160 = V_MOV_B32_e32 01, implicit $exec
S_NOP 0, implicit %0.sub0_sub1
@@ -243,18 +243,18 @@ tracksRegLiveness: true
body: |
bb.0:
; CHECK-LABEL: name: test_vreg_160_w96
- ; CHECK: undef %3.sub0:vreg_96 = V_MOV_B32_e32 0, implicit $exec
- ; CHECK-NEXT: %3.sub1:vreg_96 = V_MOV_B32_e32 1, implicit $exec
- ; CHECK-NEXT: %3.sub2:vreg_96 = V_MOV_B32_e32 2, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %3
- ; CHECK-NEXT: undef %4.sub0:vreg_96 = V_MOV_B32_e32 11, implicit $exec
- ; CHECK-NEXT: %4.sub1:vreg_96 = V_MOV_B32_e32 12, implicit $exec
- ; CHECK-NEXT: %4.sub2:vreg_96 = V_MOV_B32_e32 13, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %4
- ; CHECK-NEXT: undef %5.sub0:vreg_96 = V_MOV_B32_e32 22, implicit $exec
- ; CHECK-NEXT: %5.sub1:vreg_96 = V_MOV_B32_e32 23, implicit $exec
- ; CHECK-NEXT: %5.sub2:vreg_96 = V_MOV_B32_e32 24, implicit $exec
- ; CHECK-NEXT: S_NOP 0, implicit %5
+ ; CHECK: undef [[V_MOV_B32_e32_:%[0-9]+]].sub0:vreg_96 = V_MOV_B32_e32 0, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub1:vreg_96 = V_MOV_B32_e32 1, implicit $exec
+ ; CHECK-NEXT: [[V_MOV_B32_e32_:%[0-9]+]].sub2:vreg_96 = V_MOV_B32_e32 2, implicit $exec
+ ; CHECK-NEX...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/69957
More information about the llvm-commits
mailing list