[llvm] [AMDGPU] Allow allocation of lo128 registers from all banks (PR #172614)
via llvm-commits
llvm-commits at lists.llvm.org
Wed Dec 17 01:04:38 PST 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-amdgpu
Author: Stanislav Mekhanoshin (rampitec)
<details>
<summary>Changes</summary>
We can encode 16-bit operands in a short form for VGPRs [0..127].
When we have 1K registers available we can in fact allocate 4
times more from all 4 banks. That, however, requires an allocatable
class for these operands. When for most of the instructions it will
result in the VOP3 longer form, for V_FMAAMK/FMADAK_F16 it will
simply prohibit the encoding because these do not have VOP3 forms.
A straight forward solution would be to create a register class
with all registers having bit 8 of the encoding zero, i.e. to
create a register class with holes punched in it: [0-127, 256-383,
512-639, 768-895]. LLVM, however, does not like register classes
with punched holes when they also have subregisters. The cross-
product of all classes explodes and some combinations of a 'class
having a common subreg with another' becomeing impossible. Just
doing so explodes our register info to 4+Gb, uncompilable too.
The solution proposed is to define _lo128 RC with contigous 896
VGPRs, but the allocation order of it hiding prohibited registers.
That keeps generated register info a reasonable size (+~50%).
The same is needed to VGPR_16_Lo128 in true16 mode.
In general we could later reuse VGPR_32 with AltOrderSelect, but
we would need to pass there operand type and deal with the AsmParser.
One other consideration is that we can allocate a register leaving
a hole of the whole 128 registers, but a subsequent patch can fix
it, i.e. by the time of the RA we really know estimated register
pressure and can further limit allocation order.
For the short: w/o it we either have spilling when we have VGPRs
available, or outright have 'run out of registers' when we have
a lot of 16-bit registers used, a lot of budget, but we cannot
encode it.
---
Patch is 442.47 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/172614.diff
42 Files Affected:
- (modified) llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp (+15-1)
- (modified) llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp (+5-1)
- (modified) llvm/lib/Target/AMDGPU/SIInstrFormats.td (+21)
- (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+1-1)
- (modified) llvm/lib/Target/AMDGPU/SIRegisterInfo.td (+9-11)
- (modified) llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp (+2-1)
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll (+16-16)
- (modified) llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-ignore-copies-crash.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/branch-relax-indirect-branch.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/branch-relax-no-terminators.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/coalesce-copy-to-agpr-to-av-registers.mir (+120-120)
- (modified) llvm/test/CodeGen/AMDGPU/coalescer-early-clobber-subreg.mir (+8-8)
- (modified) llvm/test/CodeGen/AMDGPU/dst-sel-hazard.mir (+12-12)
- (modified) llvm/test/CodeGen/AMDGPU/hazards-gfx950.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/inflate-reg-class-vgpr-mfma-to-av-with-load-source.mir (+6-6)
- (modified) llvm/test/CodeGen/AMDGPU/inline-asm.i128.ll (+12-12)
- (modified) llvm/test/CodeGen/AMDGPU/local-stack-alloc-add-references.gfx10.mir (+15-15)
- (modified) llvm/test/CodeGen/AMDGPU/local-stack-alloc-add-references.gfx8.mir (+180-180)
- (modified) llvm/test/CodeGen/AMDGPU/local-stack-alloc-add-references.gfx9.mir (+90-90)
- (modified) llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats.mir (+3-3)
- (modified) llvm/test/CodeGen/AMDGPU/no-limit-coalesce.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/optimize-exec-mask-pre-ra-no-fold-exec-copy.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/partial-regcopy-and-spill-missed-at-regalloc.ll (+12-12)
- (added) llvm/test/CodeGen/AMDGPU/regalloc-vgpr_lo128-gfx1250.mir (+94)
- (modified) llvm/test/CodeGen/AMDGPU/rename-independent-subregs.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-copy-from.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-subreg-insert-extract.mir (+6-6)
- (modified) llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-subreg-src2-chain.mir (+18-18)
- (modified) llvm/test/CodeGen/AMDGPU/sched-assert-dead-def-subreg-use-other-subreg.mir (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/sched-handleMoveUp-subreg-def-across-subreg-def.mir (+8-8)
- (added) llvm/test/CodeGen/AMDGPU/shrink-vgpr_lo128-gfx1250.mir (+89)
- (modified) llvm/test/CodeGen/AMDGPU/spill-vector-superclass.ll (+2-2)
- (modified) llvm/test/CodeGen/AMDGPU/subreg-undef-def-with-other-subreg-defs.mir (+12-12)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop1_dpp16.txt (+46-46)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop1_dpp8.txt (+46-23)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop2_dpp16.txt (+6-6)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop2_dpp8.txt (+6-3)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx1250_dasm_vop1_dpp8.txt (+13-13)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp16.txt (+46-46)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp8.txt (+49-46)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop2_dpp16.txt (+6-6)
- (modified) llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop2_dpp8.txt (+6-3)
``````````diff
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 7a91a40e18cde..8430f2e3b4344 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -426,6 +426,8 @@ class AMDGPUOperand : public MCParsedAsmOperand {
return isRegOrInline(RCID, type) && !hasModifiers();
}
+ bool isVGPR32_Lo128() const;
+
bool isSCSrcB16() const {
return isRegOrInlineNoMods(AMDGPU::SReg_32RegClassID, MVT::i16);
}
@@ -2243,7 +2245,19 @@ bool AMDGPUOperand::isLiteralImm(MVT type) const {
}
bool AMDGPUOperand::isRegClass(unsigned RCID) const {
- return isRegKind() && AsmParser->getMRI()->getRegClass(RCID).contains(getReg());
+ if (!isRegKind() ||
+ !AsmParser->getMRI()->getRegClass(RCID).contains(getReg()))
+ return false;
+ if (RCID == AMDGPU::VGPR_32_Lo128RegClassID ||
+ RCID == AMDGPU::VS_32_Lo128RegClassID)
+ return getReg() <= AMDGPU::VGPR127 || getReg() > AMDGPU::VGPR1023;
+ return true;
+}
+
+bool AMDGPUOperand::isVGPR32_Lo128() const {
+ if (!isRegKind())
+ return false;
+ return getReg() >= AMDGPU::VGPR0 && getReg() <= AMDGPU::VGPR127;
}
bool AMDGPUOperand::isVRegWithInputMods() const {
diff --git a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
index b63d71dc2fde9..55cd800d201db 100644
--- a/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
@@ -813,7 +813,11 @@ void AMDGPUInstPrinter::printRegularOperand(const MCInst *MI, unsigned OpNo,
OpInfo, STI.getHwMode(MCSubtargetInfo::HwMode_RegInfo));
const MCRegisterClass &RC = MRI.getRegClass(RCID);
auto Reg = mc2PseudoReg(Op.getReg());
- if (!RC.contains(Reg) && !isInlineValue(Reg)) {
+ bool Err = !RC.contains(Reg) && !isInlineValue(Reg);
+ if (!Err && (RCID == AMDGPU::VGPR_32_Lo128RegClassID ||
+ RCID == AMDGPU::VS_32_Lo128RegClassID))
+ Err = Reg >= AMDGPU::VGPR128 && Reg <= AMDGPU::VGPR1023;
+ if (Err) {
bool IsWaveSizeOp = OpInfo.isLookupRegClassByHwMode() &&
(OpInfo.RegClass == AMDGPU::SReg_1 ||
OpInfo.RegClass == AMDGPU::SReg_1_XEXEC);
diff --git a/llvm/lib/Target/AMDGPU/SIInstrFormats.td b/llvm/lib/Target/AMDGPU/SIInstrFormats.td
index de66c472be0ca..cb036b517f2e2 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrFormats.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrFormats.td
@@ -323,6 +323,12 @@ def CPolBit {
class VOPDstOperand<RegisterClassLike rc> : RegisterOperand<rc, "printVOPDst">;
+def Vgpr32Lo128 : AsmOperandClass {
+ let Name = "Vgpr32Lo128";
+ let PredicateMethod = "isVGPR32_Lo128";
+ let RenderMethod = "addRegOperands";
+}
+
def VOPDstOperand_t16 : VOPDstOperand <VGPR_16> {
let EncoderMethod = "getMachineOpValueT16";
let DecoderMethod = "DecodeVGPR_16RegisterClass";
@@ -333,12 +339,27 @@ def VOPDstOperand_t16Lo128 : VOPDstOperand <VGPR_16_Lo128> {
let DecoderMethod = "DecodeVGPR_16_Lo128RegisterClass";
}
+def VOPDstOperand_Vgpr32Lo128 : VOPDstOperand <VGPR_32_Lo128> {
+ let ParserMatchClass = Vgpr32Lo128;
+}
+
// Source-encoded destination operand for instructions like v_swap_b16.
def VOPSrcEncodedDstOperand_t16Lo128 : VOPDstOperand <VGPR_16_Lo128> {
let EncoderMethod = VSrcT_b16_Lo128.EncoderMethod;
let DecoderMethod = VSrcT_b16_Lo128.DecoderMethod;
}
+
+def VGPROp_16_Lo128 : RegisterOperand<VGPR_16_Lo128> {
+ let DecoderMethod = "DecodeVGPR_16_Lo128RegisterClass";
+ let EncoderMethod = "getMachineOpValueT16Lo128";
+}
+
+def VGPROp_32_Lo128 : RegisterOperand<VGPR_32_Lo128> {
+ let DecoderMethod = "DecodeVGPR_32RegisterClass";
+ let ParserMatchClass = Vgpr32Lo128;
+}
+
class VINTRPe <bits<2> op> : Enc32 {
bits<8> vdst;
bits<8> vsrc;
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.td b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
index 628b972f97086..8738df31a9a56 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
@@ -1817,7 +1817,7 @@ class getVALUDstForVT_fake16<ValueType VT> {
RegisterOperand ret = !if(!eq(VT.Size, 32), VOPDstOperand<VGPR_32>,
!if(!eq(VT.Size, 128), VOPDstOperand<VReg_128_AlignTarget>,
!if(!eq(VT.Size, 64), VOPDstOperand<VReg_64_AlignTarget>,
- !if(!eq(VT.Size, 16), VOPDstOperand<VGPR_32_Lo128>,
+ !if(!eq(VT.Size, 16), VOPDstOperand_Vgpr32Lo128,
VOPDstS64orS32)))); // else VT == i1
}
diff --git a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
index 272d4b5609dfb..838d31df5bacd 100644
--- a/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIRegisterInfo.td
@@ -632,13 +632,20 @@ def VGPR_32 : SIRegisterClass<"AMDGPU", !listconcat(Reg32Types.types, Reg16Types
let BaseClassOrder = 32;
}
-// Identical to VGPR_32 except it only contains the low 128 (Lo128) registers.
+// Identical to VGPR_32 except only the low 128 (Lo128) registers in each
+// register bank are allocatable.
def VGPR_32_Lo128 : SIRegisterClass<"AMDGPU", !listconcat(Reg32Types.types, Reg16Types.types), 32,
- (add (sequence "VGPR%u", 0, 127))> {
+ (add (sequence "VGPR%u", 0, 895))> {
+ let AltOrders = [(add (sequence "VGPR%u", 0, 127),
+ (sequence "VGPR%u", 256, 383),
+ (sequence "VGPR%u", 512, 639),
+ (sequence "VGPR%u", 768, 895))];
+ let AltOrderSelect = [{ return 1; }];
let AllocationPriority = !add(0, !mul(BaseClassPriority, BaseClassScaleFactor));
let GeneratePressureSet = 0;
let Size = 32;
let Weight = 1;
+ let BaseClassOrder = 33;
}
// Identical to VGPR_32 except it only contains the low 256 (Lo256) registers.
@@ -1487,15 +1494,6 @@ foreach size = ["64", "96", "128", "160", "192", "224", "256", "288", "320", "35
def VGPROp_#size#_Align2 : RegisterOperand<!cast<RegisterClassLike>("VReg_"#size#_Align2)>;
}
-def VGPROp_16_Lo128 : RegisterOperand<VGPR_16_Lo128> {
- let DecoderMethod = "DecodeVGPR_16_Lo128RegisterClass";
- let EncoderMethod = "getMachineOpValueT16Lo128";
-}
-
-def VGPROp_32_Lo128 : RegisterOperand<VGPR_32_Lo128> {
- let DecoderMethod = "DecodeVGPR_32RegisterClass";
-}
-
//===----------------------------------------------------------------------===//
// ASrc_* Operands with an AccVGPR
//===----------------------------------------------------------------------===//
diff --git a/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp b/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
index 1b78f67e76d07..df8abccf97362 100644
--- a/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+++ b/llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
@@ -157,7 +157,8 @@ bool SIShrinkInstructions::shouldShrinkTrue16(MachineInstr &MI) const {
assert(!Reg.isVirtual() && "Prior checks should ensure we only shrink "
"True16 Instructions post-RA");
if (AMDGPU::VGPR_32RegClass.contains(Reg) &&
- !AMDGPU::VGPR_32_Lo128RegClass.contains(Reg))
+ !llvm::is_contained(
+ AMDGPU::VGPR_32_Lo128RegClass.getRawAllocationOrder(*MF), Reg))
return false;
if (AMDGPU::VGPR_16RegClass.contains(Reg) &&
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll
index b290c314f1154..4089e21040b24 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-inline-asm.ll
@@ -66,7 +66,7 @@ define amdgpu_kernel void @asm_simple_agpr_clobber() {
define i32 @asm_vgpr_early_clobber() {
; CHECK-LABEL: name: asm_vgpr_early_clobber
; CHECK: bb.1 (%ir-block.0):
- ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 7; v_mov_b32 $1, 7", 1 /* sideeffect attdialect */, 1245195 /* regdef-ec:VGPR_32 */, def early-clobber %8, 1245195 /* regdef-ec:VGPR_32 */, def early-clobber %9, !1
+ ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 7; v_mov_b32 $1, 7", 1 /* sideeffect attdialect */, 1376267 /* regdef-ec:VGPR_32 */, def early-clobber %8, 1376267 /* regdef-ec:VGPR_32 */, def early-clobber %9, !1
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %8
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %9
; CHECK-NEXT: [[ADD:%[0-9]+]]:_(s32) = G_ADD [[COPY]], [[COPY1]]
@@ -94,7 +94,7 @@ entry:
define i32 @test_single_vgpr_output() nounwind {
; CHECK-LABEL: name: test_single_vgpr_output
; CHECK: bb.1.entry:
- ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 7", 0 /* attdialect */, 1245194 /* regdef:VGPR_32 */, def %8
+ ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 7", 0 /* attdialect */, 1376266 /* regdef:VGPR_32 */, def %8
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %8
; CHECK-NEXT: $vgpr0 = COPY [[COPY]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0
@@ -106,7 +106,7 @@ entry:
define i32 @test_single_sgpr_output_s32() nounwind {
; CHECK-LABEL: name: test_single_sgpr_output_s32
; CHECK: bb.1.entry:
- ; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1835018 /* regdef:SReg_32 */, def %8
+ ; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %8
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %8
; CHECK-NEXT: $vgpr0 = COPY [[COPY]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0
@@ -119,7 +119,7 @@ entry:
define float @test_multiple_register_outputs_same() #0 {
; CHECK-LABEL: name: test_multiple_register_outputs_same
; CHECK: bb.1 (%ir-block.0):
- ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0; v_mov_b32 $1, 1", 0 /* attdialect */, 1245194 /* regdef:VGPR_32 */, def %8, 1245194 /* regdef:VGPR_32 */, def %9
+ ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0; v_mov_b32 $1, 1", 0 /* attdialect */, 1376266 /* regdef:VGPR_32 */, def %8, 1376266 /* regdef:VGPR_32 */, def %9
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %8
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %9
; CHECK-NEXT: [[FADD:%[0-9]+]]:_(s32) = G_FADD [[COPY]], [[COPY1]]
@@ -136,7 +136,7 @@ define float @test_multiple_register_outputs_same() #0 {
define double @test_multiple_register_outputs_mixed() #0 {
; CHECK-LABEL: name: test_multiple_register_outputs_mixed
; CHECK: bb.1 (%ir-block.0):
- ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0; v_add_f64 $1, 0, 0", 0 /* attdialect */, 1245194 /* regdef:VGPR_32 */, def %8, 2818058 /* regdef:VReg_64 */, def %9
+ ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0; v_add_f64 $1, 0, 0", 0 /* attdialect */, 1376266 /* regdef:VGPR_32 */, def %8, 2883594 /* regdef:VReg_64 */, def %9
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %8
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s64) = COPY %9
; CHECK-NEXT: [[UV:%[0-9]+]]:_(s32), [[UV1:%[0-9]+]]:_(s32) = G_UNMERGE_VALUES [[COPY1]](s64)
@@ -171,7 +171,7 @@ define amdgpu_kernel void @test_input_vgpr_imm() {
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 42
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[C]](s32)
- ; CHECK-NEXT: INLINEASM &"v_mov_b32 v0, $0", 1 /* sideeffect attdialect */, 1245193 /* reguse:VGPR_32 */, [[COPY1]]
+ ; CHECK-NEXT: INLINEASM &"v_mov_b32 v0, $0", 1 /* sideeffect attdialect */, 1376265 /* reguse:VGPR_32 */, [[COPY1]]
; CHECK-NEXT: S_ENDPGM 0
call void asm sideeffect "v_mov_b32 v0, $0", "v"(i32 42)
ret void
@@ -185,7 +185,7 @@ define amdgpu_kernel void @test_input_sgpr_imm() {
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p4) = COPY $sgpr8_sgpr9
; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 42
; CHECK-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[C]](s32)
- ; CHECK-NEXT: INLINEASM &"s_mov_b32 s0, $0", 1 /* sideeffect attdialect */, 1835017 /* reguse:SReg_32 */, [[COPY1]]
+ ; CHECK-NEXT: INLINEASM &"s_mov_b32 s0, $0", 1 /* sideeffect attdialect */, 1966089 /* reguse:SReg_32 */, [[COPY1]]
; CHECK-NEXT: S_ENDPGM 0
call void asm sideeffect "s_mov_b32 s0, $0", "s"(i32 42)
ret void
@@ -212,7 +212,7 @@ define float @test_input_vgpr(i32 %src) nounwind {
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)
- ; CHECK-NEXT: INLINEASM &"v_add_f32 $0, 1.0, $1", 0 /* attdialect */, 1245194 /* regdef:VGPR_32 */, def %9, 1245193 /* reguse:VGPR_32 */, [[COPY1]]
+ ; CHECK-NEXT: INLINEASM &"v_add_f32 $0, 1.0, $1", 0 /* attdialect */, 1376266 /* regdef:VGPR_32 */, def %9, 1376265 /* reguse:VGPR_32 */, [[COPY1]]
; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %9
; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0
@@ -227,7 +227,7 @@ define i32 @test_memory_constraint(ptr addrspace(3) %a) nounwind {
; CHECK-NEXT: liveins: $vgpr0
; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; CHECK-NEXT: INLINEASM &"ds_read_b32 $0, $1", 8 /* mayload attdialect */, 1245194 /* regdef:VGPR_32 */, def %9, 262158 /* mem:m */, [[COPY]](p3)
+ ; CHECK-NEXT: INLINEASM &"ds_read_b32 $0, $1", 8 /* mayload attdialect */, 1376266 /* regdef:VGPR_32 */, def %9, 262158 /* mem:m */, [[COPY]](p3)
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %9
; CHECK-NEXT: $vgpr0 = COPY [[COPY1]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0
@@ -244,7 +244,7 @@ define i32 @test_vgpr_matching_constraint(i32 %a) nounwind {
; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; CHECK-NEXT: [[AND:%[0-9]+]]:_(s32) = G_AND [[COPY]], [[C]]
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[AND]](s32)
- ; CHECK-NEXT: INLINEASM &";", 1 /* sideeffect attdialect */, 1245194 /* regdef:VGPR_32 */, def %11, 2147483657 /* reguse tiedto:$0 */, [[COPY1]](tied-def 3)
+ ; CHECK-NEXT: INLINEASM &";", 1 /* sideeffect attdialect */, 1376266 /* regdef:VGPR_32 */, def %11, 2147483657 /* reguse tiedto:$0 */, [[COPY1]](tied-def 3)
; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %11
; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0
@@ -256,13 +256,13 @@ define i32 @test_vgpr_matching_constraint(i32 %a) nounwind {
define i32 @test_sgpr_matching_constraint() nounwind {
; CHECK-LABEL: name: test_sgpr_matching_constraint
; CHECK: bb.1.entry:
- ; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1835018 /* regdef:SReg_32 */, def %8
+ ; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %8
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %8
- ; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 8", 0 /* attdialect */, 1835018 /* regdef:SReg_32 */, def %10
+ ; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 8", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %10
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY %10
; CHECK-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]](s32)
; CHECK-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]](s32)
- ; CHECK-NEXT: INLINEASM &"s_add_u32 $0, $1, $2", 0 /* attdialect */, 1835018 /* regdef:SReg_32 */, def %12, 1835017 /* reguse:SReg_32 */, [[COPY2]], 2147483657 /* reguse tiedto:$0 */, [[COPY3]](tied-def 3)
+ ; CHECK-NEXT: INLINEASM &"s_add_u32 $0, $1, $2", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %12, 1966089 /* reguse:SReg_32 */, [[COPY2]], 2147483657 /* reguse tiedto:$0 */, [[COPY3]](tied-def 3)
; CHECK-NEXT: [[COPY4:%[0-9]+]]:_(s32) = COPY %12
; CHECK-NEXT: $vgpr0 = COPY [[COPY4]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0
@@ -285,7 +285,7 @@ define void @test_many_matching_constraints(i32 %a, i32 %b, i32 %c) nounwind {
; CHECK-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]](s32)
; CHECK-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)
; CHECK-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]](s32)
- ; CHECK-NEXT: INLINEASM &"; ", 1 /* sideeffect attdialect */, 1245194 /* regdef:VGPR_32 */, def %11, 1245194 /* regdef:VGPR_32 */, def %12, 1245194 /* regdef:VGPR_32 */, def %13, 2147483657 /* reguse tiedto:$0 */, [[COPY3]](tied-def 3), 2147614729 /* reguse tiedto:$2 */, [[COPY4]](tied-def 7), 2147549193 /* reguse tiedto:$1 */, [[COPY5]](tied-def 5)
+ ; CHECK-NEXT: INLINEASM &"; ", 1 /* sideeffect attdialect */, 1376266 /* regdef:VGPR_32 */, def %11, 1376266 /* regdef:VGPR_32 */, def %12, 1376266 /* regdef:VGPR_32 */, def %13, 2147483657 /* reguse tiedto:$0 */, [[COPY3]](tied-def 3), 2147614729 /* reguse tiedto:$2 */, [[COPY4]](tied-def 7), 2147549193 /* reguse tiedto:$1 */, [[COPY5]](tied-def 5)
; CHECK-NEXT: [[COPY6:%[0-9]+]]:_(s32) = COPY %11
; CHECK-NEXT: [[COPY7:%[0-9]+]]:_(s32) = COPY %12
; CHECK-NEXT: [[COPY8:%[0-9]+]]:_(s32) = COPY %13
@@ -306,10 +306,10 @@ define void @test_many_matching_constraints(i32 %a, i32 %b, i32 %c) nounwind {
define i32 @test_sgpr_to_vgpr_move_matching_constraint() nounwind {
; CHECK-LABEL: name: test_sgpr_to_vgpr_move_matching_constraint
; CHECK: bb.1.entry:
- ; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1835018 /* regdef:SReg_32 */, def %8
+ ; CHECK-NEXT: INLINEASM &"s_mov_b32 $0, 7", 0 /* attdialect */, 1966090 /* regdef:SReg_32 */, def %8
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY %8
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]](s32)
- ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, $1", 0 /* attdialect */, 1245194 /* regdef:VGPR_32 */, def %10, 2147483657 /* reguse tiedto:$0 */, [[COPY1]](tied-def 3)
+ ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, $1", 0 /* attdialect */, 1376266 /* regdef:VGPR_32 */, def %10, 2147483657 /* reguse tiedto:$0 */, [[COPY1]](tied-def 3)
; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY %10
; CHECK-NEXT: $vgpr0 = COPY [[COPY2]](s32)
; CHECK-NEXT: SI_RETURN implicit $vgpr0
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-ignore-copies-crash.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-ignore-copies-crash.mir
index 7ca3869b535e4..65bde1e7d8efb 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-ignore-copies-crash.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankcombiner-ignore-copies-crash.mir
@@ -24,7 +24,7 @@ body: |
; CHECK-NEXT: [[COPY1:%[0-9]+]]:vgpr(s32) = COPY [[C]](s32)
; CHECK-NEXT: [[FMUL:%[0-9]+]]:vgpr(s32) = G_FMUL [[COPY]], [[COPY1]]
; CHECK-NEXT: [[C1:%[0-9]+]]:sgpr(s32) = G_FCONSTANT float 1.000000e+00
- ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0", 0 /* attdialect */, 1245194 /* regdef:VGPR_32 */, def %5(s32)
+ ; CHECK-NEXT: INLINEASM &"v_mov_b32 $0, 0", 0 /* attdialect */, 1376266 /* regdef:VGPR_32 */, def %5(s32)
; CHECK-NEXT: [[COPY2:%[0-9]+]]:vgpr(s32) = COPY [[C1]](s32)
; CHECK-NEXT: [[AMDGPU_FMED3_:%[0-9]+]]:vgpr(s32) = nnan G_AMDGPU_FMED3 [[FMUL]], %5, [[COPY2]]
; CHECK-NEXT: $vgpr0 = COPY [[AMDGPU_FMED3_]](s32)
@@ -33,7 +33,7 @@ body: |
%2:vgpr(s32) = COPY %1(s32)
%3:vgpr(s32) = G_FMUL %0, %2
%4:sgpr(s32) = G_FCONSTANT float 1.000000e+00
- INLINEASM &"v_mov_b32 $0, 0", 0 /* attdialect */, 1245194 /* regdef:VGPR_32 */, def %5:vgpr_32
+ INLINEASM &"v_mov_b32 $0, 0", 0 /* attdialect */, 1376266 /* regdef:VGPR_32 */, def %5:vgpr_32
%6:vgpr(s32) = COPY %4(s32)
%7:vgpr(s32) = nnan G_AMDGPU_FMED3 %3(s32), %5(s32), %6(s32)
$vgpr0 = COPY %7(s32)
diff --git a/llvm/test/CodeGen/AMDGPU/branch-relax-indirect-branch.mir b/llvm/test/CodeGen/AMDGPU/branch-relax-indirect-branch.mir
index 34c0159dd3ddb..b1690860178de 100644
--- a/llvm/test/CodeGen/AMDGPU/branch-relax-indirect-branch.mir
+++ b/llvm/test/CodeGen/AMDGPU/branch-relax-indirect-branch.mir
@@ -68,7 +68,7 @@ body: |
; CHECK-NEXT: successors: %bb.3(0x04000000), %bb.7(0x7c000000)
; CHECK-NEXT: liveins: $vcc_hi, $vcc_lo, $sgpr5, $sgpr6, $sgpr7, $sgpr8, $sgpr9, $sgpr10, $sgpr11, $sgpr12, $sgpr13, $sgpr14, $sgpr15, $sgpr16, $sgpr17, $sgpr18, $sgpr19, $sgpr20, $sgpr21, $sgpr22, $sgpr23, $sgpr24, $sgpr25, $sgpr26, $sgpr27, $sgpr28, $sgpr29, $sgpr30, $sgpr31, $sgpr34, $sgpr35, $sgpr36, $sgpr37, $sgpr38, $sgpr39, $sgpr40, $sgpr41, $sgpr42, $sgpr43, $sgpr44, $sgpr45, $sgpr46, $sgpr47, $sgpr48, $sgpr49, $sgpr50, $sgpr51, $sgpr52, $sgpr53, $sgpr54, $sg...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/172614
More information about the llvm-commits
mailing list