[llvm] [AMDGPU][SDAG] Initial support for ISD::PTRADD (PR #141725)
Fabian Ritter via llvm-commits
llvm-commits at lists.llvm.org
Fri Jun 13 04:22:53 PDT 2025
https://github.com/ritter-x2a updated https://github.com/llvm/llvm-project/pull/141725
From e12ee9eac0341ab84ef5d8ca1be169b563f4e2c2 Mon Sep 17 00:00:00 2001
From: Fabian Ritter <fabian.ritter at amd.com>
Date: Tue, 27 May 2025 08:02:50 -0400
Subject: [PATCH 1/4] [AMDGPU][SDAG] Initial support for ISD::PTRADD
Enable generation of PTRADD SelectionDAG nodes for pointer arithmetic on SI,
for now behind an internal CLI option. Also add basic patterns to match these
nodes. Optimizations will come in follow-up PRs. Basic tests for SDAG codegen
with PTRADD are in test/CodeGen/AMDGPU/ptradd-sdag.ll.
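For illustration, here is a minimal case in the spirit of those tests (the
function name is hypothetical, not copied from the test file):

  ; RUN: llc -mtriple=amdgcn -mcpu=gfx942 -amdgpu-use-sdag-ptradd=1 < %s
  define ptr addrspace(1) @gep_example(ptr addrspace(1) %p, i64 %i) {
    ; With the flag enabled, this GEP is selected through an ISD::PTRADD
    ; node (and the new GCNPats) instead of a plain ISD::ADD on the
    ; pointer bits.
    %gep = getelementptr inbounds i32, ptr addrspace(1) %p, i64 %i
    ret ptr addrspace(1) %gep
  }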
Since GlobalISel also uses the PTRADD SDAG patterns via SelectionDAGCompat,
this change affects GlobalISel tests:
- Uniform 32-bit address arithmetic is now lowered to s_add_i32 instead of
  s_add_u32, which is consistent with what SDAG does (and gives
  SIShrinkInstructions the chance to generate s_addk_i32).
- 64-bit address arithmetic uses the [sv]_add_u64 pseudos, which is consistent
  with SDAG and means that GISel now generates 64-bit adds for gfx12. The only
  drawback is that we could save 1-2 instructions if we didn't use 64-bit adds
  with a >32-bit immediate (two movs with 32-bit immediates, s_delay_alu, and
  a 64-bit add vs. two 32-bit adds with immediate operands; see the sketch
  below), but that's a separate problem.
- The register class for the dead carry-out/sign-bit operand of
  V_ADD_CO_U32_e64 on architectures without carry-less additions is now
  sreg_64 instead of sreg_64_xexec. I'm not sure whether that loses us
  something worth preserving; I haven't found an obvious way to avoid it.
Overall, the changes in the GlobalISel tests seem to be improvements.
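As a rough sketch of the immediate-size drawback mentioned above (illustrative
pseudo-assembly, not generated output), adding a >32-bit constant such as
0x100000001 to a uniform pointer on gfx12 currently needs the constant in
registers first:

  s_mov_b32 s2, 1                      ; lo half of the immediate
  s_mov_b32 s3, 1                      ; hi half of the immediate
  s_delay_alu instid0(SALU_CYCLE_1)
  s_add_nc_u64 s[0:1], s[0:1], s[2:3]

whereas two 32-bit adds could fold both halves as immediate operands:

  s_add_co_u32 s0, s0, 1
  s_add_co_ci_u32 s1, s1, 1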
---
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 12 +
llvm/lib/Target/AMDGPU/SIISelLowering.h | 2 +
llvm/lib/Target/AMDGPU/SIInstructions.td | 31 +
.../GlobalISel/call-outgoing-stack-args.ll | 44 +-
.../GlobalISel/dynamic-alloca-uniform.ll | 48 +-
.../CodeGen/AMDGPU/GlobalISel/flat-scratch.ll | 96 +-
.../GlobalISel/insertelement-stack-lower.ll | 6 +-
...inst-select-amdgpu-atomic-cmpxchg-flat.mir | 80 +-
...st-select-amdgpu-atomic-cmpxchg-global.mir | 124 +-
.../inst-select-atomic-cmpxchg-local.mir | 16 +-
.../inst-select-atomic-cmpxchg-region.mir | 16 +-
.../inst-select-atomicrmw-add-flat.mir | 240 +---
.../inst-select-atomicrmw-add-global.mir | 220 +---
.../inst-select-atomicrmw-xchg-local.mir | 8 +-
.../inst-select-atomicrmw-xchg-region.mir | 8 +-
.../inst-select-load-atomic-flat.mir | 60 +-
.../inst-select-load-atomic-global.mir | 96 +-
.../inst-select-load-atomic-local.mir | 20 +-
.../GlobalISel/inst-select-load-constant.mir | 100 +-
.../GlobalISel/inst-select-load-flat.mir | 760 +++----------
...st-select-load-global-old-legalization.mir | 874 ++++----------
.../inst-select-load-global-saddr.mir | 312 ++---
.../GlobalISel/inst-select-load-global.mir | 874 ++++----------
.../GlobalISel/inst-select-load-local-128.mir | 26 +-
.../GlobalISel/inst-select-load-local.mir | 12 +-
.../GlobalISel/inst-select-load-private.mir | 34 +-
.../GlobalISel/inst-select-load-smrd.mir | 24 +-
.../GlobalISel/inst-select-pattern-add3.mir | 12 +-
.../AMDGPU/GlobalISel/inst-select-ptr-add.mir | 318 ++----
.../GlobalISel/inst-select-sextload-local.mir | 2 +-
.../GlobalISel/inst-select-store-flat.mir | 250 +---
.../GlobalISel/inst-select-store-global.mir | 256 +----
.../GlobalISel/inst-select-store-local.mir | 53 +-
.../GlobalISel/inst-select-store-private.mir | 2 +-
.../GlobalISel/inst-select-zextload-local.mir | 2 +-
.../CodeGen/AMDGPU/GlobalISel/lds-relocs.ll | 2 +-
.../AMDGPU/GlobalISel/lds-zero-initializer.ll | 4 +-
.../CodeGen/AMDGPU/GlobalISel/mubuf-global.ll | 53 +-
.../AMDGPU/GlobalISel/non-entry-alloca.ll | 16 +-
.../test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll | 30 +-
llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll | 4 +-
.../test/CodeGen/AMDGPU/dynamic_stackalloc.ll | 128 +--
.../AMDGPU/gfx12_scalar_subword_loads.ll | 134 +--
llvm/test/CodeGen/AMDGPU/global-saddr-load.ll | 372 +++---
.../CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll | 524 ++++-----
llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll | 83 +-
llvm/test/CodeGen/AMDGPU/offset-split-flat.ll | 60 +-
.../CodeGen/AMDGPU/offset-split-global.ll | 60 +-
llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll | 1013 +++++++++++++++++
llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll | 3 +-
50 files changed, 2988 insertions(+), 4536 deletions(-)
create mode 100644 llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 53dc540cbd635..5164d27fcf003 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -61,6 +61,13 @@ static cl::opt<bool> UseDivergentRegisterIndexing(
cl::desc("Use indirect register addressing for divergent indexes"),
cl::init(false));
+// TODO This option should be removed once we switch to always using PTRADD in
+// the SelectionDAG.
+static cl::opt<bool> UseSelectionDAGPTRADD(
+ "amdgpu-use-sdag-ptradd", cl::Hidden,
+ cl::desc("Generate ISD::PTRADD nodes in the SelectionDAG ISel"),
+ cl::init(false));
+
static bool denormalModeIsFlushAllF32(const MachineFunction &MF) {
const SIMachineFunctionInfo *Info = MF.getInfo<SIMachineFunctionInfo>();
return Info->getMode().FP32Denormals == DenormalMode::getPreserveSign();
@@ -10457,6 +10464,11 @@ SDValue SITargetLowering::LowerINTRINSIC_VOID(SDValue Op,
}
}
+bool SITargetLowering::shouldPreservePtrArith(const Function &F,
+ EVT PtrVT) const {
+ return UseSelectionDAGPTRADD;
+}
+
// The raw.(t)buffer and struct.(t)buffer intrinsics have two offset args:
// offset (the offset that is included in bounds checking and swizzling, to be
// split between the instruction's voffset and immoffset fields) and soffset
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.h b/llvm/lib/Target/AMDGPU/SIISelLowering.h
index e14611d999641..d71a22722129e 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.h
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.h
@@ -260,6 +260,8 @@ class SITargetLowering final : public AMDGPUTargetLowering {
bool shouldExpandVectorDynExt(SDNode *N) const;
+ bool shouldPreservePtrArith(const Function &F, EVT PtrVT) const override;
+
private:
// Analyze a combined offset from an amdgcn_s_buffer_load intrinsic and store
// the three offsets (voffset, soffset and instoffset) into the SDValue[3]
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 360fd05cb3d96..c93584bc8b5ce 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -1376,6 +1376,37 @@ def : GCNPat <
(i32 (V_MOV_B32_e32 (i32 0))), sub1)
>;
+//===----------------------------------------------------------------------===//
+// PTRADD Patterns
+//===----------------------------------------------------------------------===//
+
+def : GCNPat<
+ (DivergentBinFrag<ptradd> i64:$src0, i64:$src1),
+ (V_ADD_U64_PSEUDO $src0, $src1)>;
+
+def : GCNPat<
+ (DivergentBinFrag<ptradd> i32:$src0, i32:$src1),
+ (V_ADD_U32_e64 $src0, $src1, 0)> {
+ let SubtargetPredicate = HasAddNoCarryInsts;
+}
+
+def : GCNPat<
+ (DivergentBinFrag<ptradd> i32:$src0, i32:$src1),
+ (V_ADD_CO_U32_e64 $src0, $src1)> {
+ let SubtargetPredicate = NotHasAddNoCarryInsts;
+}
+
+def : GCNPat<
+ (UniformBinFrag<ptradd> i64:$src0, i64:$src1),
+ (S_ADD_U64_PSEUDO $src0, $src1)>;
+
+// Whether we select S_ADD_I32 or S_ADD_U32 does not make much of a
+// difference. Most notably, S_ADD_I32 instructions can be transformed
+// to S_ADDK_I32, so we select that.
+def : GCNPat<
+ (UniformBinFrag<ptradd> i32:$src0, i32:$src1),
+ (S_ADD_I32 $src0, $src1)>;
+
/********** ============================================ **********/
/********** Extraction, Insertion, Building and Casting **********/
/********** ============================================ **********/
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
index 7adaddf2fc8ba..5a3b36fc1ada2 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
@@ -36,15 +36,15 @@ define amdgpu_kernel void @kernel_caller_stack() {
; FLATSCR-NEXT: s_mov_b32 s32, 0
; FLATSCR-NEXT: s_add_u32 flat_scratch_lo, s8, s13
; FLATSCR-NEXT: s_addc_u32 flat_scratch_hi, s9, 0
-; FLATSCR-NEXT: s_add_u32 s0, s32, 4
+; FLATSCR-NEXT: s_add_i32 s0, s32, 4
; FLATSCR-NEXT: v_mov_b32_e32 v0, 9
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_u32 s0, s32, 8
+; FLATSCR-NEXT: s_add_i32 s0, s32, 8
; FLATSCR-NEXT: v_mov_b32_e32 v0, 10
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_u32 s0, s32, 12
+; FLATSCR-NEXT: s_add_i32 s0, s32, 12
; FLATSCR-NEXT: v_mov_b32_e32 v0, 11
-; FLATSCR-NEXT: s_add_u32 s2, s32, 16
+; FLATSCR-NEXT: s_add_i32 s2, s32, 16
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
; FLATSCR-NEXT: v_mov_b32_e32 v0, 12
; FLATSCR-NEXT: s_getpc_b64 s[0:1]
@@ -189,13 +189,13 @@ define amdgpu_kernel void @kernel_caller_byval() {
; FLATSCR-NEXT: s_getpc_b64 s[0:1]
; FLATSCR-NEXT: s_add_u32 s0, s0, external_void_func_byval@rel32@lo+4
; FLATSCR-NEXT: s_addc_u32 s1, s1, external_void_func_byval@rel32@hi+12
-; FLATSCR-NEXT: s_add_u32 s2, s32, 8
-; FLATSCR-NEXT: s_add_u32 s3, s32, 16
-; FLATSCR-NEXT: s_add_u32 s4, s32, 24
-; FLATSCR-NEXT: s_add_u32 s5, s32, 32
-; FLATSCR-NEXT: s_add_u32 s6, s32, 40
-; FLATSCR-NEXT: s_add_u32 s7, s32, 48
-; FLATSCR-NEXT: s_add_u32 s8, s32, 56
+; FLATSCR-NEXT: s_add_i32 s2, s32, 8
+; FLATSCR-NEXT: s_add_i32 s3, s32, 16
+; FLATSCR-NEXT: s_add_i32 s4, s32, 24
+; FLATSCR-NEXT: s_add_i32 s5, s32, 32
+; FLATSCR-NEXT: s_add_i32 s6, s32, 40
+; FLATSCR-NEXT: s_add_i32 s7, s32, 48
+; FLATSCR-NEXT: s_add_i32 s8, s32, 56
; FLATSCR-NEXT: s_waitcnt vmcnt(7)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[0:1], s32
; FLATSCR-NEXT: s_waitcnt vmcnt(7)
@@ -266,16 +266,16 @@ define void @func_caller_stack() {
; FLATSCR-NEXT: s_mov_b64 exec, s[2:3]
; FLATSCR-NEXT: s_add_i32 s32, s32, 16
; FLATSCR-NEXT: v_writelane_b32 v40, s0, 2
-; FLATSCR-NEXT: s_add_u32 s0, s32, 4
+; FLATSCR-NEXT: s_add_i32 s0, s32, 4
; FLATSCR-NEXT: v_mov_b32_e32 v0, 9
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_u32 s0, s32, 8
+; FLATSCR-NEXT: s_add_i32 s0, s32, 8
; FLATSCR-NEXT: v_mov_b32_e32 v0, 10
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_u32 s0, s32, 12
+; FLATSCR-NEXT: s_add_i32 s0, s32, 12
; FLATSCR-NEXT: v_mov_b32_e32 v0, 11
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_u32 s0, s32, 16
+; FLATSCR-NEXT: s_add_i32 s0, s32, 16
; FLATSCR-NEXT: v_mov_b32_e32 v0, 12
; FLATSCR-NEXT: v_writelane_b32 v40, s30, 0
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
@@ -393,8 +393,8 @@ define void @func_caller_byval(ptr addrspace(5) %argptr) {
; FLATSCR-NEXT: s_add_i32 s32, s32, 16
; FLATSCR-NEXT: v_add_u32_e32 v3, 8, v0
; FLATSCR-NEXT: v_writelane_b32 v40, s0, 2
-; FLATSCR-NEXT: s_add_u32 s0, s32, 8
-; FLATSCR-NEXT: s_add_u32 s2, s32, 56
+; FLATSCR-NEXT: s_add_i32 s0, s32, 8
+; FLATSCR-NEXT: s_add_i32 s2, s32, 56
; FLATSCR-NEXT: v_writelane_b32 v40, s30, 0
; FLATSCR-NEXT: v_writelane_b32 v40, s31, 1
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
@@ -404,28 +404,28 @@ define void @func_caller_byval(ptr addrspace(5) %argptr) {
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_u32 s0, s32, 16
+; FLATSCR-NEXT: s_add_i32 s0, s32, 16
; FLATSCR-NEXT: v_add_u32_e32 v3, 24, v0
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_u32 s0, s32, 24
+; FLATSCR-NEXT: s_add_i32 s0, s32, 24
; FLATSCR-NEXT: v_add_u32_e32 v3, 32, v0
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_u32 s0, s32, 32
+; FLATSCR-NEXT: s_add_i32 s0, s32, 32
; FLATSCR-NEXT: v_add_u32_e32 v3, 40, v0
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_u32 s0, s32, 40
+; FLATSCR-NEXT: s_add_i32 s0, s32, 40
; FLATSCR-NEXT: v_add_u32_e32 v3, 48, v0
; FLATSCR-NEXT: v_add_u32_e32 v0, 56, v0
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_u32 s0, s32, 48
+; FLATSCR-NEXT: s_add_i32 s0, s32, 48
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[0:1], v0, off
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll
index 6b767d9e754be..a1bb8b390847f 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll
@@ -20,7 +20,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align4(i32 %n) {
; GFX9-NEXT: v_mov_b32_e32 v1, s4
; GFX9-NEXT: s_lshl_b32 s5, s5, 6
; GFX9-NEXT: s_mov_b32 s33, 0
-; GFX9-NEXT: s_add_u32 s32, s4, s5
+; GFX9-NEXT: s_add_i32 s32, s4, s5
; GFX9-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-NEXT: s_endpgm
;
@@ -39,7 +39,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align4(i32 %n) {
; GFX10-NEXT: s_lshl2_add_u32 s5, s5, 15
; GFX10-NEXT: s_and_b32 s5, s5, -16
; GFX10-NEXT: s_lshl_b32 s5, s5, 5
-; GFX10-NEXT: s_add_u32 s32, s4, s5
+; GFX10-NEXT: s_add_i32 s32, s4, s5
; GFX10-NEXT: s_endpgm
;
; GFX11-LABEL: kernel_dynamic_stackalloc_sgpr_align4:
@@ -56,7 +56,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align4(i32 %n) {
; GFX11-NEXT: s_and_b32 s1, s1, -16
; GFX11-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_u32 s32, s0, s1
+; GFX11-NEXT: s_add_i32 s32, s0, s1
; GFX11-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 4, addrspace(5)
store i32 0, ptr addrspace(5) %alloca
@@ -84,7 +84,7 @@ define void @func_dynamic_stackalloc_sgpr_align4() {
; GFX9-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-NEXT: s_and_b32 s4, s4, -16
; GFX9-NEXT: s_lshl_b32 s4, s4, 6
-; GFX9-NEXT: s_add_u32 s32, s6, s4
+; GFX9-NEXT: s_add_i32 s32, s6, s4
; GFX9-NEXT: s_mov_b32 s32, s33
; GFX9-NEXT: s_mov_b32 s33, s7
; GFX9-NEXT: s_waitcnt vmcnt(0)
@@ -110,7 +110,7 @@ define void @func_dynamic_stackalloc_sgpr_align4() {
; GFX10-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX10-NEXT: s_and_b32 s4, s4, -16
; GFX10-NEXT: s_lshl_b32 s4, s4, 5
-; GFX10-NEXT: s_add_u32 s32, s6, s4
+; GFX10-NEXT: s_add_i32 s32, s6, s4
; GFX10-NEXT: s_mov_b32 s32, s33
; GFX10-NEXT: s_mov_b32 s33, s7
; GFX10-NEXT: s_setpc_b64 s[30:31]
@@ -136,7 +136,7 @@ define void @func_dynamic_stackalloc_sgpr_align4() {
; GFX11-NEXT: s_and_b32 s0, s0, -16
; GFX11-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_u32 s32, s2, s0
+; GFX11-NEXT: s_add_i32 s32, s2, s0
; GFX11-NEXT: s_mov_b32 s32, s33
; GFX11-NEXT: s_mov_b32 s33, s3
; GFX11-NEXT: s_setpc_b64 s[30:31]
@@ -161,7 +161,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align16(i32 %n) {
; GFX9-NEXT: v_mov_b32_e32 v1, s4
; GFX9-NEXT: s_lshl_b32 s5, s5, 6
; GFX9-NEXT: s_mov_b32 s33, 0
-; GFX9-NEXT: s_add_u32 s32, s4, s5
+; GFX9-NEXT: s_add_i32 s32, s4, s5
; GFX9-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-NEXT: s_endpgm
;
@@ -180,7 +180,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align16(i32 %n) {
; GFX10-NEXT: s_lshl2_add_u32 s5, s5, 15
; GFX10-NEXT: s_and_b32 s5, s5, -16
; GFX10-NEXT: s_lshl_b32 s5, s5, 5
-; GFX10-NEXT: s_add_u32 s32, s4, s5
+; GFX10-NEXT: s_add_i32 s32, s4, s5
; GFX10-NEXT: s_endpgm
;
; GFX11-LABEL: kernel_dynamic_stackalloc_sgpr_align16:
@@ -197,7 +197,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align16(i32 %n) {
; GFX11-NEXT: s_and_b32 s1, s1, -16
; GFX11-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_u32 s32, s0, s1
+; GFX11-NEXT: s_add_i32 s32, s0, s1
; GFX11-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 16, addrspace(5)
store i32 0, ptr addrspace(5) %alloca
@@ -225,7 +225,7 @@ define void @func_dynamic_stackalloc_sgpr_align16() {
; GFX9-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-NEXT: s_and_b32 s4, s4, -16
; GFX9-NEXT: s_lshl_b32 s4, s4, 6
-; GFX9-NEXT: s_add_u32 s32, s6, s4
+; GFX9-NEXT: s_add_i32 s32, s6, s4
; GFX9-NEXT: s_mov_b32 s32, s33
; GFX9-NEXT: s_mov_b32 s33, s7
; GFX9-NEXT: s_waitcnt vmcnt(0)
@@ -251,7 +251,7 @@ define void @func_dynamic_stackalloc_sgpr_align16() {
; GFX10-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX10-NEXT: s_and_b32 s4, s4, -16
; GFX10-NEXT: s_lshl_b32 s4, s4, 5
-; GFX10-NEXT: s_add_u32 s32, s6, s4
+; GFX10-NEXT: s_add_i32 s32, s6, s4
; GFX10-NEXT: s_mov_b32 s32, s33
; GFX10-NEXT: s_mov_b32 s33, s7
; GFX10-NEXT: s_setpc_b64 s[30:31]
@@ -277,7 +277,7 @@ define void @func_dynamic_stackalloc_sgpr_align16() {
; GFX11-NEXT: s_and_b32 s0, s0, -16
; GFX11-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_u32 s32, s2, s0
+; GFX11-NEXT: s_add_i32 s32, s2, s0
; GFX11-NEXT: s_mov_b32 s32, s33
; GFX11-NEXT: s_mov_b32 s33, s3
; GFX11-NEXT: s_setpc_b64 s[30:31]
@@ -294,7 +294,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX9-NEXT: s_movk_i32 s32, 0x800
; GFX9-NEXT: s_add_u32 s0, s0, s17
; GFX9-NEXT: s_addc_u32 s1, s1, 0
-; GFX9-NEXT: s_add_u32 s5, s32, 0x7ff
+; GFX9-NEXT: s_add_i32 s5, s32, 0x7ff
; GFX9-NEXT: s_waitcnt lgkmcnt(0)
; GFX9-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-NEXT: s_and_b32 s5, s5, 0xfffff800
@@ -303,7 +303,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX9-NEXT: v_mov_b32_e32 v1, s5
; GFX9-NEXT: s_lshl_b32 s4, s4, 6
; GFX9-NEXT: s_mov_b32 s33, 0
-; GFX9-NEXT: s_add_u32 s32, s5, s4
+; GFX9-NEXT: s_add_i32 s32, s5, s4
; GFX9-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-NEXT: s_endpgm
;
@@ -313,7 +313,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX10-NEXT: s_movk_i32 s32, 0x400
; GFX10-NEXT: s_add_u32 s0, s0, s17
; GFX10-NEXT: s_addc_u32 s1, s1, 0
-; GFX10-NEXT: s_add_u32 s5, s32, 0x3ff
+; GFX10-NEXT: s_add_i32 s5, s32, 0x3ff
; GFX10-NEXT: v_mov_b32_e32 v0, 0
; GFX10-NEXT: s_and_b32 s5, s5, 0xfffffc00
; GFX10-NEXT: s_mov_b32 s33, 0
@@ -323,7 +323,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX10-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX10-NEXT: s_and_b32 s4, s4, -16
; GFX10-NEXT: s_lshl_b32 s4, s4, 5
-; GFX10-NEXT: s_add_u32 s32, s5, s4
+; GFX10-NEXT: s_add_i32 s32, s5, s4
; GFX10-NEXT: s_endpgm
;
; GFX11-LABEL: kernel_dynamic_stackalloc_sgpr_align32:
@@ -331,7 +331,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX11-NEXT: s_load_b32 s0, s[4:5], 0x0
; GFX11-NEXT: s_mov_b32 s32, 32
; GFX11-NEXT: v_mov_b32_e32 v0, 0
-; GFX11-NEXT: s_add_u32 s1, s32, 0x3ff
+; GFX11-NEXT: s_add_i32 s1, s32, 0x3ff
; GFX11-NEXT: s_mov_b32 s33, 0
; GFX11-NEXT: s_and_b32 s1, s1, 0xfffffc00
; GFX11-NEXT: scratch_store_b32 off, v0, s1
@@ -341,7 +341,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX11-NEXT: s_and_b32 s0, s0, -16
; GFX11-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_u32 s32, s1, s0
+; GFX11-NEXT: s_add_i32 s32, s1, s0
; GFX11-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 32, addrspace(5)
store i32 0, ptr addrspace(5) %alloca
@@ -366,7 +366,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX9-NEXT: s_mov_b32 s33, s6
; GFX9-NEXT: s_waitcnt lgkmcnt(0)
; GFX9-NEXT: s_load_dword s4, s[4:5], 0x0
-; GFX9-NEXT: s_add_u32 s5, s32, 0x7ff
+; GFX9-NEXT: s_add_i32 s5, s32, 0x7ff
; GFX9-NEXT: s_and_b32 s5, s5, 0xfffff800
; GFX9-NEXT: v_mov_b32_e32 v1, s5
; GFX9-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -374,7 +374,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX9-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-NEXT: s_and_b32 s4, s4, -16
; GFX9-NEXT: s_lshl_b32 s4, s4, 6
-; GFX9-NEXT: s_add_u32 s32, s5, s4
+; GFX9-NEXT: s_add_i32 s32, s5, s4
; GFX9-NEXT: s_mov_b32 s32, s34
; GFX9-NEXT: s_mov_b32 s34, s7
; GFX9-NEXT: s_waitcnt vmcnt(0)
@@ -397,7 +397,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX10-NEXT: s_mov_b32 s33, s6
; GFX10-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-NEXT: s_load_dword s4, s[4:5], 0x0
-; GFX10-NEXT: s_add_u32 s5, s32, 0x3ff
+; GFX10-NEXT: s_add_i32 s5, s32, 0x3ff
; GFX10-NEXT: s_and_b32 s5, s5, 0xfffffc00
; GFX10-NEXT: v_mov_b32_e32 v1, s5
; GFX10-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -405,7 +405,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX10-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX10-NEXT: s_and_b32 s4, s4, -16
; GFX10-NEXT: s_lshl_b32 s4, s4, 5
-; GFX10-NEXT: s_add_u32 s32, s5, s4
+; GFX10-NEXT: s_add_i32 s32, s5, s4
; GFX10-NEXT: s_mov_b32 s32, s34
; GFX10-NEXT: s_mov_b32 s34, s7
; GFX10-NEXT: s_setpc_b64 s[30:31]
@@ -427,7 +427,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX11-NEXT: s_mov_b32 s33, s2
; GFX11-NEXT: s_waitcnt lgkmcnt(0)
; GFX11-NEXT: s_load_b32 s0, s[0:1], 0x0
-; GFX11-NEXT: s_add_u32 s1, s32, 0x3ff
+; GFX11-NEXT: s_add_i32 s1, s32, 0x3ff
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(SKIP_3) | instid1(SALU_CYCLE_1)
; GFX11-NEXT: s_and_b32 s1, s1, 0xfffffc00
; GFX11-NEXT: scratch_store_b32 off, v0, s1
@@ -436,7 +436,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX11-NEXT: s_and_b32 s0, s0, -16
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
; GFX11-NEXT: s_lshl_b32 s0, s0, 5
-; GFX11-NEXT: s_add_u32 s32, s1, s0
+; GFX11-NEXT: s_add_i32 s32, s1, s0
; GFX11-NEXT: s_mov_b32 s32, s34
; GFX11-NEXT: s_mov_b32 s34, s3
; GFX11-NEXT: s_setpc_b64 s[30:31]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll
index 8a80afd4a768f..d1083588e8ac0 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll
@@ -855,7 +855,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX9-NEXT: s_lshl_b32 s0, s0, 7
; GFX9-NEXT: v_sub_u32_e32 v0, 0, v0
; GFX9-NEXT: v_add_u32_e32 v1, 0x100, v1
-; GFX9-NEXT: s_add_u32 s0, 0x100, s0
+; GFX9-NEXT: s_addk_i32 s0, 0x100
; GFX9-NEXT: v_mov_b32_e32 v2, 15
; GFX9-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; GFX9-NEXT: scratch_store_dword v1, v2, off offset:128
@@ -883,7 +883,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0
; GFX10-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-NEXT: s_lshl_b32 s0, s0, 7
-; GFX10-NEXT: s_add_u32 s0, 0x100, s0
+; GFX10-NEXT: s_addk_i32 s0, 0x100
; GFX10-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX10-NEXT: scratch_load_dword v0, v1, off offset:124 glc dlc
; GFX10-NEXT: s_waitcnt vmcnt(0)
@@ -899,7 +899,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX942-NEXT: v_sub_u32_e32 v0, 0, v0
; GFX942-NEXT: s_waitcnt lgkmcnt(0)
; GFX942-NEXT: s_lshl_b32 s0, s0, 7
-; GFX942-NEXT: s_add_u32 s0, 0x100, s0
+; GFX942-NEXT: s_addk_i32 s0, 0x100
; GFX942-NEXT: v_mov_b32_e32 v2, 15
; GFX942-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; GFX942-NEXT: scratch_store_dword v1, v2, off offset:384 sc0 sc1
@@ -924,7 +924,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX11-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-NEXT: s_waitcnt lgkmcnt(0)
; GFX11-NEXT: s_lshl_b32 s0, s0, 7
-; GFX11-NEXT: s_add_u32 s0, 0x100, s0
+; GFX11-NEXT: s_addk_i32 s0, 0x100
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX11-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX11-NEXT: scratch_load_b32 v0, v1, off offset:124 glc dlc
@@ -946,7 +946,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX12-NEXT: scratch_store_b32 v0, v2, off offset:384 scope:SCOPE_SYS
; GFX12-NEXT: s_wait_storecnt 0x0
; GFX12-NEXT: s_lshl_b32 s0, s0, 7
-; GFX12-NEXT: s_add_co_u32 s0, 0x100, s0
+; GFX12-NEXT: s_addk_co_i32 s0, 0x100
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX12-NEXT: scratch_load_b32 v0, v1, off offset:124 scope:SCOPE_SYS
@@ -966,7 +966,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX9-NEXT: s_lshl_b32 s0, s0, 7
; UNALIGNED_GFX9-NEXT: v_sub_u32_e32 v0, 0, v0
; UNALIGNED_GFX9-NEXT: v_add_u32_e32 v1, 0x100, v1
-; UNALIGNED_GFX9-NEXT: s_add_u32 s0, 0x100, s0
+; UNALIGNED_GFX9-NEXT: s_addk_i32 s0, 0x100
; UNALIGNED_GFX9-NEXT: v_mov_b32_e32 v2, 15
; UNALIGNED_GFX9-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; UNALIGNED_GFX9-NEXT: scratch_store_dword v1, v2, off offset:128
@@ -994,7 +994,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX10-NEXT: s_waitcnt_vscnt null, 0x0
; UNALIGNED_GFX10-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX10-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX10-NEXT: s_add_u32 s0, 0x100, s0
+; UNALIGNED_GFX10-NEXT: s_addk_i32 s0, 0x100
; UNALIGNED_GFX10-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX10-NEXT: scratch_load_dword v0, v1, off offset:124 glc dlc
; UNALIGNED_GFX10-NEXT: s_waitcnt vmcnt(0)
@@ -1010,7 +1010,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX942-NEXT: v_sub_u32_e32 v0, 0, v0
; UNALIGNED_GFX942-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX942-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX942-NEXT: s_add_u32 s0, 0x100, s0
+; UNALIGNED_GFX942-NEXT: s_addk_i32 s0, 0x100
; UNALIGNED_GFX942-NEXT: v_mov_b32_e32 v2, 15
; UNALIGNED_GFX942-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; UNALIGNED_GFX942-NEXT: scratch_store_dword v1, v2, off offset:384 sc0 sc1
@@ -1035,7 +1035,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX11-NEXT: s_waitcnt_vscnt null, 0x0
; UNALIGNED_GFX11-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX11-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX11-NEXT: s_add_u32 s0, 0x100, s0
+; UNALIGNED_GFX11-NEXT: s_addk_i32 s0, 0x100
; UNALIGNED_GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; UNALIGNED_GFX11-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX11-NEXT: scratch_load_b32 v0, v1, off offset:124 glc dlc
@@ -1057,7 +1057,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX12-NEXT: scratch_store_b32 v0, v2, off offset:384 scope:SCOPE_SYS
; UNALIGNED_GFX12-NEXT: s_wait_storecnt 0x0
; UNALIGNED_GFX12-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX12-NEXT: s_add_co_u32 s0, 0x100, s0
+; UNALIGNED_GFX12-NEXT: s_addk_co_i32 s0, 0x100
; UNALIGNED_GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; UNALIGNED_GFX12-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX12-NEXT: scratch_load_b32 v0, v1, off offset:124 scope:SCOPE_SYS
@@ -1495,7 +1495,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX9-NEXT: s_lshl_b32 s0, s0, 7
; GFX9-NEXT: v_sub_u32_e32 v0, 0, v0
; GFX9-NEXT: v_add_u32_e32 v1, 0x4004, v1
-; GFX9-NEXT: s_add_u32 s0, 0x4004, s0
+; GFX9-NEXT: s_addk_i32 s0, 0x4004
; GFX9-NEXT: v_mov_b32_e32 v2, 15
; GFX9-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; GFX9-NEXT: scratch_store_dword v1, v2, off offset:128
@@ -1523,7 +1523,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0
; GFX10-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-NEXT: s_lshl_b32 s0, s0, 7
-; GFX10-NEXT: s_add_u32 s0, 0x4004, s0
+; GFX10-NEXT: s_addk_i32 s0, 0x4004
; GFX10-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX10-NEXT: scratch_load_dword v0, v1, off offset:124 glc dlc
; GFX10-NEXT: s_waitcnt vmcnt(0)
@@ -1539,7 +1539,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX942-NEXT: v_sub_u32_e32 v0, 0, v0
; GFX942-NEXT: s_waitcnt lgkmcnt(0)
; GFX942-NEXT: s_lshl_b32 s0, s0, 7
-; GFX942-NEXT: s_add_u32 s0, 0x4004, s0
+; GFX942-NEXT: s_addk_i32 s0, 0x4004
; GFX942-NEXT: v_mov_b32_e32 v2, 15
; GFX942-NEXT: s_movk_i32 s1, 0x4004
; GFX942-NEXT: v_lshlrev_b32_e32 v0, 2, v0
@@ -1563,7 +1563,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX11-NEXT: v_lshlrev_b32_e32 v1, 2, v1
; GFX11-NEXT: s_waitcnt lgkmcnt(0)
; GFX11-NEXT: s_lshl_b32 s0, s0, 7
-; GFX11-NEXT: s_add_u32 s0, 0x4004, s0
+; GFX11-NEXT: s_addk_i32 s0, 0x4004
; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instid1(SALU_CYCLE_1)
; GFX11-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX11-NEXT: s_movk_i32 s0, 0x4004
@@ -1588,7 +1588,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX12-NEXT: scratch_store_b32 v0, v2, off offset:16512 scope:SCOPE_SYS
; GFX12-NEXT: s_wait_storecnt 0x0
; GFX12-NEXT: s_lshl_b32 s0, s0, 7
-; GFX12-NEXT: s_add_co_u32 s0, 0x4000, s0
+; GFX12-NEXT: s_addk_co_i32 s0, 0x4000
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX12-NEXT: scratch_load_b32 v0, v1, off offset:124 scope:SCOPE_SYS
@@ -1608,7 +1608,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX9-NEXT: s_lshl_b32 s0, s0, 7
; UNALIGNED_GFX9-NEXT: v_sub_u32_e32 v0, 0, v0
; UNALIGNED_GFX9-NEXT: v_add_u32_e32 v1, 0x4004, v1
-; UNALIGNED_GFX9-NEXT: s_add_u32 s0, 0x4004, s0
+; UNALIGNED_GFX9-NEXT: s_addk_i32 s0, 0x4004
; UNALIGNED_GFX9-NEXT: v_mov_b32_e32 v2, 15
; UNALIGNED_GFX9-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; UNALIGNED_GFX9-NEXT: scratch_store_dword v1, v2, off offset:128
@@ -1636,7 +1636,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX10-NEXT: s_waitcnt_vscnt null, 0x0
; UNALIGNED_GFX10-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX10-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX10-NEXT: s_add_u32 s0, 0x4004, s0
+; UNALIGNED_GFX10-NEXT: s_addk_i32 s0, 0x4004
; UNALIGNED_GFX10-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX10-NEXT: scratch_load_dword v0, v1, off offset:124 glc dlc
; UNALIGNED_GFX10-NEXT: s_waitcnt vmcnt(0)
@@ -1652,7 +1652,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX942-NEXT: v_sub_u32_e32 v0, 0, v0
; UNALIGNED_GFX942-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX942-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX942-NEXT: s_add_u32 s0, 0x4004, s0
+; UNALIGNED_GFX942-NEXT: s_addk_i32 s0, 0x4004
; UNALIGNED_GFX942-NEXT: v_mov_b32_e32 v2, 15
; UNALIGNED_GFX942-NEXT: s_movk_i32 s1, 0x4004
; UNALIGNED_GFX942-NEXT: v_lshlrev_b32_e32 v0, 2, v0
@@ -1676,7 +1676,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX11-NEXT: v_lshlrev_b32_e32 v1, 2, v1
; UNALIGNED_GFX11-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX11-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX11-NEXT: s_add_u32 s0, 0x4004, s0
+; UNALIGNED_GFX11-NEXT: s_addk_i32 s0, 0x4004
; UNALIGNED_GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instid1(SALU_CYCLE_1)
; UNALIGNED_GFX11-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX11-NEXT: s_movk_i32 s0, 0x4004
@@ -1701,7 +1701,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX12-NEXT: scratch_store_b32 v0, v2, off offset:16512 scope:SCOPE_SYS
; UNALIGNED_GFX12-NEXT: s_wait_storecnt 0x0
; UNALIGNED_GFX12-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX12-NEXT: s_add_co_u32 s0, 0x4000, s0
+; UNALIGNED_GFX12-NEXT: s_addk_co_i32 s0, 0x4000
; UNALIGNED_GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; UNALIGNED_GFX12-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX12-NEXT: scratch_load_b32 v0, v1, off offset:124 scope:SCOPE_SYS
@@ -3803,8 +3803,8 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
; GFX9: ; %bb.0: ; %entry
; GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
-; GFX9-NEXT: s_add_u32 s0, s2, 0xffe8
-; GFX9-NEXT: scratch_load_dword v2, off, s0
+; GFX9-NEXT: s_add_i32 s2, s2, 0xffe8
+; GFX9-NEXT: scratch_load_dword v2, off, s2
; GFX9-NEXT: s_waitcnt vmcnt(0)
; GFX9-NEXT: global_store_dword v[0:1], v2, off
; GFX9-NEXT: s_endpgm
@@ -3815,15 +3815,15 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
; GFX10-NEXT: s_addc_u32 s1, s1, 0
; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s0
; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s1
-; GFX10-NEXT: s_add_u32 s0, s2, 0xffe8
-; GFX10-NEXT: scratch_load_dword v2, off, s0
+; GFX10-NEXT: s_add_i32 s2, s2, 0xffe8
+; GFX10-NEXT: scratch_load_dword v2, off, s2
; GFX10-NEXT: s_waitcnt vmcnt(0)
; GFX10-NEXT: global_store_dword v[0:1], v2, off
; GFX10-NEXT: s_endpgm
;
; GFX942-LABEL: sgpr_base_large_offset:
; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_add_u32 s0, s0, 0xffe8
+; GFX942-NEXT: s_add_i32 s0, s0, 0xffe8
; GFX942-NEXT: scratch_load_dword v2, off, s0
; GFX942-NEXT: s_waitcnt vmcnt(0)
; GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -3831,7 +3831,7 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
;
; GFX11-LABEL: sgpr_base_large_offset:
; GFX11: ; %bb.0: ; %entry
-; GFX11-NEXT: s_add_u32 s0, s0, 0xffe8
+; GFX11-NEXT: s_add_i32 s0, s0, 0xffe8
; GFX11-NEXT: scratch_load_b32 v2, off, s0
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: global_store_b32 v[0:1], v2, off
@@ -3848,8 +3848,8 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
; UNALIGNED_GFX9: ; %bb.0: ; %entry
; UNALIGNED_GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; UNALIGNED_GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
-; UNALIGNED_GFX9-NEXT: s_add_u32 s0, s2, 0xffe8
-; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s0
+; UNALIGNED_GFX9-NEXT: s_add_i32 s2, s2, 0xffe8
+; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s2
; UNALIGNED_GFX9-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX9-NEXT: global_store_dword v[0:1], v2, off
; UNALIGNED_GFX9-NEXT: s_endpgm
@@ -3860,15 +3860,15 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
; UNALIGNED_GFX10-NEXT: s_addc_u32 s1, s1, 0
; UNALIGNED_GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s0
; UNALIGNED_GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s1
-; UNALIGNED_GFX10-NEXT: s_add_u32 s0, s2, 0xffe8
-; UNALIGNED_GFX10-NEXT: scratch_load_dword v2, off, s0
+; UNALIGNED_GFX10-NEXT: s_add_i32 s2, s2, 0xffe8
+; UNALIGNED_GFX10-NEXT: scratch_load_dword v2, off, s2
; UNALIGNED_GFX10-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX10-NEXT: global_store_dword v[0:1], v2, off
; UNALIGNED_GFX10-NEXT: s_endpgm
;
; UNALIGNED_GFX942-LABEL: sgpr_base_large_offset:
; UNALIGNED_GFX942: ; %bb.0: ; %entry
-; UNALIGNED_GFX942-NEXT: s_add_u32 s0, s0, 0xffe8
+; UNALIGNED_GFX942-NEXT: s_add_i32 s0, s0, 0xffe8
; UNALIGNED_GFX942-NEXT: scratch_load_dword v2, off, s0
; UNALIGNED_GFX942-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -3876,7 +3876,7 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
;
; UNALIGNED_GFX11-LABEL: sgpr_base_large_offset:
; UNALIGNED_GFX11: ; %bb.0: ; %entry
-; UNALIGNED_GFX11-NEXT: s_add_u32 s0, s0, 0xffe8
+; UNALIGNED_GFX11-NEXT: s_add_i32 s0, s0, 0xffe8
; UNALIGNED_GFX11-NEXT: scratch_load_b32 v2, off, s0
; UNALIGNED_GFX11-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX11-NEXT: global_store_b32 v[0:1], v2, off
@@ -3901,7 +3901,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
; GFX9-NEXT: s_and_b32 s0, s2, -4
-; GFX9-NEXT: s_add_u32 s0, s0, 0x100ffe8
+; GFX9-NEXT: s_add_i32 s0, s0, 0x100ffe8
; GFX9-NEXT: scratch_load_dword v2, off, s0 glc
; GFX9-NEXT: s_waitcnt vmcnt(0)
; GFX9-NEXT: global_store_dword v[0:1], v2, off
@@ -3914,7 +3914,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s0
; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s1
; GFX10-NEXT: s_and_b32 s0, s2, -4
-; GFX10-NEXT: s_add_u32 s0, s0, 0x100ffe8
+; GFX10-NEXT: s_add_i32 s0, s0, 0x100ffe8
; GFX10-NEXT: scratch_load_dword v2, off, s0 glc dlc
; GFX10-NEXT: s_waitcnt vmcnt(0)
; GFX10-NEXT: global_store_dword v[0:1], v2, off
@@ -3923,7 +3923,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX942-LABEL: sgpr_base_large_offset_split:
; GFX942: ; %bb.0: ; %entry
; GFX942-NEXT: s_and_b32 s0, s0, -4
-; GFX942-NEXT: s_add_u32 s0, s0, 0x100ffe8
+; GFX942-NEXT: s_add_i32 s0, s0, 0x100ffe8
; GFX942-NEXT: scratch_load_dword v2, off, s0 sc0 sc1
; GFX942-NEXT: s_waitcnt vmcnt(0)
; GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -3933,7 +3933,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX11: ; %bb.0: ; %entry
; GFX11-NEXT: s_and_b32 s0, s0, -4
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_u32 s0, s0, 0x100ffe8
+; GFX11-NEXT: s_add_i32 s0, s0, 0x100ffe8
; GFX11-NEXT: scratch_load_b32 v2, off, s0 glc dlc
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: global_store_b32 v[0:1], v2, off
@@ -3943,7 +3943,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX12: ; %bb.0: ; %entry
; GFX12-NEXT: s_and_b32 s0, s0, -4
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_co_u32 s0, s0, 0x100ffe8
+; GFX12-NEXT: s_add_co_i32 s0, s0, 0x100ffe8
; GFX12-NEXT: scratch_load_b32 v2, off, s0 scope:SCOPE_SYS
; GFX12-NEXT: s_wait_loadcnt 0x0
; GFX12-NEXT: global_store_b32 v[0:1], v2, off
@@ -3954,7 +3954,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; UNALIGNED_GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
; UNALIGNED_GFX9-NEXT: s_and_b32 s0, s2, -4
-; UNALIGNED_GFX9-NEXT: s_add_u32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX9-NEXT: s_add_i32 s0, s0, 0x100ffe8
; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s0 glc
; UNALIGNED_GFX9-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX9-NEXT: global_store_dword v[0:1], v2, off
@@ -3967,7 +3967,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s0
; UNALIGNED_GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s1
; UNALIGNED_GFX10-NEXT: s_and_b32 s0, s2, -4
-; UNALIGNED_GFX10-NEXT: s_add_u32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX10-NEXT: s_add_i32 s0, s0, 0x100ffe8
; UNALIGNED_GFX10-NEXT: scratch_load_dword v2, off, s0 glc dlc
; UNALIGNED_GFX10-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX10-NEXT: global_store_dword v[0:1], v2, off
@@ -3976,7 +3976,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX942-LABEL: sgpr_base_large_offset_split:
; UNALIGNED_GFX942: ; %bb.0: ; %entry
; UNALIGNED_GFX942-NEXT: s_and_b32 s0, s0, -4
-; UNALIGNED_GFX942-NEXT: s_add_u32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX942-NEXT: s_add_i32 s0, s0, 0x100ffe8
; UNALIGNED_GFX942-NEXT: scratch_load_dword v2, off, s0 sc0 sc1
; UNALIGNED_GFX942-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -3986,7 +3986,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX11: ; %bb.0: ; %entry
; UNALIGNED_GFX11-NEXT: s_and_b32 s0, s0, -4
; UNALIGNED_GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; UNALIGNED_GFX11-NEXT: s_add_u32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX11-NEXT: s_add_i32 s0, s0, 0x100ffe8
; UNALIGNED_GFX11-NEXT: scratch_load_b32 v2, off, s0 glc dlc
; UNALIGNED_GFX11-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX11-NEXT: global_store_b32 v[0:1], v2, off
@@ -3996,7 +3996,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX12: ; %bb.0: ; %entry
; UNALIGNED_GFX12-NEXT: s_and_b32 s0, s0, -4
; UNALIGNED_GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; UNALIGNED_GFX12-NEXT: s_add_co_u32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX12-NEXT: s_add_co_i32 s0, s0, 0x100ffe8
; UNALIGNED_GFX12-NEXT: scratch_load_b32 v2, off, s0 scope:SCOPE_SYS
; UNALIGNED_GFX12-NEXT: s_wait_loadcnt 0x0
; UNALIGNED_GFX12-NEXT: global_store_b32 v[0:1], v2, off
@@ -4241,8 +4241,8 @@ define amdgpu_gs void @sgpr_base_negative_offset(ptr addrspace(1) %out, ptr addr
; GFX9: ; %bb.0: ; %entry
; GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
-; GFX9-NEXT: s_add_u32 s0, s2, 0xffffffe8
-; GFX9-NEXT: scratch_load_dword v2, off, s0
+; GFX9-NEXT: s_addk_i32 s2, 0xffe8
+; GFX9-NEXT: scratch_load_dword v2, off, s2
; GFX9-NEXT: s_waitcnt vmcnt(0)
; GFX9-NEXT: global_store_dword v[0:1], v2, off
; GFX9-NEXT: s_endpgm
@@ -4260,7 +4260,7 @@ define amdgpu_gs void @sgpr_base_negative_offset(ptr addrspace(1) %out, ptr addr
;
; GFX942-LABEL: sgpr_base_negative_offset:
; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_add_u32 s0, s0, 0xffffffe8
+; GFX942-NEXT: s_addk_i32 s0, 0xffe8
; GFX942-NEXT: scratch_load_dword v2, off, s0
; GFX942-NEXT: s_waitcnt vmcnt(0)
; GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -4284,8 +4284,8 @@ define amdgpu_gs void @sgpr_base_negative_offset(ptr addrspace(1) %out, ptr addr
; UNALIGNED_GFX9: ; %bb.0: ; %entry
; UNALIGNED_GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; UNALIGNED_GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
-; UNALIGNED_GFX9-NEXT: s_add_u32 s0, s2, 0xffffffe8
-; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s0
+; UNALIGNED_GFX9-NEXT: s_addk_i32 s2, 0xffe8
+; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s2
; UNALIGNED_GFX9-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX9-NEXT: global_store_dword v[0:1], v2, off
; UNALIGNED_GFX9-NEXT: s_endpgm
@@ -4303,7 +4303,7 @@ define amdgpu_gs void @sgpr_base_negative_offset(ptr addrspace(1) %out, ptr addr
;
; UNALIGNED_GFX942-LABEL: sgpr_base_negative_offset:
; UNALIGNED_GFX942: ; %bb.0: ; %entry
-; UNALIGNED_GFX942-NEXT: s_add_u32 s0, s0, 0xffffffe8
+; UNALIGNED_GFX942-NEXT: s_addk_i32 s0, 0xffe8
; UNALIGNED_GFX942-NEXT: scratch_load_dword v2, off, s0
; UNALIGNED_GFX942-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX942-NEXT: global_store_dword v[0:1], v2, off
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll
index 94853767ccfac..6cc7d8d334690 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll
@@ -149,9 +149,9 @@ define amdgpu_kernel void @v_insert_v64i32_varidx(ptr addrspace(1) %out.ptr, ptr
; GCN-NEXT: v_mov_b32_e32 v0, s51
; GCN-NEXT: s_lshl_b32 s4, s4, 2
; GCN-NEXT: buffer_store_dword v0, off, s[0:3], 0 offset:252
-; GCN-NEXT: v_mov_b32_e32 v0, s24
-; GCN-NEXT: v_mov_b32_e32 v1, s4
-; GCN-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
+; GCN-NEXT: v_mov_b32_e32 v0, s4
+; GCN-NEXT: v_mov_b32_e32 v1, s24
+; GCN-NEXT: buffer_store_dword v1, v0, s[0:3], 0 offen
; GCN-NEXT: buffer_load_dword v0, off, s[0:3], 0
; GCN-NEXT: s_nop 0
; GCN-NEXT: buffer_load_dword v1, off, s[0:3], 0 offset:4
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir
index 85d852fc779b2..e9db317ec7817 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir
@@ -91,14 +91,8 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gep4
@@ -119,14 +113,8 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX11-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gep4
@@ -243,14 +231,8 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr4_vgpr5
; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[COPY1]], %subreg.sub0_sub1, [[COPY2]], %subreg.sub2_sub3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_CMPSWAP_X2_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s64_flat_gep4
@@ -271,14 +253,8 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr4_vgpr5
; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[COPY1]], %subreg.sub0_sub1, [[COPY2]], %subreg.sub2_sub3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
; GFX10-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_CMPSWAP_X2_RTN]]
;
; GFX11-LABEL: name: amdgpu_atomic_cmpxchg_s64_flat_gep4
@@ -328,14 +304,8 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gepm4
@@ -346,14 +316,8 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX10-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gepm4
@@ -364,14 +328,8 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX11-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gepm4
@@ -382,14 +340,8 @@ body: |
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX12-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gepm4
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir
index 09eb77fcbdd9d..7545ddc8a436c 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir
@@ -146,14 +146,8 @@ body: |
; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX8-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gep4
@@ -164,14 +158,8 @@ body: |
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gep4
@@ -342,14 +330,8 @@ body: |
; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr4_vgpr5
; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[COPY1]], %subreg.sub0_sub1, [[COPY2]], %subreg.sub2_sub3
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_CMPSWAP_X2_RTN]]
;
; GFX8-LABEL: name: amdgpu_atomic_cmpxchg_s64_global_gep4
@@ -360,14 +342,8 @@ body: |
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr4_vgpr5
; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[COPY1]], %subreg.sub0_sub1, [[COPY2]], %subreg.sub2_sub3
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
; GFX8-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_CMPSWAP_X2_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s64_global_gep4
@@ -417,21 +393,15 @@ body: |
; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE3:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE2]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN:%[0-9]+]]:vreg_64 = BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN [[REG_SEQUENCE]], [[REG_SEQUENCE1]], [[REG_SEQUENCE3]], 0, 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
- ; GFX6-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN]].sub0
- ; GFX6-NEXT: $vgpr0 = COPY [[COPY7]]
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN:%[0-9]+]]:vreg_64 = BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN [[REG_SEQUENCE]], [[V_ADD_U]], [[REG_SEQUENCE2]], 0, 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN]].sub0
+ ; GFX6-NEXT: $vgpr0 = COPY [[COPY3]]
;
; GFX7-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gepm4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2, $vgpr3
@@ -441,21 +411,15 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE3:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE2]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN:%[0-9]+]]:vreg_64 = BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN [[REG_SEQUENCE]], [[REG_SEQUENCE1]], [[REG_SEQUENCE3]], 0, 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
- ; GFX7-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN]].sub0
- ; GFX7-NEXT: $vgpr0 = COPY [[COPY7]]
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN:%[0-9]+]]:vreg_64 = BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN [[REG_SEQUENCE]], [[V_ADD_U]], [[REG_SEQUENCE2]], 0, 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN]].sub0
+ ; GFX7-NEXT: $vgpr0 = COPY [[COPY3]]
;
; GFX7-FLAT-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gepm4
; GFX7-FLAT: liveins: $vgpr0_vgpr1, $vgpr2, $vgpr3
@@ -465,14 +429,8 @@ body: |
; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX8-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gepm4
@@ -483,14 +441,8 @@ body: |
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gepm4
@@ -820,16 +772,10 @@ body: |
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-FLAT-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4095
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY6:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY3]], [[COPY4]], implicit-def $scc
- ; GFX7-FLAT-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY5]], [[COPY6]], implicit-def dead $scc, implicit $scc
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[COPY7:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[COPY7]], [[REG_SEQUENCE1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[COPY3]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX8-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_sgpr_ptr_offset_4095
@@ -839,16 +785,10 @@ body: |
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX8-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4095
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY6:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY3]], [[COPY4]], implicit-def $scc
- ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY5]], [[COPY6]], implicit-def dead $scc, implicit $scc
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
- ; GFX8-NEXT: [[COPY7:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[COPY7]], [[REG_SEQUENCE1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[COPY3]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_sgpr_ptr_offset_4095
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-local.mir
index e90e7d95b4c1e..41916dd2973c1 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-local.mir
@@ -25,6 +25,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX7-LABEL: name: atomic_cmpxchg_s32_local
; GFX7: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -34,6 +35,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX9-LABEL: name: atomic_cmpxchg_s32_local
; GFX9: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -42,6 +44,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[DS_CMPST_RTN_B32_gfx9_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32_gfx9 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX9-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_gfx9_]]
+ ;
; GFX11-LABEL: name: atomic_cmpxchg_s32_local
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}
@@ -74,10 +77,11 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4, implicit $exec
- ; GFX6-NEXT: %4:vgpr_32, dead %6:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
- ; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 %4, [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
+ ; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[V_ADD_CO_U32_e64_]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX7-LABEL: name: atomic_cmpxchg_s32_local_gep4
; GFX7: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -87,6 +91,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 4, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX9-LABEL: name: atomic_cmpxchg_s32_local_gep4
; GFX9: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -95,6 +100,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[DS_CMPST_RTN_B32_gfx9_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32_gfx9 [[COPY]], [[COPY1]], [[COPY2]], 4, 0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX9-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_gfx9_]]
+ ;
; GFX11-LABEL: name: atomic_cmpxchg_s32_local_gep4
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}
@@ -131,6 +137,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX7-LABEL: name: atomic_cmpxchg_s64_local
; GFX7: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX7-NEXT: {{ $}}
@@ -140,6 +147,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX9-LABEL: name: atomic_cmpxchg_s64_local
; GFX9: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX9-NEXT: {{ $}}
@@ -148,6 +156,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4
; GFX9-NEXT: [[DS_CMPST_RTN_B64_gfx9_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64_gfx9 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_gfx9_]]
+ ;
; GFX11-LABEL: name: atomic_cmpxchg_s64_local
; GFX11: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX11-NEXT: {{ $}}
@@ -182,6 +191,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX7-LABEL: name: atomic_cmpxchg_s64_local_gep4
; GFX7: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX7-NEXT: {{ $}}
@@ -191,6 +201,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX9-LABEL: name: atomic_cmpxchg_s64_local_gep4
; GFX9: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX9-NEXT: {{ $}}
@@ -199,6 +210,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4
; GFX9-NEXT: [[DS_CMPST_RTN_B64_gfx9_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64_gfx9 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_gfx9_]]
+ ;
; GFX11-LABEL: name: atomic_cmpxchg_s64_local_gep4
; GFX11: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX11-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-region.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-region.mir
index faad869f67a62..761fe5a79ebd7 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-region.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-region.mir
@@ -25,6 +25,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX6-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX7-LABEL: name: atomic_cmpxchg_s32_region
; GFX7: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -34,6 +35,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX7-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX9-LABEL: name: atomic_cmpxchg_s32_region
; GFX9: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -42,6 +44,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX9-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX11-LABEL: name: atomic_cmpxchg_s32_region
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}
@@ -74,10 +77,11 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4, implicit $exec
- ; GFX6-NEXT: %4:vgpr_32, dead %6:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
- ; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 %4, [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
+ ; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[V_ADD_CO_U32_e64_]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX6-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX7-LABEL: name: atomic_cmpxchg_s32_region_gep4
; GFX7: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -87,6 +91,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 4, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX7-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX9-LABEL: name: atomic_cmpxchg_s32_region_gep4
; GFX9: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -95,6 +100,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 4, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX9-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
+ ;
; GFX11-LABEL: name: atomic_cmpxchg_s32_region_gep4
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}
@@ -131,6 +137,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX7-LABEL: name: atomic_cmpxchg_s64_region
; GFX7: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX7-NEXT: {{ $}}
@@ -140,6 +147,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX9-LABEL: name: atomic_cmpxchg_s64_region
; GFX9: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX9-NEXT: {{ $}}
@@ -148,6 +156,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4
; GFX9-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX11-LABEL: name: atomic_cmpxchg_s64_region
; GFX11: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX11-NEXT: {{ $}}
@@ -182,6 +191,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX7-LABEL: name: atomic_cmpxchg_s64_region_gep4
; GFX7: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX7-NEXT: {{ $}}
@@ -191,6 +201,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX9-LABEL: name: atomic_cmpxchg_s64_region_gep4
; GFX9: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX9-NEXT: {{ $}}
@@ -199,6 +210,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4
; GFX9-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
+ ;
; GFX11-LABEL: name: atomic_cmpxchg_s64_region_gep4
; GFX11: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX11-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir
index dc317a8413cd5..305a05a5f1950 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir
@@ -65,14 +65,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset2047
@@ -89,14 +83,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
@@ -122,14 +110,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset2047_nortn
; LARGE_IOFFSET: liveins: $vgpr0_vgpr1, $vgpr2
@@ -144,14 +126,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 2047
@@ -175,14 +151,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset2048
@@ -199,14 +169,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
@@ -232,14 +196,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset2048_nortn
; LARGE_IOFFSET: liveins: $vgpr0_vgpr1, $vgpr2
@@ -254,14 +212,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 2048
@@ -285,14 +237,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset4095
@@ -309,14 +255,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
@@ -342,14 +282,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset4095_nortn
; LARGE_IOFFSET: liveins: $vgpr0_vgpr1, $vgpr2
@@ -364,14 +298,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 4095
@@ -395,14 +323,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: flat_atomicrmw_add_s32_offset4097
@@ -411,14 +333,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX10-LABEL: name: flat_atomicrmw_add_s32_offset4097
@@ -427,14 +343,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX11-LABEL: name: flat_atomicrmw_add_s32_offset4097
@@ -443,14 +353,8 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX12-LABEL: name: flat_atomicrmw_add_s32_offset4097
@@ -484,14 +388,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; GFX9-LABEL: name: flat_atomicrmw_add_s32_offset4097_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -499,14 +397,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; GFX10-LABEL: name: flat_atomicrmw_add_s32_offset4097_nortn
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -514,14 +406,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; GFX11-LABEL: name: flat_atomicrmw_add_s32_offset4097_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -529,14 +415,8 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; GFX12-LABEL: name: flat_atomicrmw_add_s32_offset4097_nortn
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -611,14 +491,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_ADD_X2_RTN]]
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s64_offset4095
@@ -635,14 +509,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
; GFX10-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_ADD_X2_RTN]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = COPY $vgpr2_vgpr3
@@ -668,14 +536,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD_X2 [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD_X2 [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s64_offset4095_nortn
; LARGE_IOFFSET: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
@@ -690,14 +552,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_ATOMIC_ADD_X2 [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD_X2 [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = COPY $vgpr2_vgpr3
%2:vgpr(s64) = G_CONSTANT i64 4095
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir
index 0d1508900a98a..44c1a43e1ba18 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir
@@ -164,14 +164,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset2047
@@ -241,14 +235,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset2047_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -313,14 +301,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset2048
@@ -337,14 +319,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset2048
@@ -398,14 +374,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset2048_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -420,14 +390,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset2048_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -478,14 +442,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset4095
@@ -502,14 +460,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset4095
@@ -563,14 +515,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset4095_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -585,14 +531,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset4095_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -644,14 +584,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset4097
@@ -660,14 +594,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX10-LABEL: name: global_atomicrmw_add_s32_offset4097
@@ -676,14 +604,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset4097
@@ -692,14 +614,8 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX12-LABEL: name: global_atomicrmw_add_s32_offset4097
@@ -746,14 +662,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset4097_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -761,14 +671,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX10-LABEL: name: global_atomicrmw_add_s32_offset4097_nortn
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -776,14 +680,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset4097_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -791,14 +689,8 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX12-LABEL: name: global_atomicrmw_add_s32_offset4097_nortn
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -971,14 +863,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_ADD_X2_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s64_offset4095
@@ -995,14 +881,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = GLOBAL_ATOMIC_ADD_X2_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s64), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = GLOBAL_ATOMIC_ADD_X2_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s64), addrspace 1)
; GFX10-NEXT: $vgpr0_vgpr1 = COPY [[GLOBAL_ATOMIC_ADD_X2_RTN]]
;
; GFX11-LABEL: name: global_atomicrmw_add_s64_offset4095
@@ -1056,14 +936,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_ATOMIC_ADD_X2 [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD_X2 [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s64_offset4095_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
@@ -1078,14 +952,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: GLOBAL_ATOMIC_ADD_X2 [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s64), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: GLOBAL_ATOMIC_ADD_X2 [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s64), addrspace 1)
;
; GFX11-LABEL: name: global_atomicrmw_add_s64_offset4095_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-local.mir
index f9fc72a65dbd9..ecd3160088852 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-local.mir
@@ -24,6 +24,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
+ ;
; GFX7-LABEL: name: atomicrmw_xchg_s32_local
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -32,6 +33,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
+ ;
; GFX9-LABEL: name: atomicrmw_xchg_s32_local
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
@@ -61,10 +63,11 @@ body: |
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4, implicit $exec
- ; GFX6-NEXT: %3:vgpr_32, dead %5:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
- ; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 %3, [[COPY1]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
+ ; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[V_ADD_CO_U32_e64_]], [[COPY1]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
+ ;
; GFX7-LABEL: name: atomicrmw_xchg_s32_local_gep4
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -73,6 +76,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 4, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
+ ;
; GFX9-LABEL: name: atomicrmw_xchg_s32_local_gep4
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-region.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-region.mir
index 551261ee8a76b..bb94760ee6857 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-region.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-region.mir
@@ -24,6 +24,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX6-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
+ ;
; GFX7-LABEL: name: atomicrmw_xchg_s32_region
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -32,6 +33,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX7-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
+ ;
; GFX9-LABEL: name: atomicrmw_xchg_s32_region
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
@@ -61,10 +63,11 @@ body: |
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4, implicit $exec
- ; GFX6-NEXT: %3:vgpr_32, dead %5:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
- ; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 %3, [[COPY1]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
+ ; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[V_ADD_CO_U32_e64_]], [[COPY1]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX6-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
+ ;
; GFX7-LABEL: name: atomicrmw_xchg_s32_region_gep4
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -73,6 +76,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 4, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX7-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
+ ;
; GFX9-LABEL: name: atomicrmw_xchg_s32_region_gep4
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir
index cebdffc74847c..05f399a88e9e4 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir
@@ -429,14 +429,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX9-LABEL: name: load_atomic_flat_s32_seq_cst_gep_m2048
@@ -444,14 +438,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_atomic_flat_s32_seq_cst_gep_m2048
@@ -459,14 +447,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_atomic_flat_s32_seq_cst_gep_m2048
@@ -474,14 +456,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_atomic_flat_s32_seq_cst_gep_m2048
@@ -514,14 +490,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX9-LABEL: name: load_atomic_flat_s32_seq_cst_gep_4095
@@ -536,14 +506,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_atomic_flat_s32_seq_cst_gep_4095
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir
index eafc96dd32bdd..70fda045cef7e 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir
@@ -451,19 +451,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_DWORD_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_DWORD_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_DWORD_ADDR64_]]
;
; GFX7-LABEL: name: load_atomic_global_s32_seq_cst_gep_m2048
@@ -471,19 +465,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_DWORD_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_DWORD_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_DWORD_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_atomic_global_s32_seq_cst_gep_m2048
@@ -491,14 +479,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX9-LABEL: name: load_atomic_global_s32_seq_cst_gep_m2048
@@ -562,14 +544,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX9-LABEL: name: load_atomic_global_s32_seq_cst_gep_4095
@@ -584,14 +560,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 4095
@@ -617,19 +587,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_DWORDX2_ADDR64_:%[0-9]+]]:vreg_64 = BUFFER_LOAD_DWORDX2_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s64), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_DWORDX2_ADDR64_:%[0-9]+]]:vreg_64 = BUFFER_LOAD_DWORDX2_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s64), addrspace 1)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[BUFFER_LOAD_DWORDX2_ADDR64_]]
;
; GFX7-LABEL: name: load_atomic_global_s64_seq_cst_gep_m2048
@@ -637,19 +601,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_DWORDX2_ADDR64_:%[0-9]+]]:vreg_64 = BUFFER_LOAD_DWORDX2_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s64), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_DWORDX2_ADDR64_:%[0-9]+]]:vreg_64 = BUFFER_LOAD_DWORDX2_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s64), addrspace 1)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[BUFFER_LOAD_DWORDX2_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_atomic_global_s64_seq_cst_gep_m2048
@@ -657,14 +615,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORDX2_:%[0-9]+]]:vreg_64 = FLAT_LOAD_DWORDX2 [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s64), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORDX2_:%[0-9]+]]:vreg_64 = FLAT_LOAD_DWORDX2 [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s64), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_LOAD_DWORDX2_]]
;
; GFX9-LABEL: name: load_atomic_global_s64_seq_cst_gep_m2048
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-local.mir
index 5c2df3904b817..99e7529da5dc1 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-local.mir
@@ -24,6 +24,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_B32_]]
+ ;
; GFX7-LABEL: name: load_atomic_local_s32_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -31,6 +32,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_READ_B32_]]
+ ;
; GFX9-LABEL: name: load_atomic_local_s32_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -61,6 +63,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vgpr_32(<2 x s16>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<2 x s16>), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[LOAD]](<2 x s16>)
+ ;
; GFX7-LABEL: name: load_atomic_local_v2s16_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -68,6 +71,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vgpr_32(<2 x s16>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<2 x s16>), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[LOAD]](<2 x s16>)
+ ;
; GFX9-LABEL: name: load_atomic_local_v2s16_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -98,6 +102,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vgpr_32(p3) = G_LOAD [[COPY]](p3) :: (load seq_cst (p3), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[LOAD]](p3)
+ ;
; GFX7-LABEL: name: load_atomic_local_p3_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -105,6 +110,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vgpr_32(p3) = G_LOAD [[COPY]](p3) :: (load seq_cst (p3), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[LOAD]](p3)
+ ;
; GFX9-LABEL: name: load_atomic_local_p3_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -135,6 +141,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_B64_:%[0-9]+]]:vreg_64 = DS_READ_B64 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s64), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_READ_B64_]]
+ ;
; GFX7-LABEL: name: load_atomic_local_s64_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -142,6 +149,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_B64_:%[0-9]+]]:vreg_64 = DS_READ_B64 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s64), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_READ_B64_]]
+ ;
; GFX9-LABEL: name: load_atomic_local_s64_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -172,6 +180,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vreg_64(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<2 x s32>), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](<2 x s32>)
+ ;
; GFX7-LABEL: name: load_atomic_local_v2s32_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -179,6 +188,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_64(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<2 x s32>), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](<2 x s32>)
+ ;
; GFX9-LABEL: name: load_atomic_local_v2s32_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -209,6 +219,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vreg_64(<4 x s16>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<4 x s16>), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](<4 x s16>)
+ ;
; GFX7-LABEL: name: load_atomic_local_v4s16_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -216,6 +227,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_64(<4 x s16>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<4 x s16>), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](<4 x s16>)
+ ;
; GFX9-LABEL: name: load_atomic_local_v4s16_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -246,6 +258,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vreg_64(p1) = G_LOAD [[COPY]](p3) :: (load seq_cst (p1), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](p1)
+ ;
; GFX7-LABEL: name: load_atomic_local_p1_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -253,6 +266,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_64(p1) = G_LOAD [[COPY]](p3) :: (load seq_cst (p1), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](p1)
+ ;
; GFX9-LABEL: name: load_atomic_local_p1_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -283,6 +297,7 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vreg_64(p0) = G_LOAD [[COPY]](p3) :: (load seq_cst (p0), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](p0)
+ ;
; GFX7-LABEL: name: load_atomic_local_p0_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -290,6 +305,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_64(p0) = G_LOAD [[COPY]](p3) :: (load seq_cst (p0), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](p0)
+ ;
; GFX9-LABEL: name: load_atomic_local_p0_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -318,10 +334,11 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_B32_]]
+ ;
; GFX7-LABEL: name: load_atomic_local_s32_seq_cst_gep_65535
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -329,6 +346,7 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[COPY]], 65535, 0, implicit $m0, implicit $exec :: (load seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_READ_B32_]]
+ ;
; GFX9-LABEL: name: load_atomic_local_s32_seq_cst_gep_65535
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir
index ada80da490fc5..442bdb2a3b044 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir
@@ -1375,14 +1375,8 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
- ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX6-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX6-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX6-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
+ ; GFX6-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX6-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX7-LABEL: name: load_constant_s32_from_4_gep_negative_1
@@ -1390,14 +1384,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
- ; GFX7-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX7-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX7-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX7-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
+ ; GFX7-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX7-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX8-LABEL: name: load_constant_s32_from_4_gep_negative_1
@@ -1405,14 +1393,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
- ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX8-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
+ ; GFX8-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX8-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX10-LABEL: name: load_constant_s32_from_4_gep_negative_1
@@ -1420,14 +1402,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
- ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
+ ; GFX10-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX10-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX11-LABEL: name: load_constant_s32_from_4_gep_negative_1
@@ -1435,14 +1411,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
- ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX11-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
+ ; GFX11-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX11-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
%0:sgpr(p4) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 -1
@@ -1468,14 +1438,8 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX6-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX6-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX6-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX6-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX6-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX7-LABEL: name: load_constant_s32_from_4_gep_negative_524288
@@ -1483,14 +1447,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX7-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX7-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX7-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX7-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX7-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX7-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX7-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX8-LABEL: name: load_constant_s32_from_4_gep_negative_524288
@@ -1498,14 +1456,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX8-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX8-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX8-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX10-LABEL: name: load_constant_s32_from_4_gep_negative_524288
@@ -1513,14 +1465,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX10-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX10-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX11-LABEL: name: load_constant_s32_from_4_gep_negative_524288
@@ -1528,14 +1474,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX11-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX11-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
; GFX11-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
%0:sgpr(p4) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 -524288
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir
index e1325a0a0bc50..9bd5700e1e81c 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir
@@ -1129,14 +1129,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_2047
@@ -1144,14 +1138,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_2047
@@ -1166,14 +1154,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_2047
@@ -1213,14 +1195,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_2048
@@ -1228,14 +1204,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_2048
@@ -1250,14 +1220,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_2048
@@ -1297,14 +1261,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1312,14 +1270,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1327,14 +1279,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1342,14 +1288,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1357,14 +1297,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1397,14 +1331,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1412,14 +1340,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1427,14 +1349,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1442,14 +1358,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1457,14 +1367,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1497,14 +1401,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_4095
@@ -1512,14 +1410,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_4095
@@ -1534,14 +1426,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_4095
@@ -1581,14 +1467,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1596,14 +1476,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1611,14 +1485,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1626,14 +1494,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1641,14 +1503,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1681,14 +1537,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1696,14 +1546,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1711,14 +1555,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1726,14 +1564,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1741,14 +1573,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1781,14 +1607,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1796,14 +1616,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1811,14 +1625,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1826,14 +1634,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1841,14 +1643,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1881,14 +1677,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1896,14 +1686,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1911,14 +1695,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1926,14 +1704,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1941,14 +1713,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1981,14 +1747,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -1996,14 +1756,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -2011,14 +1765,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -2026,14 +1774,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -2041,14 +1783,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -2081,14 +1817,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -2096,14 +1826,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -2111,14 +1835,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -2126,14 +1844,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -2141,14 +1853,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -2181,14 +1887,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -2196,14 +1896,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -2211,14 +1905,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -2226,14 +1914,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -2241,14 +1923,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -2281,14 +1957,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -2296,14 +1966,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -2311,14 +1975,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -2326,14 +1984,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -2341,14 +1993,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -2381,14 +2027,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2396,14 +2036,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2411,14 +2045,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2426,14 +2054,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2441,14 +2063,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2456,14 +2072,8 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX12-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 16777214
@@ -2489,14 +2099,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2504,14 +2108,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2519,14 +2117,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2534,14 +2126,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2549,14 +2135,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2589,14 +2169,8 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2604,14 +2178,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2619,14 +2187,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2634,14 +2196,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2649,14 +2205,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2664,14 +2214,8 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX12-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir
index 864fa2ab438df..aabcab6c9e40e 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir
@@ -1360,14 +1360,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2047
@@ -1375,14 +1369,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2047
@@ -1460,14 +1448,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1475,14 +1457,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1497,14 +1473,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1544,19 +1514,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1564,19 +1528,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1584,14 +1542,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1599,14 +1551,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1660,19 +1606,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1680,19 +1620,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1700,14 +1634,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1715,14 +1643,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1800,14 +1722,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1815,14 +1731,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1837,14 +1747,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1910,14 +1814,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1925,14 +1823,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1940,14 +1832,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1955,14 +1841,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1970,14 +1850,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_4096
@@ -2010,19 +1884,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2030,19 +1898,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2050,14 +1912,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2065,14 +1921,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2087,14 +1937,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2134,19 +1978,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2154,19 +1992,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2174,14 +2006,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2189,14 +2015,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2211,14 +2031,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2284,14 +2098,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2299,14 +2107,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2314,14 +2116,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2329,14 +2125,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2344,14 +2134,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2410,14 +2194,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2425,14 +2203,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2440,14 +2212,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2455,14 +2221,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2470,14 +2230,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2510,19 +2264,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2530,19 +2278,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2550,14 +2292,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2565,14 +2301,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2580,14 +2310,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2595,14 +2319,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2610,14 +2328,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2650,19 +2362,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2670,19 +2376,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2690,14 +2390,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2705,14 +2399,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2720,14 +2408,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2735,14 +2417,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2750,14 +2426,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2816,14 +2486,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2831,14 +2495,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2846,14 +2504,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2861,14 +2513,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2876,14 +2522,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2942,14 +2582,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2957,14 +2591,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2972,14 +2600,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2987,14 +2609,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -3002,14 +2618,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -3017,14 +2627,8 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 16777214
@@ -3050,19 +2654,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3070,19 +2668,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3090,14 +2682,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3105,14 +2691,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3120,14 +2700,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3135,14 +2709,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3150,14 +2718,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3190,19 +2752,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3210,19 +2766,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3230,14 +2780,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3245,14 +2789,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3260,14 +2798,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3275,14 +2807,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3290,14 +2816,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3305,14 +2825,8 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir
index 65f6b8879e16f..f4de883d67900 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir
@@ -178,14 +178,8 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX9-NEXT: %notzero:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %notzero, %subreg.sub1
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
- ; GFX9-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], [[REG_SEQUENCE]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_merge_not_0_vgpr
@@ -196,14 +190,8 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX10-NEXT: %notzero:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %notzero, %subreg.sub1
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
- ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], [[REG_SEQUENCE]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_merge_not_0_vgpr
@@ -214,14 +202,8 @@ body: |
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX11-NEXT: %notzero:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %notzero, %subreg.sub1
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
- ; GFX11-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], [[REG_SEQUENCE]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_merge_not_0_vgpr
@@ -232,14 +214,8 @@ body: |
; GFX12-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX12-NEXT: %notzero:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %notzero, %subreg.sub1
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
- ; GFX12-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], [[REG_SEQUENCE]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:vgpr(s32) = COPY $vgpr0
@@ -279,22 +255,10 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX10-NEXT: %zero:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX10-NEXT: %zext:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %zero, %subreg.sub1
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY %zext.sub0
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
- ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY %zext.sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], %zext, implicit-def dead $vcc_lo, implicit $exec
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
- ; GFX10-NEXT: [[COPY8:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY9:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
- ; GFX10-NEXT: [[COPY10:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY7]], [[COPY8]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_3:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY9]], [[COPY10]], killed [[V_ADD_CO_U32_e64_3]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_2]], %subreg.sub0, [[V_ADDC_U32_e64_2]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U1:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[V_ADD_U]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_zext_vgpr_offset4095
@@ -352,22 +316,10 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX10-NEXT: %zero:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX10-NEXT: %zext:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %zero, %subreg.sub1
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY %zext.sub0
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
- ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY %zext.sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], %zext, implicit-def dead $vcc_lo, implicit $exec
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX10-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
- ; GFX10-NEXT: [[COPY8:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY9:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
- ; GFX10-NEXT: [[COPY10:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY7]], [[COPY8]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_3:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY9]], [[COPY10]], killed [[V_ADD_CO_U32_e64_3]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_2]], %subreg.sub0, [[V_ADDC_U32_e64_2]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U1:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[V_ADD_U]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_zext_vgpr_offset_neg4096
@@ -512,15 +464,9 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4097
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec_xnull = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64_xexec_xnull = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
; GFX9-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD_SADDR:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR [[REG_SEQUENCE]], [[V_MOV_B32_e32_]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD_SADDR:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR [[S_ADD_U]], [[V_MOV_B32_e32_]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD_SADDR]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4097
@@ -528,15 +474,9 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4097
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4097
@@ -544,15 +484,9 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4097
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4097
@@ -644,15 +578,9 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -2049
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_neg2049
@@ -742,15 +670,9 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_4294967296
@@ -758,15 +680,9 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_4294967296
@@ -774,15 +690,9 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_4294967296
@@ -790,15 +700,9 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX12-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX12-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX12-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 4294967296
@@ -824,15 +728,9 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294971390
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec_xnull = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64_xexec_xnull = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
; GFX9-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD_SADDR:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR [[REG_SEQUENCE]], [[V_MOV_B32_e32_]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD_SADDR:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR [[S_ADD_U]], [[V_MOV_B32_e32_]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD_SADDR]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_4294971390
@@ -840,15 +738,9 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294971390
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_4294971390
@@ -856,15 +748,9 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294971390
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_4294971390
@@ -872,15 +758,9 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX12-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294971390
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX12-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX12-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 4294971390
@@ -906,15 +786,9 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967295
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967295
@@ -922,15 +796,9 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967295
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967295
@@ -938,15 +806,9 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967295
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967295
@@ -954,15 +816,9 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX12-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967295
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX12-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX12-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 -4294967295
@@ -987,15 +843,9 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967296
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967296
@@ -1003,15 +853,9 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967296
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967296
@@ -1019,15 +863,9 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967296
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967296
@@ -1035,15 +873,9 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX12-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967296
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
- ; GFX12-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
- ; GFX12-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 -4294967296
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir
index d67c2a694c1d0..6fef1da521ff9 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir
@@ -1326,14 +1326,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2047
@@ -1341,14 +1335,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2047
@@ -1426,14 +1414,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1441,14 +1423,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1463,14 +1439,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1510,19 +1480,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1530,19 +1494,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1550,14 +1508,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1565,14 +1517,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1626,19 +1572,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1646,19 +1586,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1666,14 +1600,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1681,14 +1609,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1766,14 +1688,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1781,14 +1697,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1803,14 +1713,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1876,14 +1780,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1891,14 +1789,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1906,14 +1798,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1921,14 +1807,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1936,14 +1816,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1976,19 +1850,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1996,19 +1864,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2016,14 +1878,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2031,14 +1887,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2053,14 +1903,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -2100,19 +1944,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2120,19 +1958,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2140,14 +1972,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2155,14 +1981,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2177,14 +1997,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2250,14 +2064,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2265,14 +2073,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2280,14 +2082,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2295,14 +2091,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2310,14 +2100,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2376,14 +2160,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2391,14 +2169,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2406,14 +2178,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2421,14 +2187,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2436,14 +2196,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2476,19 +2230,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2496,19 +2244,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2516,14 +2258,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2531,14 +2267,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2546,14 +2276,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2561,14 +2285,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2576,14 +2294,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2616,19 +2328,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2636,19 +2342,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2656,14 +2356,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2671,14 +2365,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2686,14 +2374,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2701,14 +2383,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2716,14 +2392,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2782,14 +2452,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2797,14 +2461,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2812,14 +2470,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2827,14 +2479,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2842,14 +2488,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2908,14 +2548,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2923,14 +2557,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2938,14 +2566,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2953,14 +2575,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2968,14 +2584,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2983,14 +2593,8 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 16777214
@@ -3016,19 +2620,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3036,19 +2634,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3056,14 +2648,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3071,14 +2657,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3086,14 +2666,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3101,14 +2675,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3116,14 +2684,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -3156,19 +2718,13 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3176,19 +2732,13 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3196,14 +2746,8 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3211,14 +2755,8 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3226,14 +2764,8 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3241,14 +2773,8 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3256,14 +2782,8 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -3271,14 +2791,8 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
index 94104885748a9..59550bf5fe9c9 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
@@ -22,18 +22,21 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_B128_:%[0-9]+]]:vreg_128 = DS_READ_B128 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load (<4 x s32>), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ_B128_]]
+ ;
; GFX9-LABEL: name: load_local_v4s32_align16
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ_B128_gfx9_:%[0-9]+]]:vreg_128 = DS_READ_B128_gfx9 [[COPY]], 0, 0, implicit $exec :: (load (<4 x s32>), addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ_B128_gfx9_]]
+ ;
; GFX10-LABEL: name: load_local_v4s32_align16
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ_B128_gfx9_:%[0-9]+]]:vreg_128 = DS_READ_B128_gfx9 [[COPY]], 0, 0, implicit $exec :: (load (<4 x s32>), addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ_B128_gfx9_]]
+ ;
; GFX11-LABEL: name: load_local_v4s32_align16
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -64,18 +67,21 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[COPY]], 0, 1, 0, implicit $m0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
+ ;
; GFX9-LABEL: name: load_local_v4s32_align_8
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX10-LABEL: name: load_local_v4s32_align_8
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX11-LABEL: name: load_local_v4s32_align_8
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -106,18 +112,21 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[COPY]], 50, 51, 0, implicit $m0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
+ ;
; GFX9-LABEL: name: load_local_v4s32_align_8_offset_160
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 50, 51, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX10-LABEL: name: load_local_v4s32_align_8_offset_160
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 50, 51, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX11-LABEL: name: load_local_v4s32_align_8_offset_160
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -148,10 +157,11 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4000, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[V_ADD_CO_U32_e64_]], 0, 1, 0, implicit $m0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
+ ;
; GFX9-LABEL: name: load_local_v4s32_align_8_offset_320
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -160,6 +170,7 @@ body: |
; GFX9-NEXT: [[V_ADD_U32_e64_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[V_ADD_U32_e64_]], 0, 1, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX10-LABEL: name: load_local_v4s32_align_8_offset_320
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
@@ -168,6 +179,7 @@ body: |
; GFX10-NEXT: [[V_ADD_U32_e64_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[V_ADD_U32_e64_]], 0, 1, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX11-LABEL: name: load_local_v4s32_align_8_offset_320
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -202,18 +214,21 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[COPY]], 0, 1, 0, implicit $m0, implicit $exec :: (load (<2 x s64>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
+ ;
; GFX9-LABEL: name: load_local_v2s64
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<2 x s64>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX10-LABEL: name: load_local_v2s64
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<2 x s64>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX11-LABEL: name: load_local_v2s64
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -244,18 +259,21 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_128(<2 x p1>) = G_LOAD [[COPY]](p3) :: (load (<2 x p1>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<2 x p1>)
+ ;
; GFX9-LABEL: name: load_local_v2p1
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr(p3) = COPY $vgpr0
; GFX9-NEXT: [[LOAD:%[0-9]+]]:vreg_128(<2 x p1>) = G_LOAD [[COPY]](p3) :: (load (<2 x p1>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<2 x p1>)
+ ;
; GFX10-LABEL: name: load_local_v2p1
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr(p3) = COPY $vgpr0
; GFX10-NEXT: [[LOAD:%[0-9]+]]:vreg_128(<2 x p1>) = G_LOAD [[COPY]](p3) :: (load (<2 x p1>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<2 x p1>)
+ ;
; GFX11-LABEL: name: load_local_v2p1
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -286,18 +304,21 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_128(s128) = G_LOAD [[COPY]](p3) :: (load (s128), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](s128)
+ ;
; GFX9-LABEL: name: load_local_s128
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr(p3) = COPY $vgpr0
; GFX9-NEXT: [[LOAD:%[0-9]+]]:vreg_128(s128) = G_LOAD [[COPY]](p3) :: (load (s128), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](s128)
+ ;
; GFX10-LABEL: name: load_local_s128
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr(p3) = COPY $vgpr0
; GFX10-NEXT: [[LOAD:%[0-9]+]]:vreg_128(s128) = G_LOAD [[COPY]](p3) :: (load (s128), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](s128)
+ ;
; GFX11-LABEL: name: load_local_s128
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -328,18 +349,21 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[COPY]], 0, 1, 0, implicit $m0, implicit $exec :: (load (<8 x s16>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
+ ;
; GFX9-LABEL: name: load_local_v8s16
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<8 x s16>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX10-LABEL: name: load_local_v8s16
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<8 x s16>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
+ ;
; GFX11-LABEL: name: load_local_v8s16
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir
index 59c57a5fefbed..e03e0cb95e445 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir
@@ -740,7 +740,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -849,7 +849,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65536, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -859,7 +859,7 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65536, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -905,7 +905,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -1, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -915,7 +915,7 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -1, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -1021,7 +1021,7 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1020, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B32_:%[0-9]+]]:vreg_64 = DS_READ2_B32 [[V_ADD_CO_U32_e64_]], 0, 1, 0, implicit $m0, implicit $exec :: (load (s64), align 4, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_READ2_B32_]]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir
index df148638e7005..df72ad7bf0c0e 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir
@@ -312,7 +312,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2047, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -416,7 +416,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2048, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -469,7 +469,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -2047, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -522,7 +522,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -2048, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -575,7 +575,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -628,7 +628,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -683,7 +683,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -736,7 +736,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -4096, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -789,7 +789,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 8191, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -844,7 +844,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 8192, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -899,7 +899,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -8191, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -954,7 +954,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -8192, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1009,7 +1009,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 8388607, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1064,7 +1064,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 16777214, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1121,7 +1121,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -8388608, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1176,7 +1176,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -16777215, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1490,7 +1490,7 @@ body: |
; GFX6-LABEL: name: load_private_s32_from_1_fi_offset_4096
; GFX6: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 %stack.0, implicit $exec
; GFX6-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir
index 6f971788727b2..99fa052415a12 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir
@@ -46,25 +46,13 @@ regBankSelected: true
# Max immediate for CI
# SIVI: [[K:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 17179869180
-# SIVI-DAG: [[K_SUB0:%[0-9]+]]:sreg_32 = COPY [[K]].sub0
-# SIVI-DAG: [[PTR_LO:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub0
-# SIVI-DAG: [[ADD_PTR_LO:%[0-9]+]]:sreg_32 = S_ADD_U32 [[PTR_LO]], [[K_SUB0]]
-# SIVI-DAG: [[K_SUB1:%[0-9]+]]:sreg_32 = COPY [[K]].sub1
-# SIVI-DAG: [[PTR_HI:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub1
-# SIVI: [[ADD_PTR_HI:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[PTR_HI]], [[K_SUB1]]
-# SIVI: [[ADD_PTR:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[ADD_PTR_LO]], %subreg.sub0, [[ADD_PTR_HI]], %subreg.sub1
+# SIVI-DAG: [[ADD_PTR:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[PTR]], [[K]]
# SIVI: S_LOAD_DWORD_IMM [[ADD_PTR]], 0, 0
# CI: S_LOAD_DWORD_IMM_ci [[PTR]], 4294967295, 0
# Immediate overflow for CI
# GCN: [[K:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 17179869184
-# GCN-DAG: [[K_SUB0:%[0-9]+]]:sreg_32 = COPY [[K]].sub0
-# GCN-DAG: [[PTR_LO:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub0
-# GCN-DAG: [[ADD_PTR_LO:%[0-9]+]]:sreg_32 = S_ADD_U32 [[PTR_LO]], [[K_SUB0]]
-# GCN-DAG: [[K_SUB1:%[0-9]+]]:sreg_32 = COPY [[K]].sub1
-# GCN-DAG: [[PTR_HI:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub1
-# GCN: [[ADD_PTR_HI:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[PTR_HI]], [[K_SUB1]]
-# GCN: [[ADD_PTR:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[ADD_PTR_LO]], %subreg.sub0, [[ADD_PTR_HI]], %subreg.sub1
+# GCN-DAG: [[ADD_PTR:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[PTR]], [[K]]
# GCN: S_LOAD_DWORD_IMM [[ADD_PTR]], 0, 0
# Max 32-bit byte offset
@@ -74,13 +62,7 @@ regBankSelected: true
# Overflow 32-bit byte offset
# SIVI: [[K:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
-# SIVI-DAG: [[K_SUB0:%[0-9]+]]:sreg_32 = COPY [[K]].sub0
-# SIVI-DAG: [[PTR_LO:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub0
-# SIVI-DAG: [[ADD_PTR_LO:%[0-9]+]]:sreg_32 = S_ADD_U32 [[PTR_LO]], [[K_SUB0]]
-# SIVI-DAG: [[K_SUB1:%[0-9]+]]:sreg_32 = COPY [[K]].sub1
-# SIVI-DAG: [[PTR_HI:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub1
-# SIVI: [[ADD_PTR_HI:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[PTR_HI]], [[K_SUB1]]
-# SIVI: [[ADD_PTR:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[ADD_PTR_LO]], %subreg.sub0, [[ADD_PTR_HI]], %subreg.sub1
+# SIVI-DAG: [[ADD_PTR:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[PTR]], [[K]]
# SIVI: S_LOAD_DWORD_IMM [[ADD_PTR]], 0, 0
# CI: S_LOAD_DWORD_IMM_ci [[PTR]], 1073741824, 0
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-add3.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-add3.mir
index 681c366aa3411..8b65e719d169a 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-add3.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-add3.mir
@@ -131,8 +131,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[V_ADD_CO_U32_e64_]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[V_ADD_CO_U32_e64_]], [[COPY2]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_2]]
;
; GFX9-LABEL: name: add_p3_vgpr_vgpr_vgpr
@@ -168,8 +168,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[V_ADD_CO_U32_e64_]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[V_ADD_CO_U32_e64_]], [[COPY2]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_2]]
;
; GFX9-LABEL: name: add_p5_vgpr_vgpr_vgpr
@@ -206,7 +206,7 @@ body: |
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[V_ADD_CO_U32_e64_]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY2]], [[V_ADD_CO_U32_e64_]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_2]]
;
; GFX9-LABEL: name: add_p3_s32_vgpr_vgpr_vgpr
@@ -244,7 +244,7 @@ body: |
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[V_ADD_CO_U32_e64_]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY2]], [[V_ADD_CO_U32_e64_]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_2]]
;
; GFX9-LABEL: name: add_p5_s32_vgpr_vgpr_vgpr
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
index 41e416e3f5d72..12703b7b35f68 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
@@ -20,70 +20,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX6-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX6-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
;
; GFX8-LABEL: name: gep_p0_sgpr_sgpr
; GFX8: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
;
; GFX9-LABEL: name: gep_p0_sgpr_sgpr
; GFX9: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
;
; GFX10-WAVE64-LABEL: name: gep_p0_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX10-WAVE64-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE64-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
;
; GFX10-WAVE32-LABEL: name: gep_p0_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX10-WAVE32-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE32-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
%0:sgpr(p0) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = COPY $sgpr2_sgpr3
%2:sgpr(p0) = G_PTR_ADD %0, %1
@@ -104,70 +74,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX8-LABEL: name: gep_p0_vgpr_vgpr
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX9-LABEL: name: gep_p0_vgpr_vgpr
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX10-WAVE64-LABEL: name: gep_p0_vgpr_vgpr
; GFX10-WAVE64: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE64-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-WAVE64-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE64-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX10-WAVE32-LABEL: name: gep_p0_vgpr_vgpr
; GFX10-WAVE32: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE32-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-WAVE32-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE32-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = COPY $vgpr2_vgpr3
%2:vgpr(p0) = G_PTR_ADD %0, %1
@@ -188,70 +128,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX8-LABEL: name: gep_p0_sgpr_vgpr
; GFX8: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX9-LABEL: name: gep_p0_sgpr_vgpr
; GFX9: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX10-WAVE64-LABEL: name: gep_p0_sgpr_vgpr
; GFX10-WAVE64: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE64-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-WAVE64-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE64-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX10-WAVE32-LABEL: name: gep_p0_sgpr_vgpr
; GFX10-WAVE32: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE32-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-WAVE32-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE32-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
%0:sgpr(p0) = COPY $sgpr0_sgpr1
%1:vgpr(s64) = COPY $vgpr0_vgpr1
%2:vgpr(p0) = G_PTR_ADD %0, %1
@@ -272,40 +182,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX6-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX8-LABEL: name: gep_p3_sgpr_sgpr
; GFX8: liveins: $sgpr0, $sgpr1
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX8-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX9-LABEL: name: gep_p3_sgpr_sgpr
; GFX9: liveins: $sgpr0, $sgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX9-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX10-WAVE64-LABEL: name: gep_p3_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0, $sgpr1
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX10-WAVE64-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX10-WAVE32-LABEL: name: gep_p3_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0, $sgpr1
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX10-WAVE32-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
%0:sgpr(p3) = COPY $sgpr0
%1:sgpr(s32) = COPY $sgpr1
%2:sgpr(p3) = G_PTR_ADD %0, %1
@@ -326,7 +236,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_]]
;
; GFX8-LABEL: name: gep_p3_vgpr_vgpr
@@ -334,7 +244,7 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_]]
;
; GFX9-LABEL: name: gep_p3_vgpr_vgpr
@@ -380,7 +290,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_]]
;
; GFX8-LABEL: name: gep_p3_sgpr_vgpr
@@ -388,7 +298,7 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_]]
;
; GFX9-LABEL: name: gep_p3_sgpr_vgpr
@@ -434,40 +344,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX6-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX8-LABEL: name: gep_p6_sgpr_sgpr
; GFX8: liveins: $sgpr0, $sgpr1
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX8-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX9-LABEL: name: gep_p6_sgpr_sgpr
; GFX9: liveins: $sgpr0, $sgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX9-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX10-WAVE64-LABEL: name: gep_p6_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0, $sgpr1
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX10-WAVE64-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX10-WAVE32-LABEL: name: gep_p6_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0, $sgpr1
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX10-WAVE32-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
%0:sgpr(p6) = COPY $sgpr0
%1:sgpr(s32) = COPY $sgpr1
%2:sgpr(p6) = G_PTR_ADD %0, %1
@@ -488,40 +398,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX6-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX8-LABEL: name: gep_p2_sgpr_sgpr
; GFX8: liveins: $sgpr0, $sgpr1
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX8-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX9-LABEL: name: gep_p2_sgpr_sgpr
; GFX9: liveins: $sgpr0, $sgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX9-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX10-WAVE64-LABEL: name: gep_p2_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0, $sgpr1
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX10-WAVE64-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
;
; GFX10-WAVE32-LABEL: name: gep_p2_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0, $sgpr1
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
+ ; GFX10-WAVE32-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
%0:sgpr(p2) = COPY $sgpr0
%1:sgpr(s32) = COPY $sgpr1
%2:sgpr(p2) = G_PTR_ADD %0, %1
@@ -542,70 +452,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX6-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX6-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
;
; GFX8-LABEL: name: gep_p999_sgpr_sgpr
; GFX8: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
;
; GFX9-LABEL: name: gep_p999_sgpr_sgpr
; GFX9: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
;
; GFX10-WAVE64-LABEL: name: gep_p999_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX10-WAVE64-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE64-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
;
; GFX10-WAVE32-LABEL: name: gep_p999_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
- ; GFX10-WAVE32-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
- ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE32-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
%0:sgpr(p999) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = COPY $sgpr2_sgpr3
%2:sgpr(p999) = G_PTR_ADD %0, %1
@@ -626,70 +506,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX8-LABEL: name: gep_p999_vgpr_vgpr
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX9-LABEL: name: gep_p999_vgpr_vgpr
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX10-WAVE64-LABEL: name: gep_p999_vgpr_vgpr
; GFX10-WAVE64: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE64-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-WAVE64-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE64-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
;
; GFX10-WAVE32-LABEL: name: gep_p999_vgpr_vgpr
; GFX10-WAVE32: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
- ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
- ; GFX10-WAVE32-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-WAVE32-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
+ ; GFX10-WAVE32-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
%0:vgpr(p999) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = COPY $vgpr2_vgpr3
%2:vgpr(p999) = G_PTR_ADD %0, %1
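
The [SV]_ADD_U64_PSEUDO instructions that the updated checks match are single-instruction 64-bit add pseudos; they are expanded after register allocation into the same 32-bit add/add-with-carry chains that the removed check lines matched inline. An illustrative MIR sketch of the SALU expansion (hypothetical %a/%b operands, not taken from the patch):

  ; selected form, as matched by the new check lines
  %sum:sreg_64 = S_ADD_U64_PSEUDO %a, %b, implicit-def dead $scc
  ; after post-RA pseudo expansion, equivalent to the removed check lines
  %lo:sreg_32 = S_ADD_U32 %a.sub0, %b.sub0, implicit-def $scc
  %hi:sreg_32 = S_ADDC_U32 %a.sub1, %b.sub1, implicit-def dead $scc, implicit $scc
  %sum:sreg_64 = REG_SEQUENCE %lo, %subreg.sub0, %hi, %subreg.sub1
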
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sextload-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sextload-local.mir
index 37958480d28a5..778b1d8fe6234 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sextload-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sextload-local.mir
@@ -113,7 +113,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_I8_:%[0-9]+]]:vgpr_32 = DS_READ_I8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_I8_]]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir
index 6e92d851dee2e..92b5d6da6597c 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir
@@ -1175,14 +1175,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_gep_2047
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1190,14 +1184,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_gep_2047
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1212,14 +1200,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_gep_2047
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1259,14 +1241,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1274,14 +1250,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1289,14 +1259,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX10-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1304,14 +1268,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1319,14 +1277,8 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX12-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1359,14 +1311,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1374,14 +1320,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1389,14 +1329,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX10-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1404,14 +1338,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1419,14 +1347,8 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX12-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1434,14 +1356,8 @@ body: |
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 16777214
@@ -1467,14 +1383,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1482,14 +1392,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1497,14 +1401,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX10-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1512,14 +1410,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1527,14 +1419,8 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX12-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1567,14 +1453,8 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1582,14 +1462,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1597,14 +1471,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX10-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1612,14 +1480,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1627,14 +1489,8 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX11-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX11-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX12-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1642,14 +1498,8 @@ body: |
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 -16777215
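
The VALU pseudo follows the same pattern; its dead carry operand is $vcc for the wave64 run lines above (GFX7-9) and $vcc_lo for the wave32 ones (GFX10-12). An illustrative sketch of its post-RA expansion (hypothetical %a/%b operands, wave64 case, not taken from the patch):

  ; selected form, as matched by the new check lines
  %sum:vreg_64 = V_ADD_U64_PSEUDO %a, %b, implicit-def dead $vcc, implicit $exec
  ; after post-RA pseudo expansion, equivalent to the removed check lines
  %lo:vgpr_32, %carry:sreg_64_xexec = V_ADD_CO_U32_e64 %a.sub0, %b.sub0, 0, implicit $exec
  %hi:vgpr_32, dead %c2:sreg_64_xexec = V_ADDC_U32_e64 %a.sub1, %b.sub1, killed %carry, 0, implicit $exec
  %sum:vreg_64 = REG_SEQUENCE %lo, %subreg.sub0, %hi, %subreg.sub1
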
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir
index 9136f44dfc227..d704dee43c3e1 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir
@@ -1289,14 +1289,8 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_gep_2047
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1304,14 +1298,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_gep_2047
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1384,14 +1372,8 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_to_1_gep_24bit_max
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1399,14 +1381,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_to_1_gep_24bit_max
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1414,14 +1390,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX10-LABEL: name: store_global_s32_to_1_gep_24bit_max
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1429,14 +1399,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX12-LABEL: name: store_global_s32_to_1_gep_24bit_max
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1468,19 +1432,13 @@ body: |
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX7-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1488,19 +1446,13 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX7-FLAT-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX7-FLAT: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1508,14 +1460,8 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1523,14 +1469,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1538,14 +1478,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX10-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1553,14 +1487,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX12-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1619,14 +1547,8 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_to_1_gep_2x_24bit_max
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1634,14 +1556,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_to_1_gep_2x_24bit_max
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1649,14 +1565,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX10-LABEL: name: store_global_s32_to_1_gep_2x_24bit_max
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1664,14 +1574,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX12-LABEL: name: store_global_s32_to_1_gep_2x_24bit_max
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1679,14 +1583,8 @@ body: |
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 16777214
@@ -1712,19 +1610,13 @@ body: |
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX6-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX7-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1732,19 +1624,13 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
+ ; GFX7-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX7-FLAT-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX7-FLAT: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1752,14 +1638,8 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1767,14 +1647,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1782,14 +1656,8 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX9-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX9-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX10-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1797,14 +1665,8 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX10-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX12-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1812,14 +1674,8 @@ body: |
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
- ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
- ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
- ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
- ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
- ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
- ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
- ; GFX12-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
+ ; GFX12-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir
index 1b7c0fcb76714..7b609c44d78cd 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir
@@ -27,6 +27,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s32), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s32_to_4
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -34,12 +35,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s32), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s32_to_4
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B32_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (s32), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s32_to_4
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -73,6 +76,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B16 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s16), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s32_to_2
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -80,12 +84,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B16 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s16), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s32_to_2
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B16_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (s16), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s32_to_2
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -119,6 +125,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B8 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s32_to_1
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -126,12 +133,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B8 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s32_to_1
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B8_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s32_to_1
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -165,6 +174,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<2 x s16>), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_v2s16
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -172,12 +182,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<2 x s16>), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_v2s16
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B32_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (<2 x s16>), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_v2s16
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -211,6 +223,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (p3), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_p3
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -218,12 +231,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (p3), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_p3
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B32_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (p3), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_p3
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -251,15 +266,18 @@ body: |
; GFX6-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B8 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s32_to_1_constant_4095
; GFX7: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
; GFX7-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B8 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s32_to_1_constant_4095
; GFX9: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
; GFX9-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX9-NEXT: DS_WRITE_B8_gfx9 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s32_to_1_constant_4095
; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
; GFX10-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
@@ -290,15 +308,18 @@ body: |
; GFX6-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B8 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s32_to_1_constant_4096
; GFX7: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
; GFX7-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B8 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s32_to_1_constant_4096
; GFX9: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
; GFX9-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX9-NEXT: DS_WRITE_B8_gfx9 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $exec :: (store (s8), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s32_to_1_constant_4096
; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
; GFX10-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
@@ -330,6 +351,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr(p3) = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](s64), [[COPY1]](p3) :: (store (s64), align 4, addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s64_align4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -339,6 +361,7 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (s64), align 4, addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s64_align4
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -347,6 +370,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (s64), align 4, addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s64_align4
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -382,6 +406,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr(p3) = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](p1), [[COPY1]](p3) :: (store (p1), align 4, addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_p1_align4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -391,6 +416,7 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (p1), align 4, addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_p1_align4
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -399,6 +425,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (p1), align 4, addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_p1_align4
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -434,6 +461,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr(p3) = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](<2 x s32>), [[COPY1]](p3) :: (store (<2 x s32>), align 4, addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_v2s32_align4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -443,6 +471,7 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (<2 x s32>), align 4, addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_v2s32_align4
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -451,6 +480,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (<2 x s32>), align 4, addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_v2s32_align4
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -486,6 +516,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr(p3) = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](<4 x s16>), [[COPY1]](p3) :: (store (<4 x s16>), align 4, addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_v4s16_align4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -495,6 +526,7 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (<4 x s16>), align 4, addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_v4s16_align4
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -503,6 +535,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (<4 x s16>), align 4, addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_v4s16_align4
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -538,6 +571,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s64), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s64_align8
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -545,12 +579,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s64), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s64_align8
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: DS_WRITE_B64_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (s64), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s64_align8
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -584,6 +620,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (p1), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_p1_align8
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -591,12 +628,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (p1), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_p1_align8
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: DS_WRITE_B64_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (p1), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_p1_align8
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -630,6 +669,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<2 x s32>), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_v2s32_align8
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -637,12 +677,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<2 x s32>), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_v2s32_align8
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: DS_WRITE_B64_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (<2 x s32>), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_v2s32_align8
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -676,6 +718,7 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<4 x s16>), addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_v4s16_align8
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -683,12 +726,14 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<4 x s16>), addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_v4s16_align8
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: DS_WRITE_B64_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (<4 x s16>), addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_v4s16_align8
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -724,6 +769,7 @@ body: |
; GFX6-NEXT: [[PTR_ADD:%[0-9]+]]:vgpr(p3) = G_PTR_ADD [[COPY1]], [[C]](s32)
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](s64), [[PTR_ADD]](p3) :: (store (s64), align 4, addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s64_align4_from_1_gep_1016
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -733,6 +779,7 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 254, 255, 0, implicit $m0, implicit $exec :: (store (s64), align 4, addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s64_align4_from_1_gep_1016
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -741,6 +788,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 254, 255, 0, implicit $exec :: (store (s64), align 4, addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s64_align4_from_1_gep_1016
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -780,17 +828,19 @@ body: |
; GFX6-NEXT: [[PTR_ADD:%[0-9]+]]:vgpr(p3) = G_PTR_ADD [[COPY1]], [[C]](s32)
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](s64), [[PTR_ADD]](p3) :: (store (s64), align 4, addrspace 3)
+ ;
; GFX7-LABEL: name: store_local_s64_align4_from_1_gep_1020
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1020, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY1]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[V_ADD_CO_U32_e64_]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (s64), align 4, addrspace 3)
+ ;
; GFX9-LABEL: name: store_local_s64_align4_from_1_gep_1020
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -801,6 +851,7 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[V_ADD_U32_e64_]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (s64), align 4, addrspace 3)
+ ;
; GFX10-LABEL: name: store_local_s64_align4_from_1_gep_1020
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir
index 60357abbc7721..4c966e474dedf 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir
@@ -1006,7 +1006,7 @@ body: |
; GFX6-NEXT: [[V_LSHRREV_B32_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B32_e64 6, $sgpr32, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 4096
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]]
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[V_LSHRREV_B32_e64_]], [[COPY1]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[V_LSHRREV_B32_e64_]], [[COPY1]], 0, implicit $exec
; GFX6-NEXT: BUFFER_STORE_DWORD_OFFEN [[COPY]], [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 5)
;
; GFX9-LABEL: name: function_store_private_s32_to_4_wave_address_offset_copy_constant_4096
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zextload-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zextload-local.mir
index 29671c13e173f..0fad45e817f51 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zextload-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zextload-local.mir
@@ -113,7 +113,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-relocs.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-relocs.ll
index 0b9f31e3a765e..e62cfda72eefb 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-relocs.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-relocs.ll
@@ -5,7 +5,7 @@
@lds.defined = unnamed_addr addrspace(3) global [8 x i32] poison, align 8
; GCN-LABEL: {{^}}test_basic:
-; GCN: s_add_u32 s0, lds.defined@abs32@lo, s0 ; encoding: [0xff,0x00,0x00,0x80,A,A,A,A]
+; GCN: s_add_i32 s0, s0, lds.defined@abs32@lo ; encoding: [0x00,0xff,0x00,0x81,A,A,A,A]
; GCN: v_mov_b32_e32 v2, s0 ; encoding: [0x00,0x02,0x04,0x7e]
; GCN: .globl lds.external
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll
index 39dde4bc86bec..7a82ef94c78f3 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll
@@ -15,10 +15,10 @@ define amdgpu_kernel void @load_zeroinit_lds_global(ptr addrspace(1) %out, i1 %p
; GCN: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr4_sgpr5
; GFX8: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 40
; GCN: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 target-flags(amdgpu-abs32-lo) @lds
- ; GFX8: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[S_MOV_B32_1]], [[S_MOV_B32_]], implicit-def dead $scc
+ ; GFX8: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = nuw S_ADD_I32 [[S_MOV_B32_1]], [[S_MOV_B32_]], implicit-def dead $scc
; GFX8: [[S_LOAD_DWORDX2_IMM:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM [[COPY]], 9, 0
; GFX9: [[S_LOAD_DWORDX2_IMM:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM [[COPY]], 36, 0
- ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[S_ADD_U32_]]
+ ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[S_ADD_I32_]]
; GCN: $m0 = S_MOV_B32 -1
; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_1]]
; GFX8: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[COPY1]], 0, 0, implicit $m0, implicit $exec
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
index 07d5ff2036d93..8167f071be05d 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
@@ -96,10 +96,12 @@ define amdgpu_ps void @mubuf_store_sgpr_ptr_offset4294967296(ptr addrspace(1) in
;
; GFX12-LABEL: mubuf_store_sgpr_ptr_offset4294967296:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_add_co_u32 s0, s2, 0
-; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: v_dual_mov_b32 v2, 0 :: v_dual_mov_b32 v1, s1
+; GFX12-NEXT: s_mov_b32 s0, 0
+; GFX12-NEXT: s_mov_b32 s1, 4
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: v_mov_b32_e32 v2, 0
+; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-NEXT: global_store_b32 v[0:1], v2, off
; GFX12-NEXT: s_endpgm
%gep = getelementptr i32, ptr addrspace(1) %ptr, i64 4294967296
@@ -134,10 +136,12 @@ define amdgpu_ps void @mubuf_store_sgpr_ptr_offset4294967297(ptr addrspace(1) in
;
; GFX12-LABEL: mubuf_store_sgpr_ptr_offset4294967297:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_add_co_u32 s0, s2, 4
-; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: v_dual_mov_b32 v2, 0 :: v_dual_mov_b32 v1, s1
+; GFX12-NEXT: s_mov_b32 s0, 4
+; GFX12-NEXT: s_mov_b32 s1, 4
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: v_mov_b32_e32 v2, 0
+; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-NEXT: global_store_b32 v[0:1], v2, off
; GFX12-NEXT: s_endpgm
%gep = getelementptr i32, ptr addrspace(1) %ptr, i64 4294967297
@@ -343,8 +347,7 @@ define amdgpu_ps void @mubuf_store_sgpr_ptr_sgpr_offset(ptr addrspace(1) inreg %
; GFX12-NEXT: v_mov_b32_e32 v0, 0
; GFX12-NEXT: s_lshl_b64 s[0:1], s[4:5], 2
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_co_u32 s0, s2, s0
-; GFX12-NEXT: s_add_co_ci_u32 s1, s3, s1
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-NEXT: global_store_b32 v0, v0, s[0:1]
; GFX12-NEXT: s_endpgm
%gep = getelementptr i32, ptr addrspace(1) %ptr, i32 %soffset
@@ -691,9 +694,10 @@ define amdgpu_ps float @mubuf_load_sgpr_ptr_offset4294967296(ptr addrspace(1) in
;
; GFX12-LABEL: mubuf_load_sgpr_ptr_offset4294967296:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_add_co_u32 s0, s2, 0
-; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_mov_b32 s0, 0
+; GFX12-NEXT: s_mov_b32 s1, 4
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-NEXT: global_load_b32 v0, v[0:1], off scope:SCOPE_SYS
; GFX12-NEXT: s_wait_loadcnt 0x0
@@ -730,9 +734,10 @@ define amdgpu_ps float @mubuf_load_sgpr_ptr_offset4294967297(ptr addrspace(1) in
;
; GFX12-LABEL: mubuf_load_sgpr_ptr_offset4294967297:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_add_co_u32 s0, s2, 4
-; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_mov_b32 s0, 4
+; GFX12-NEXT: s_mov_b32 s1, 4
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-NEXT: global_load_b32 v0, v[0:1], off scope:SCOPE_SYS
; GFX12-NEXT: s_wait_loadcnt 0x0
@@ -941,8 +946,7 @@ define amdgpu_ps float @mubuf_load_sgpr_ptr_sgpr_offset(ptr addrspace(1) inreg %
; GFX12-NEXT: v_mov_b32_e32 v0, 0
; GFX12-NEXT: s_lshl_b64 s[0:1], s[4:5], 2
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_co_u32 s0, s2, s0
-; GFX12-NEXT: s_add_co_ci_u32 s1, s3, s1
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-NEXT: global_load_b32 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-NEXT: s_wait_loadcnt 0x0
; GFX12-NEXT: ; return to shader part epilog
@@ -1272,11 +1276,12 @@ define amdgpu_ps float @mubuf_atomicrmw_sgpr_ptr_offset4294967296(ptr addrspace(
;
; GFX12-LABEL: mubuf_atomicrmw_sgpr_ptr_offset4294967296:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_add_co_u32 s0, s2, 0
-; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
+; GFX12-NEXT: s_mov_b32 s0, 0
+; GFX12-NEXT: s_mov_b32 s1, 4
+; GFX12-NEXT: v_mov_b32_e32 v2, 2
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
-; GFX12-NEXT: v_mov_b32_e32 v2, 2
; GFX12-NEXT: global_atomic_add_u32 v0, v[0:1], v2, off th:TH_ATOMIC_RETURN scope:SCOPE_DEV
; GFX12-NEXT: s_wait_loadcnt 0x0
; GFX12-NEXT: global_inv scope:SCOPE_DEV
@@ -1508,8 +1513,10 @@ define amdgpu_ps float @mubuf_cmpxchg_sgpr_ptr_offset4294967296(ptr addrspace(1)
;
; GFX12-LABEL: mubuf_cmpxchg_sgpr_ptr_offset4294967296:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_add_co_u32 s0, s2, 0
-; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
+; GFX12-NEXT: s_mov_b32 s0, 0
+; GFX12-NEXT: s_mov_b32 s1, 4
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-NEXT: v_mov_b32_e32 v2, v0
; GFX12-NEXT: v_dual_mov_b32 v4, s1 :: v_dual_mov_b32 v3, s0
; GFX12-NEXT: global_atomic_cmpswap_b32 v0, v[3:4], v[1:2], off th:TH_ATOMIC_RETURN scope:SCOPE_DEV
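
For reference, the large-immediate hunks above all follow the same arithmetic: the i64 GEP index is scaled by the element size and the resulting 64-bit byte offset is split into two 32-bit halves that feed the s_mov_b32 pair and s_add_nc_u64. A minimal sketch (hypothetical function name, not part of the patch):

; 4294967297 i32 elements = 4294967297 * 4 bytes = 0x4_0000_0004,
; so lo32 = 4 and hi32 = 4 -- the two s_mov_b32 immediates checked above.
define amdgpu_ps void @sketch_offset_split(ptr addrspace(1) inreg %ptr) {
  %gep = getelementptr i32, ptr addrspace(1) %ptr, i64 4294967297
  store i32 0, ptr addrspace(1) %gep
  ret void
}
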
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll
index 8bb060fc4ed54..6d5b6d6fb1df2 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll
@@ -34,13 +34,13 @@ define amdgpu_kernel void @kernel_non_entry_block_static_alloca_uniformly_reache
; GCN-NEXT: v_mov_b32_e32 v3, 1
; GCN-NEXT: s_waitcnt lgkmcnt(0)
; GCN-NEXT: s_lshl_b32 s4, s4, 2
-; GCN-NEXT: s_add_u32 s4, s6, s4
+; GCN-NEXT: s_add_i32 s4, s6, s4
; GCN-NEXT: buffer_store_dword v1, v2, s[0:3], 0 offen
; GCN-NEXT: buffer_store_dword v3, v2, s[0:3], 0 offen offset:4
; GCN-NEXT: v_mov_b32_e32 v2, s4
; GCN-NEXT: buffer_load_dword v2, v2, s[0:3], 0 offen
; GCN-NEXT: s_load_dwordx2 s[4:5], s[8:9], 0x0
-; GCN-NEXT: s_add_u32 s32, s6, 0x1000
+; GCN-NEXT: s_add_i32 s32, s6, 0x1000
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: v_add_u32_e32 v0, v2, v0
; GCN-NEXT: s_waitcnt lgkmcnt(0)
@@ -95,20 +95,20 @@ define amdgpu_kernel void @kernel_non_entry_block_static_alloca_uniformly_reache
; GCN-NEXT: s_cbranch_scc1 .LBB1_2
; GCN-NEXT: ; %bb.1: ; %bb.0
; GCN-NEXT: s_load_dword s4, s[8:9], 0xc
-; GCN-NEXT: s_add_u32 s5, s32, 0xfff
+; GCN-NEXT: s_add_i32 s5, s32, 0xfff
; GCN-NEXT: s_and_b32 s6, s5, 0xfffff000
; GCN-NEXT: v_mov_b32_e32 v1, 0
; GCN-NEXT: v_mov_b32_e32 v2, s6
; GCN-NEXT: s_waitcnt lgkmcnt(0)
; GCN-NEXT: s_lshl_b32 s4, s4, 2
; GCN-NEXT: v_mov_b32_e32 v3, 1
-; GCN-NEXT: s_add_u32 s4, s6, s4
+; GCN-NEXT: s_add_i32 s4, s6, s4
; GCN-NEXT: buffer_store_dword v1, v2, s[0:3], 0 offen
; GCN-NEXT: buffer_store_dword v3, v2, s[0:3], 0 offen offset:4
; GCN-NEXT: v_mov_b32_e32 v2, s4
; GCN-NEXT: buffer_load_dword v2, v2, s[0:3], 0 offen
; GCN-NEXT: s_load_dwordx2 s[4:5], s[8:9], 0x0
-; GCN-NEXT: s_add_u32 s32, s6, 0x1000
+; GCN-NEXT: s_add_i32 s32, s6, 0x1000
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: v_add_u32_e32 v0, v2, v0
; GCN-NEXT: s_waitcnt lgkmcnt(0)
@@ -171,7 +171,7 @@ define void @func_non_entry_block_static_alloca_align4(ptr addrspace(1) %out, i3
; GCN-NEXT: v_add_u32_e32 v2, s6, v2
; GCN-NEXT: buffer_load_dword v2, v2, s[0:3], 0 offen
; GCN-NEXT: v_and_b32_e32 v3, 0x3ff, v31
-; GCN-NEXT: s_add_u32 s32, s6, 0x1000
+; GCN-NEXT: s_add_i32 s32, s6, 0x1000
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: v_add_u32_e32 v2, v2, v3
; GCN-NEXT: global_store_dword v[0:1], v2, off
@@ -224,7 +224,7 @@ define void @func_non_entry_block_static_alloca_align64(ptr addrspace(1) %out, i
; GCN-NEXT: s_and_saveexec_b64 s[4:5], vcc
; GCN-NEXT: s_cbranch_execz .LBB3_2
; GCN-NEXT: ; %bb.1: ; %bb.0
-; GCN-NEXT: s_add_u32 s6, s32, 0xfff
+; GCN-NEXT: s_add_i32 s6, s32, 0xfff
; GCN-NEXT: s_and_b32 s6, s6, 0xfffff000
; GCN-NEXT: v_mov_b32_e32 v2, 0
; GCN-NEXT: v_mov_b32_e32 v4, s6
@@ -235,7 +235,7 @@ define void @func_non_entry_block_static_alloca_align64(ptr addrspace(1) %out, i
; GCN-NEXT: v_add_u32_e32 v2, s6, v2
; GCN-NEXT: buffer_load_dword v2, v2, s[0:3], 0 offen
; GCN-NEXT: v_and_b32_e32 v3, 0x3ff, v31
-; GCN-NEXT: s_add_u32 s32, s6, 0x1000
+; GCN-NEXT: s_add_i32 s32, s6, 0x1000
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: v_add_u32_e32 v2, v2, v3
; GCN-NEXT: global_store_dword v[0:1], v2, off
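
Every hunk in this file replaces a 32-bit scratch-address or stack-pointer add. A hedged sketch of the IR shape these tests exercise (illustrative names only, not the test's exact body):

define void @sketch_nonentry_alloca(i1 %cond) {
entry:
  br i1 %cond, label %bb.0, label %bb.1
bb.0:
  ; a static alloca in a non-entry block forces a dynamic SP bump,
  ; the uniform 32-bit add now selected as s_add_i32 above
  %alloca = alloca [16 x i32], align 4, addrspace(5)
  store volatile i32 1, ptr addrspace(5) %alloca
  br label %bb.1
bb.1:
  ret void
}
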
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll
index 6e8a5a1266a15..e88b036f6402a 100644
--- a/llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll
@@ -103,38 +103,38 @@ define amdgpu_cs_chain void @amdgpu_cs_chain_spill(<24 x i32> inreg %sgprs, <24
; GISEL-GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GISEL-GFX11-NEXT: s_mov_b32 s32, 0
; GISEL-GFX11-NEXT: v_dual_mov_b32 v32, v8 :: v_dual_mov_b32 v33, v9
-; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 4
+; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 4
; GISEL-GFX11-NEXT: scratch_store_b32 off, v16, s32
; GISEL-GFX11-NEXT: scratch_store_b32 off, v17, s24
-; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 8
-; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 12
+; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 8
+; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 12
; GISEL-GFX11-NEXT: scratch_store_b32 off, v18, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v19, s25
-; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 16
-; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 20
+; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 16
+; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 20
; GISEL-GFX11-NEXT: scratch_store_b32 off, v20, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v21, s25
-; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 24
-; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 28
+; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 24
+; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 28
; GISEL-GFX11-NEXT: scratch_store_b32 off, v22, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v23, s25
-; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 32
-; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 36
+; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 32
+; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 36
; GISEL-GFX11-NEXT: scratch_store_b32 off, v24, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v25, s25
-; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 40
+; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 40
; GISEL-GFX11-NEXT: v_dual_mov_b32 v34, v10 :: v_dual_mov_b32 v35, v11
; GISEL-GFX11-NEXT: v_dual_mov_b32 v36, v12 :: v_dual_mov_b32 v37, v13
; GISEL-GFX11-NEXT: v_dual_mov_b32 v38, v14 :: v_dual_mov_b32 v39, v15
-; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 44
+; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 44
; GISEL-GFX11-NEXT: scratch_store_b32 off, v26, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v27, s25
-; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 48
-; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 52
+; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 48
+; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 52
; GISEL-GFX11-NEXT: scratch_store_b32 off, v28, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v29, s25
-; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 56
-; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 60
+; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 56
+; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 60
; GISEL-GFX11-NEXT: scratch_store_b32 off, v30, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v31, s25
; GISEL-GFX11-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
diff --git a/llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll b/llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll
index e74fd21365c9d..76ff285fdab54 100644
--- a/llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll
+++ b/llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll
@@ -179,7 +179,7 @@ define amdgpu_kernel void @ds_cond_sub_no_rtn_u32(ptr addrspace(3) %addr, i32 %i
; GFX12-GISEL: ; %bb.0: ; %entry
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, -16
+; GFX12-GISEL-NEXT: s_add_co_i32 s0, s0, -16
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v1, s1 :: v_dual_mov_b32 v0, s0
; GFX12-GISEL-NEXT: ds_cond_sub_rtn_u32 v0, v0, v1
@@ -205,7 +205,7 @@ define amdgpu_kernel void @ds_cond_sub_no_rtn_u32_forced(ptr addrspace(3) %addr,
; GFX12-GISEL: ; %bb.0: ; %entry
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, -16
+; GFX12-GISEL-NEXT: s_add_co_i32 s0, s0, -16
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v1, s1 :: v_dual_mov_b32 v0, s0
; GFX12-GISEL-NEXT: ds_cond_sub_u32 v0, v1
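
The two hunks above change only the 32-bit LDS address decrement feeding the cond_sub operation. A sketch of just that address computation, with a plain store standing in for the intrinsic (hypothetical function name):

define amdgpu_kernel void @sketch_lds_decrement(ptr addrspace(3) %addr, i32 %val) {
entry:
  ; -16 bytes on a 32-bit addrspace(3) pointer: a uniform 32-bit add,
  ; now s_add_co_i32 in the checks above
  %gep = getelementptr i8, ptr addrspace(3) %addr, i32 -16
  store i32 %val, ptr addrspace(3) %gep
  ret void
}
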
diff --git a/llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll b/llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
index 9ae7c4aaa1e95..8b578add0fad5 100644
--- a/llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
+++ b/llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
@@ -38,7 +38,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform(i32 %n) {
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s5, 6
; GFX9-GISEL-NEXT: s_mov_b32 s33, 0
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -77,7 +77,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform(i32 %n) {
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, -16
; GFX11-GISEL-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s0, s1
+; GFX11-GISEL-NEXT: s_add_i32 s32, s0, s1
; GFX11-GISEL-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, addrspace(5)
store volatile i32 123, ptr addrspace(5) %alloca
@@ -112,7 +112,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_over_aligned(i
; GFX9-GISEL-NEXT: s_movk_i32 s32, 0x2000
; GFX9-GISEL-NEXT: s_add_u32 s0, s0, s17
; GFX9-GISEL-NEXT: s_addc_u32 s1, s1, 0
-; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0x1fff
+; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0x1fff
; GFX9-GISEL-NEXT: s_waitcnt lgkmcnt(0)
; GFX9-GISEL-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xffffe000
@@ -121,7 +121,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_over_aligned(i
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s4, 6
; GFX9-GISEL-NEXT: s_mov_b32 s33, 0
-; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -151,7 +151,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_over_aligned(i
; GFX11-GISEL-NEXT: s_load_b32 s0, s[4:5], 0x0
; GFX11-GISEL-NEXT: s_movk_i32 s32, 0x80
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 10
-; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0xfff
+; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0xfff
; GFX11-GISEL-NEXT: s_mov_b32 s33, 0
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff000
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
@@ -162,7 +162,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_over_aligned(i
; GFX11-GISEL-NEXT: s_and_b32 s0, s0, -16
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 128, addrspace(5)
store volatile i32 10, ptr addrspace(5) %alloca
@@ -203,7 +203,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_under_aligned(
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s5, 6
; GFX9-GISEL-NEXT: s_mov_b32 s33, 0
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -242,7 +242,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_under_aligned(
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, -16
; GFX11-GISEL-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s0, s1
+; GFX11-GISEL-NEXT: s_add_i32 s32, s0, s1
; GFX11-GISEL-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 2, addrspace(5)
store volatile i32 22, ptr addrspace(5) %alloca
@@ -299,7 +299,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent() {
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x7b
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -354,7 +354,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent() {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_endpgm
@@ -411,12 +411,12 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent_over_aligned
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[4:5], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB4_1
; GFX9-GISEL-NEXT: ; %bb.2:
-; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0x1fff
+; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0x1fff
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xffffe000
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s6, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x1bc
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
-; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -470,11 +470,11 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent_over_aligned
; GFX11-GISEL-NEXT: s_cbranch_scc1 .LBB4_1
; GFX11-GISEL-NEXT: ; %bb.2:
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 0x1bc
-; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0xfff
+; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0xfff
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff000
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_endpgm
@@ -533,7 +533,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent_under_aligne
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x29a
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -586,7 +586,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent_under_aligne
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_endpgm
@@ -668,10 +668,10 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX9-GISEL-NEXT: s_lshl2_add_u32 s5, s5, 15
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, -16
; GFX9-GISEL-NEXT: s_lshl_b32 s6, s5, 6
-; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0xfff
+; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0xfff
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xfffff000
; GFX9-GISEL-NEXT: v_lshl_add_u32 v0, v0, 2, 15
-; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s6
+; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s6
; GFX9-GISEL-NEXT: v_and_b32_e32 v0, -16, v0
; GFX9-GISEL-NEXT: s_mov_b64 s[6:7], exec
; GFX9-GISEL-NEXT: .LBB6_2: ; =>This Inner Loop Header: Depth=1
@@ -690,7 +690,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 4
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s6, s7
+; GFX9-GISEL-NEXT: s_add_i32 s32, s6, s7
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: .LBB6_4: ; %bb.1
@@ -703,7 +703,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 2
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
-; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -777,14 +777,14 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX11-GISEL-NEXT: ; %bb.1: ; %bb.0
; GFX11-GISEL-NEXT: v_and_b32_e32 v0, 0x3ff, v0
; GFX11-GISEL-NEXT: s_lshl2_add_u32 s1, s1, 15
-; GFX11-GISEL-NEXT: s_add_u32 s3, s32, 0x7ff
+; GFX11-GISEL-NEXT: s_add_i32 s3, s32, 0x7ff
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, -16
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(SKIP_4) | instid1(VALU_DEP_1)
; GFX11-GISEL-NEXT: s_lshl_b32 s4, s1, 5
; GFX11-GISEL-NEXT: v_lshl_add_u32 v0, v0, 2, 15
; GFX11-GISEL-NEXT: s_and_b32 s1, s3, 0xfffff800
; GFX11-GISEL-NEXT: s_mov_b32 s3, exec_lo
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s4
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s4
; GFX11-GISEL-NEXT: v_and_b32_e32 v0, -16, v0
; GFX11-GISEL-NEXT: .LBB6_2: ; =>This Inner Loop Header: Depth=1
; GFX11-GISEL-NEXT: s_ctz_i32_b32 s4, s3
@@ -802,7 +802,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v1, s3 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_u32 s32, s3, s2
+; GFX11-GISEL-NEXT: s_add_i32 s32, s3, s2
; GFX11-GISEL-NEXT: .LBB6_4: ; %bb.1
; GFX11-GISEL-NEXT: s_lshl2_add_u32 s0, s0, 15
; GFX11-GISEL-NEXT: v_dual_mov_b32 v0, 1 :: v_dual_mov_b32 v1, 2
@@ -813,7 +813,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v1, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: s_endpgm
entry:
%cond = icmp eq i32 %n, 0
@@ -910,7 +910,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_control_flow(i32 %n, i
; GFX9-GISEL-NEXT: s_lshl_b32 s6, s8, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 1
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s6
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s6
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_mov_b32 s4, 0
@@ -920,13 +920,13 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_control_flow(i32 %n, i
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB7_6
; GFX9-GISEL-NEXT: ; %bb.5: ; %bb.0
; GFX9-GISEL-NEXT: s_lshl2_add_u32 s4, s5, 15
-; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0xfff
+; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0xfff
; GFX9-GISEL-NEXT: s_and_b32 s4, s4, -16
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xfffff000
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s4, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 2
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
-; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: .LBB7_6: ; %bb.2
@@ -1008,7 +1008,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_control_flow(i32 %n, i
; GFX11-GISEL-NEXT: s_mov_b32 s3, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s2, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s3, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s3, s0
; GFX11-GISEL-NEXT: s_mov_b32 s0, 0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s3 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
@@ -1020,13 +1020,13 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_control_flow(i32 %n, i
; GFX11-GISEL-NEXT: ; %bb.5: ; %bb.0
; GFX11-GISEL-NEXT: s_lshl2_add_u32 s0, s1, 15
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 2
-; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0x7ff
+; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0x7ff
; GFX11-GISEL-NEXT: s_and_b32 s0, s0, -16
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff800
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: .LBB7_6: ; %bb.2
; GFX11-GISEL-NEXT: s_endpgm
entry:
@@ -1095,7 +1095,7 @@ define void @test_dynamic_stackalloc_device_uniform(i32 %n) {
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x7b
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1156,7 +1156,7 @@ define void @test_dynamic_stackalloc_device_uniform(i32 %n) {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -1223,10 +1223,10 @@ define void @test_dynamic_stackalloc_device_uniform_over_aligned(i32 %n) {
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[4:5], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB9_1
; GFX9-GISEL-NEXT: ; %bb.2:
-; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0x1fff
+; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0x1fff
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s6, 6
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xffffe000
-; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 10
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1292,12 +1292,12 @@ define void @test_dynamic_stackalloc_device_uniform_over_aligned(i32 %n) {
; GFX11-GISEL-NEXT: s_cmp_lg_u32 s1, 0
; GFX11-GISEL-NEXT: s_cbranch_scc1 .LBB9_1
; GFX11-GISEL-NEXT: ; %bb.2:
-; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0xfff
+; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0xfff
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 10
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff000
; GFX11-GISEL-NEXT: s_mov_b32 s33, s4
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s34
@@ -1358,7 +1358,7 @@ define void @test_dynamic_stackalloc_device_uniform_under_aligned(i32 %n) {
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 22
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1419,7 +1419,7 @@ define void @test_dynamic_stackalloc_device_uniform_under_aligned(i32 %n) {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -1482,7 +1482,7 @@ define void @test_dynamic_stackalloc_device_divergent() {
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x7b
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1547,7 +1547,7 @@ define void @test_dynamic_stackalloc_device_divergent() {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -1617,10 +1617,10 @@ define void @test_dynamic_stackalloc_device_divergent_over_aligned() {
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[4:5], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB12_1
; GFX9-GISEL-NEXT: ; %bb.2:
-; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0x1fff
+; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0x1fff
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s6, 6
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xffffe000
-; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x1bc
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1689,12 +1689,12 @@ define void @test_dynamic_stackalloc_device_divergent_over_aligned() {
; GFX11-GISEL-NEXT: s_cmp_lg_u32 s1, 0
; GFX11-GISEL-NEXT: s_cbranch_scc1 .LBB12_1
; GFX11-GISEL-NEXT: ; %bb.2:
-; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0xfff
+; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0xfff
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 0x1bc
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff000
; GFX11-GISEL-NEXT: s_mov_b32 s33, s4
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s34
@@ -1758,7 +1758,7 @@ define void @test_dynamic_stackalloc_device_divergent_under_aligned() {
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x29a
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1823,7 +1823,7 @@ define void @test_dynamic_stackalloc_device_divergent_under_aligned() {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -1946,11 +1946,11 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[6:7], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB14_2
; GFX9-GISEL-NEXT: ; %bb.3:
-; GFX9-GISEL-NEXT: s_add_u32 s7, s32, 0xfff
+; GFX9-GISEL-NEXT: s_add_i32 s7, s32, 0xfff
; GFX9-GISEL-NEXT: s_lshl_b32 s6, s9, 6
; GFX9-GISEL-NEXT: s_and_b32 s9, s7, 0xfffff000
; GFX9-GISEL-NEXT: v_lshl_add_u32 v1, v2, 2, 15
-; GFX9-GISEL-NEXT: s_add_u32 s32, s9, s6
+; GFX9-GISEL-NEXT: s_add_i32 s32, s9, s6
; GFX9-GISEL-NEXT: v_and_b32_e32 v1, -16, v1
; GFX9-GISEL-NEXT: s_mov_b64 s[6:7], exec
; GFX9-GISEL-NEXT: s_mov_b32 s10, 0
@@ -1970,7 +1970,7 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, 4
; GFX9-GISEL-NEXT: v_mov_b32_e32 v2, s6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s6, s7
+; GFX9-GISEL-NEXT: s_add_i32 s32, s6, s7
; GFX9-GISEL-NEXT: buffer_store_dword v1, v2, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: .LBB14_6: ; %bb.1
@@ -1989,7 +1989,7 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s8, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 1
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, off, s[0:3], s33
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 2
@@ -2117,12 +2117,12 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: ; %bb.3:
; GFX11-GISEL-NEXT: v_lshl_add_u32 v1, v1, 2, 15
; GFX11-GISEL-NEXT: s_lshl_b32 s5, s2, 5
-; GFX11-GISEL-NEXT: s_add_u32 s2, s32, 0x7ff
+; GFX11-GISEL-NEXT: s_add_i32 s2, s32, 0x7ff
; GFX11-GISEL-NEXT: s_mov_b32 s4, exec_lo
; GFX11-GISEL-NEXT: s_and_b32 s2, s2, 0xfffff800
; GFX11-GISEL-NEXT: v_and_b32_e32 v1, -16, v1
; GFX11-GISEL-NEXT: s_mov_b32 s3, 0
-; GFX11-GISEL-NEXT: s_add_u32 s32, s2, s5
+; GFX11-GISEL-NEXT: s_add_i32 s32, s2, s5
; GFX11-GISEL-NEXT: .LBB14_4: ; =>This Inner Loop Header: Depth=1
; GFX11-GISEL-NEXT: s_ctz_i32_b32 s5, s4
; GFX11-GISEL-NEXT: s_delay_alu instid0(VALU_DEP_1) | instid1(SALU_CYCLE_1)
@@ -2139,7 +2139,7 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v2, s4 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_u32 s32, s4, s3
+; GFX11-GISEL-NEXT: s_add_i32 s32, s4, s3
; GFX11-GISEL-NEXT: .LBB14_6: ; %bb.1
; GFX11-GISEL-NEXT: s_or_b32 exec_lo, exec_lo, s1
; GFX11-GISEL-NEXT: v_lshl_add_u32 v0, v0, 2, 15
@@ -2162,7 +2162,7 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v1, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s34
; GFX11-GISEL-NEXT: s_mov_b32 s34, s8
; GFX11-GISEL-NEXT: s_mov_b32 s33, s7
@@ -2279,12 +2279,12 @@ define void @test_dynamic_stackalloc_device_control_flow(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[6:7], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB15_2
; GFX9-GISEL-NEXT: ; %bb.3:
-; GFX9-GISEL-NEXT: s_add_u32 s7, s32, 0xfff
+; GFX9-GISEL-NEXT: s_add_i32 s7, s32, 0xfff
; GFX9-GISEL-NEXT: s_and_b32 s7, s7, 0xfffff000
; GFX9-GISEL-NEXT: s_lshl_b32 s6, s8, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 2
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s7
-; GFX9-GISEL-NEXT: s_add_u32 s32, s7, s6
+; GFX9-GISEL-NEXT: s_add_i32 s32, s7, s6
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: ; implicit-def: $vgpr31
@@ -2309,7 +2309,7 @@ define void @test_dynamic_stackalloc_device_control_flow(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_lshl_b32 s7, s8, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 1
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s6, s7
+; GFX9-GISEL-NEXT: s_add_i32 s32, s6, s7
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: .LBB15_8: ; %bb.2
@@ -2418,12 +2418,12 @@ define void @test_dynamic_stackalloc_device_control_flow(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: s_cbranch_scc1 .LBB15_2
; GFX11-GISEL-NEXT: ; %bb.3:
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 2
-; GFX11-GISEL-NEXT: s_add_u32 s2, s32, 0x7ff
+; GFX11-GISEL-NEXT: s_add_i32 s2, s32, 0x7ff
; GFX11-GISEL-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-GISEL-NEXT: s_and_b32 s2, s2, 0xfffff800
; GFX11-GISEL-NEXT: ; implicit-def: $vgpr31
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s2, s1
+; GFX11-GISEL-NEXT: s_add_i32 s32, s2, s1
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s2 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: .LBB15_4: ; %Flow
@@ -2449,7 +2449,7 @@ define void @test_dynamic_stackalloc_device_control_flow(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: s_mov_b32 s2, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s2, s1
+; GFX11-GISEL-NEXT: s_add_i32 s32, s2, s1
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s2 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: .LBB15_8: ; %bb.2
@@ -2526,7 +2526,7 @@ define void @test_dynamic_stackalloc_device_divergent_non_standard_size_i16(i16
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x29a
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -2591,7 +2591,7 @@ define void @test_dynamic_stackalloc_device_divergent_non_standard_size_i16(i16
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -2652,7 +2652,7 @@ define void @test_dynamic_stackalloc_device_divergent_non_standard_size_i64(i64
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x29a
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -2713,7 +2713,7 @@ define void @test_dynamic_stackalloc_device_divergent_non_standard_size_i64(i64
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
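
The dynamic-alloca tests combine two uniform 32-bit adds: the over-alignment round-up ((SP + align-1) & -align) and the frame bump by the scaled allocation size; both now select to s_add_i32. The driving IR is already visible above; a minimal self-contained version (hypothetical name):

define amdgpu_kernel void @sketch_dynamic_alloca(i32 %n) {
  ; %n i32s over-aligned to 128 bytes; with the wave-scaled scratch offsets
  ; this yields the 0x1fff (wave64) and 0xfff (wave32) round-up constants
  ; checked above
  %alloca = alloca i32, i32 %n, align 128, addrspace(5)
  store volatile i32 10, ptr addrspace(5) %alloca
  ret void
}
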
diff --git a/llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll b/llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
index 737985c27c5d3..11bd0536307d7 100644
--- a/llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
+++ b/llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
@@ -17,27 +17,17 @@ define amdgpu_ps void @test_s_load_i8(ptr addrspace(4) inreg %in, ptr addrspace(
}
define amdgpu_ps void @test_s_load_i8_imm(ptr addrspace(4) inreg %in, ptr addrspace(1) %out) {
-; DAG-LABEL: test_s_load_i8_imm:
-; DAG: ; %bb.0:
-; DAG-NEXT: s_movk_i32 s2, 0xff9c
-; DAG-NEXT: s_mov_b32 s3, -1
-; DAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; DAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; DAG-NEXT: s_load_i8 s0, s[0:1], 0x0
-; DAG-NEXT: s_wait_kmcnt 0x0
-; DAG-NEXT: v_mov_b32_e32 v2, s0
-; DAG-NEXT: global_store_b32 v[0:1], v2, off
-; DAG-NEXT: s_endpgm
-;
-; GISEL-LABEL: test_s_load_i8_imm:
-; GISEL: ; %bb.0:
-; GISEL-NEXT: s_add_co_u32 s0, s0, 0xffffff9c
-; GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
-; GISEL-NEXT: s_load_i8 s0, s[0:1], 0x0
-; GISEL-NEXT: s_wait_kmcnt 0x0
-; GISEL-NEXT: v_mov_b32_e32 v2, s0
-; GISEL-NEXT: global_store_b32 v[0:1], v2, off
-; GISEL-NEXT: s_endpgm
+; GCN-LABEL: test_s_load_i8_imm:
+; GCN: ; %bb.0:
+; GCN-NEXT: s_movk_i32 s2, 0xff9c
+; GCN-NEXT: s_mov_b32 s3, -1
+; GCN-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GCN-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GCN-NEXT: s_load_i8 s0, s[0:1], 0x0
+; GCN-NEXT: s_wait_kmcnt 0x0
+; GCN-NEXT: v_mov_b32_e32 v2, s0
+; GCN-NEXT: global_store_b32 v[0:1], v2, off
+; GCN-NEXT: s_endpgm
%gep = getelementptr i8, ptr addrspace(4) %in, i64 -100
%ld = load i8, ptr addrspace(4) %gep
%sext = sext i8 %ld to i32
@@ -187,27 +177,17 @@ define amdgpu_ps void @test_s_load_i16(ptr addrspace(4) inreg %in, ptr addrspace
}
define amdgpu_ps void @test_s_load_i16_imm(ptr addrspace(4) inreg %in, ptr addrspace(1) %out) {
-; DAG-LABEL: test_s_load_i16_imm:
-; DAG: ; %bb.0:
-; DAG-NEXT: s_movk_i32 s2, 0xff38
-; DAG-NEXT: s_mov_b32 s3, -1
-; DAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; DAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; DAG-NEXT: s_load_i16 s0, s[0:1], 0x0
-; DAG-NEXT: s_wait_kmcnt 0x0
-; DAG-NEXT: v_mov_b32_e32 v2, s0
-; DAG-NEXT: global_store_b32 v[0:1], v2, off
-; DAG-NEXT: s_endpgm
-;
-; GISEL-LABEL: test_s_load_i16_imm:
-; GISEL: ; %bb.0:
-; GISEL-NEXT: s_add_co_u32 s0, s0, 0xffffff38
-; GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
-; GISEL-NEXT: s_load_i16 s0, s[0:1], 0x0
-; GISEL-NEXT: s_wait_kmcnt 0x0
-; GISEL-NEXT: v_mov_b32_e32 v2, s0
-; GISEL-NEXT: global_store_b32 v[0:1], v2, off
-; GISEL-NEXT: s_endpgm
+; GCN-LABEL: test_s_load_i16_imm:
+; GCN: ; %bb.0:
+; GCN-NEXT: s_movk_i32 s2, 0xff38
+; GCN-NEXT: s_mov_b32 s3, -1
+; GCN-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GCN-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GCN-NEXT: s_load_i16 s0, s[0:1], 0x0
+; GCN-NEXT: s_wait_kmcnt 0x0
+; GCN-NEXT: v_mov_b32_e32 v2, s0
+; GCN-NEXT: global_store_b32 v[0:1], v2, off
+; GCN-NEXT: s_endpgm
%gep = getelementptr i16, ptr addrspace(4) %in, i64 -100
%ld = load i16, ptr addrspace(4) %gep
%sext = sext i16 %ld to i32
@@ -232,30 +212,17 @@ define amdgpu_ps void @test_s_load_i16_sgpr(ptr addrspace(4) inreg %in, i32 inre
}
define amdgpu_ps void @test_s_load_i16_sgpr_imm(ptr addrspace(4) inreg %in, i32 inreg %offset, ptr addrspace(1) %out) {
-; DAG-LABEL: test_s_load_i16_sgpr_imm:
-; DAG: ; %bb.0:
-; DAG-NEXT: s_mov_b32 s3, 0
-; DAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; DAG-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
-; DAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; DAG-NEXT: s_load_i16 s0, s[0:1], 0x20
-; DAG-NEXT: s_wait_kmcnt 0x0
-; DAG-NEXT: v_mov_b32_e32 v2, s0
-; DAG-NEXT: global_store_b32 v[0:1], v2, off
-; DAG-NEXT: s_endpgm
-;
-; GISEL-LABEL: test_s_load_i16_sgpr_imm:
-; GISEL: ; %bb.0:
-; GISEL-NEXT: s_mov_b32 s3, 0
-; GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; GISEL-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
-; GISEL-NEXT: s_add_co_u32 s0, s0, s2
-; GISEL-NEXT: s_add_co_ci_u32 s1, s1, s3
-; GISEL-NEXT: s_load_i16 s0, s[0:1], 0x20
-; GISEL-NEXT: s_wait_kmcnt 0x0
-; GISEL-NEXT: v_mov_b32_e32 v2, s0
-; GISEL-NEXT: global_store_b32 v[0:1], v2, off
-; GISEL-NEXT: s_endpgm
+; GCN-LABEL: test_s_load_i16_sgpr_imm:
+; GCN: ; %bb.0:
+; GCN-NEXT: s_mov_b32 s3, 0
+; GCN-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GCN-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
+; GCN-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GCN-NEXT: s_load_i16 s0, s[0:1], 0x20
+; GCN-NEXT: s_wait_kmcnt 0x0
+; GCN-NEXT: v_mov_b32_e32 v2, s0
+; GCN-NEXT: global_store_b32 v[0:1], v2, off
+; GCN-NEXT: s_endpgm
%gep1 = getelementptr i16, ptr addrspace(4) %in, i64 16
%zext = zext i32 %offset to i64
%gep2 = getelementptr i16, ptr addrspace(4) %gep1, i64 %zext
@@ -348,30 +315,17 @@ define amdgpu_ps void @test_s_load_u16_sgpr(ptr addrspace(4) inreg %in, i32 inre
}
define amdgpu_ps void @test_s_load_u16_sgpr_imm(ptr addrspace(4) inreg %in, i32 inreg %offset, ptr addrspace(1) %out) {
-; DAG-LABEL: test_s_load_u16_sgpr_imm:
-; DAG: ; %bb.0:
-; DAG-NEXT: s_mov_b32 s3, 0
-; DAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; DAG-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
-; DAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; DAG-NEXT: s_load_u16 s0, s[0:1], 0x20
-; DAG-NEXT: s_wait_kmcnt 0x0
-; DAG-NEXT: v_mov_b32_e32 v2, s0
-; DAG-NEXT: global_store_b32 v[0:1], v2, off
-; DAG-NEXT: s_endpgm
-;
-; GISEL-LABEL: test_s_load_u16_sgpr_imm:
-; GISEL: ; %bb.0:
-; GISEL-NEXT: s_mov_b32 s3, 0
-; GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; GISEL-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
-; GISEL-NEXT: s_add_co_u32 s0, s0, s2
-; GISEL-NEXT: s_add_co_ci_u32 s1, s1, s3
-; GISEL-NEXT: s_load_u16 s0, s[0:1], 0x20
-; GISEL-NEXT: s_wait_kmcnt 0x0
-; GISEL-NEXT: v_mov_b32_e32 v2, s0
-; GISEL-NEXT: global_store_b32 v[0:1], v2, off
-; GISEL-NEXT: s_endpgm
+; GCN-LABEL: test_s_load_u16_sgpr_imm:
+; GCN: ; %bb.0:
+; GCN-NEXT: s_mov_b32 s3, 0
+; GCN-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GCN-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
+; GCN-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GCN-NEXT: s_load_u16 s0, s[0:1], 0x20
+; GCN-NEXT: s_wait_kmcnt 0x0
+; GCN-NEXT: v_mov_b32_e32 v2, s0
+; GCN-NEXT: global_store_b32 v[0:1], v2, off
+; GCN-NEXT: s_endpgm
%gep1 = getelementptr i16, ptr addrspace(4) %in, i64 16
%zext1= zext i32 %offset to i64
%gep2 = getelementptr i16, ptr addrspace(4) %gep1, i64 %zext1
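
In this file the DAG and GISEL bodies became identical, so the checks collapse to the shared GCN prefix. The addressing they both verify: the constant part of the index folds into the s_load immediate (16 x i16 = 0x20 bytes) while the variable part remains a shifted 64-bit add. Sketch (hypothetical name):

define amdgpu_ps void @sketch_sgpr_imm(ptr addrspace(4) inreg %in, i32 inreg %offset, ptr addrspace(1) %out) {
  %zext = zext i32 %offset to i64
  %gep1 = getelementptr i16, ptr addrspace(4) %in, i64 16      ; -> 0x20 immediate
  %gep2 = getelementptr i16, ptr addrspace(4) %gep1, i64 %zext ; -> s_lshl_b64 + s_add_nc_u64
  %ld = load i16, ptr addrspace(4) %gep2
  %sext = sext i16 %ld to i32
  store i32 %sext, ptr addrspace(1) %out
  ret void
}
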
diff --git a/llvm/test/CodeGen/AMDGPU/global-saddr-load.ll b/llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
index d588f0e0897b7..fbbb242a467b4 100644
--- a/llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
+++ b/llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
@@ -160,25 +160,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg4096(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg4096:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xf000
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg4096:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfffff000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_neg4096:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_movk_i32 s0, 0xf000
+; GFX12-NEXT: s_mov_b32 s1, -1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4096
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -214,25 +205,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg4097(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg4097:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xefff
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg4097:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xffffefff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_neg4097:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_movk_i32 s0, 0xefff
+; GFX12-NEXT: s_mov_b32 s1, -1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4097
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -268,25 +250,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg4098(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg4098:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xeffe
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg4098:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xffffeffe
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_neg4098:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_movk_i32 s0, 0xeffe
+; GFX12-NEXT: s_mov_b32 s1, -1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4098
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -418,25 +391,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg2048(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg2048:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xf800
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg2048:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfffff800
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_neg2048:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_movk_i32 s0, 0xf800
+; GFX12-NEXT: s_mov_b32 s1, -1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -2048
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -468,25 +432,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg2049(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg2049:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xf7ff
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg2049:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfffff7ff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_neg2049:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_movk_i32 s0, 0xf7ff
+; GFX12-NEXT: s_mov_b32 s1, -1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -2049
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -518,25 +473,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg2050(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg2050:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xf7fe
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg2050:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfffff7fe
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_neg2050:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_movk_i32 s0, 0xf7fe
+; GFX12-NEXT: s_mov_b32 s1, -1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -2050
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -606,25 +552,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0xFFFFFF(ptr addrspace(1) in
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_0xFFFFFF:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_mov_b32 s0, 0xff800000
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0xFFFFFF:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xff800000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_0xFFFFFF:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_mov_b32 s0, 0xff800000
+; GFX12-NEXT: s_mov_b32 s1, -1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -8388608
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -703,8 +640,10 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0x100000000(ptr addrspace(1)
;
; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0x100000000:
; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 1
+; GFX12-GISEL-NEXT: s_mov_b32 s0, 0
+; GFX12-GISEL-NEXT: s_mov_b32 s1, 1
+; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
@@ -756,8 +695,10 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0x100000001(ptr addrspace(1)
;
; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0x100000001:
; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 1
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 1
+; GFX12-GISEL-NEXT: s_mov_b32 s0, 1
+; GFX12-GISEL-NEXT: s_mov_b32 s1, 1
+; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
@@ -796,25 +737,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0x100000FFF(ptr addrspace(1)
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_0x100000FFF:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xfff
-; GFX12-SDAG-NEXT: s_mov_b32 s1, 1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0x100000FFF:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_0x100000FFF:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_movk_i32 s0, 0xfff
+; GFX12-NEXT: s_mov_b32 s1, 1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 4294971391
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -849,25 +781,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0x100001000(ptr addrspace(1)
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_0x100001000:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_movk_i32 s0, 0x1000
-; GFX12-SDAG-NEXT: s_mov_b32 s1, 1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0x100001000:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0x1000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_0x100001000:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_movk_i32 s0, 0x1000
+; GFX12-NEXT: s_mov_b32 s1, 1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 4294971392
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -903,25 +826,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg0xFFFFFFFF(ptr addrspace(
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg0xFFFFFFFF:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_mov_b32 s0, 1
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg0xFFFFFFFF:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 1
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_neg0xFFFFFFFF:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_mov_b32 s0, 1
+; GFX12-NEXT: s_mov_b32 s1, -1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4294967295
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -964,8 +878,10 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg0x100000000(ptr addrspace
;
; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg0x100000000:
; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_mov_b32 s0, 0
+; GFX12-GISEL-NEXT: s_mov_b32 s1, -1
+; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
@@ -1004,25 +920,16 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg0x100000001(ptr addrspace
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg0x100000001:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_mov_b32 s0, -1
-; GFX12-SDAG-NEXT: s_mov_b32 s1, -2
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg0x100000001:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, -1
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -2
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_offset_neg0x100000001:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_mov_b32 s0, -1
+; GFX12-NEXT: s_mov_b32 s1, -2
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4294967297
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -1807,29 +1714,18 @@ define amdgpu_ps float @global_load_saddr_i8_zext_uniform_offset_immoffset(ptr a
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-SDAG-LABEL: global_load_saddr_i8_zext_uniform_offset_immoffset:
-; GFX12-SDAG: ; %bb.0:
-; GFX12-SDAG-NEXT: s_mov_b32 s5, 0
-; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(SKIP_2) | instid1(SALU_CYCLE_1)
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[4:5]
-; GFX12-SDAG-NEXT: s_movk_i32 s2, 0xffe8
-; GFX12-SDAG-NEXT: s_mov_b32 s3, -1
-; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
-; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-SDAG-NEXT: ; return to shader part epilog
-;
-; GFX12-GISEL-LABEL: global_load_saddr_i8_zext_uniform_offset_immoffset:
-; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, s4
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xffffffe8
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
-; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-GISEL-NEXT: ; return to shader part epilog
+; GFX12-LABEL: global_load_saddr_i8_zext_uniform_offset_immoffset:
+; GFX12: ; %bb.0:
+; GFX12-NEXT: s_mov_b32 s5, 0
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(SKIP_2) | instid1(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[4:5]
+; GFX12-NEXT: s_movk_i32 s2, 0xffe8
+; GFX12-NEXT: s_mov_b32 s3, -1
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: ; return to shader part epilog
%zext.offset = zext i32 %soffset to i64
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 %zext.offset
%gep1 = getelementptr inbounds i8, ptr addrspace(1) %gep0, i64 -24
diff --git a/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll b/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll
index befe0d405307b..39f9c866f8131 100644
--- a/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll
+++ b/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll
@@ -1084,401 +1084,401 @@ define amdgpu_cs_chain void @amdgpu_cs_chain_many_regs(<36 x i32> inreg %a, <128
; GISEL-GFX11-NEXT: [[COPY165:%[0-9]+]]:vgpr_32 = COPY [[COPY32]]
; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY165]], [[COPY164]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 4
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_]], implicit-def dead $scc
; GISEL-GFX11-NEXT: [[COPY166:%[0-9]+]]:vgpr_32 = COPY [[COPY33]]
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY166]], [[S_ADD_U32_]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 4, addrspace 5)
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY166]], [[S_ADD_I32_]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 4, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 8
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_1:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_1]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_1:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_1]], implicit-def dead $scc
; GISEL-GFX11-NEXT: [[COPY167:%[0-9]+]]:vgpr_32 = COPY [[COPY34]]
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY167]], [[S_ADD_U32_1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 8, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY167]], [[S_ADD_I32_1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 8, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_2:%[0-9]+]]:sreg_32 = S_MOV_B32 12
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_2:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_2]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_2:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_2]], implicit-def dead $scc
; GISEL-GFX11-NEXT: [[COPY168:%[0-9]+]]:vgpr_32 = COPY [[COPY35]]
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY168]], [[S_ADD_U32_2]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 12, addrspace 5)
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY168]], [[S_ADD_I32_2]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 12, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_3:%[0-9]+]]:sreg_32 = S_MOV_B32 16
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_3:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_3]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY36]], [[S_ADD_U32_3]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 16, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_3:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_3]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY36]], [[S_ADD_I32_3]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 16, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_4:%[0-9]+]]:sreg_32 = S_MOV_B32 20
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_4:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_4]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY37]], [[S_ADD_U32_4]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 20, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_4:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_4]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY37]], [[S_ADD_I32_4]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 20, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_5:%[0-9]+]]:sreg_32 = S_MOV_B32 24
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_5:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_5]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY38]], [[S_ADD_U32_5]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 24, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_5:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_5]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY38]], [[S_ADD_I32_5]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 24, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_6:%[0-9]+]]:sreg_32 = S_MOV_B32 28
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_6:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_6]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY39]], [[S_ADD_U32_6]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 28, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_6:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_6]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY39]], [[S_ADD_I32_6]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 28, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_7:%[0-9]+]]:sreg_32 = S_MOV_B32 32
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_7:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_7]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY40]], [[S_ADD_U32_7]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 32, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_7:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_7]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY40]], [[S_ADD_I32_7]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 32, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_8:%[0-9]+]]:sreg_32 = S_MOV_B32 36
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_8:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_8]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY41]], [[S_ADD_U32_8]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 36, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_8:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_8]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY41]], [[S_ADD_I32_8]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 36, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_9:%[0-9]+]]:sreg_32 = S_MOV_B32 40
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_9:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_9]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY42]], [[S_ADD_U32_9]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 40, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_9:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_9]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY42]], [[S_ADD_I32_9]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 40, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_10:%[0-9]+]]:sreg_32 = S_MOV_B32 44
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_10:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_10]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY43]], [[S_ADD_U32_10]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 44, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_10:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_10]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY43]], [[S_ADD_I32_10]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 44, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_11:%[0-9]+]]:sreg_32 = S_MOV_B32 48
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_11:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_11]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY44]], [[S_ADD_U32_11]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 48, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_11:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_11]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY44]], [[S_ADD_I32_11]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 48, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_12:%[0-9]+]]:sreg_32 = S_MOV_B32 52
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_12:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_12]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY45]], [[S_ADD_U32_12]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 52, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_12:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_12]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY45]], [[S_ADD_I32_12]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 52, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_13:%[0-9]+]]:sreg_32 = S_MOV_B32 56
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_13:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_13]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY46]], [[S_ADD_U32_13]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 56, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_13:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_13]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY46]], [[S_ADD_I32_13]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 56, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_14:%[0-9]+]]:sreg_32 = S_MOV_B32 60
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_14:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_14]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY47]], [[S_ADD_U32_14]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 60, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_14:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_14]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY47]], [[S_ADD_I32_14]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 60, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_15:%[0-9]+]]:sreg_32 = S_MOV_B32 64
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_15:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_15]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY48]], [[S_ADD_U32_15]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 64, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_15:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_15]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY48]], [[S_ADD_I32_15]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 64, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_16:%[0-9]+]]:sreg_32 = S_MOV_B32 68
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_16:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_16]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY49]], [[S_ADD_U32_16]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 68, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_16:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_16]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY49]], [[S_ADD_I32_16]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 68, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_17:%[0-9]+]]:sreg_32 = S_MOV_B32 72
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_17:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_17]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY50]], [[S_ADD_U32_17]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 72, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_17:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_17]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY50]], [[S_ADD_I32_17]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 72, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_18:%[0-9]+]]:sreg_32 = S_MOV_B32 76
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_18:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_18]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY51]], [[S_ADD_U32_18]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 76, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_18:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_18]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY51]], [[S_ADD_I32_18]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 76, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_19:%[0-9]+]]:sreg_32 = S_MOV_B32 80
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_19:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_19]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY52]], [[S_ADD_U32_19]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 80, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_19:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_19]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY52]], [[S_ADD_I32_19]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 80, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_20:%[0-9]+]]:sreg_32 = S_MOV_B32 84
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_20:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_20]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY53]], [[S_ADD_U32_20]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 84, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_20:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_20]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY53]], [[S_ADD_I32_20]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 84, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_21:%[0-9]+]]:sreg_32 = S_MOV_B32 88
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_21:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_21]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY54]], [[S_ADD_U32_21]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 88, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_21:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_21]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY54]], [[S_ADD_I32_21]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 88, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_22:%[0-9]+]]:sreg_32 = S_MOV_B32 92
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_22:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_22]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY55]], [[S_ADD_U32_22]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 92, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_22:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_22]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY55]], [[S_ADD_I32_22]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 92, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_23:%[0-9]+]]:sreg_32 = S_MOV_B32 96
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_23:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_23]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY56]], [[S_ADD_U32_23]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 96, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_23:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_23]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY56]], [[S_ADD_I32_23]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 96, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_24:%[0-9]+]]:sreg_32 = S_MOV_B32 100
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_24:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_24]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY57]], [[S_ADD_U32_24]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 100, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_24:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_24]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY57]], [[S_ADD_I32_24]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 100, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_25:%[0-9]+]]:sreg_32 = S_MOV_B32 104
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_25:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_25]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY58]], [[S_ADD_U32_25]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 104, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_25:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_25]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY58]], [[S_ADD_I32_25]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 104, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_26:%[0-9]+]]:sreg_32 = S_MOV_B32 108
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_26:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_26]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY59]], [[S_ADD_U32_26]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 108, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_26:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_26]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY59]], [[S_ADD_I32_26]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 108, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_27:%[0-9]+]]:sreg_32 = S_MOV_B32 112
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_27:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_27]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY60]], [[S_ADD_U32_27]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 112, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_27:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_27]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY60]], [[S_ADD_I32_27]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 112, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_28:%[0-9]+]]:sreg_32 = S_MOV_B32 116
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_28:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_28]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY61]], [[S_ADD_U32_28]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 116, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_28:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_28]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY61]], [[S_ADD_I32_28]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 116, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_29:%[0-9]+]]:sreg_32 = S_MOV_B32 120
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_29:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_29]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY62]], [[S_ADD_U32_29]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 120, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_29:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_29]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY62]], [[S_ADD_I32_29]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 120, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_30:%[0-9]+]]:sreg_32 = S_MOV_B32 124
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_30:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_30]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY63]], [[S_ADD_U32_30]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 124, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_30:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_30]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY63]], [[S_ADD_I32_30]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 124, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_31:%[0-9]+]]:sreg_32 = S_MOV_B32 128
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_31:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_31]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY64]], [[S_ADD_U32_31]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 128, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_31:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_31]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY64]], [[S_ADD_I32_31]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 128, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_32:%[0-9]+]]:sreg_32 = S_MOV_B32 132
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_32:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_32]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY65]], [[S_ADD_U32_32]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 132, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_32:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_32]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY65]], [[S_ADD_I32_32]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 132, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_33:%[0-9]+]]:sreg_32 = S_MOV_B32 136
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_33:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_33]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY66]], [[S_ADD_U32_33]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 136, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_33:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_33]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY66]], [[S_ADD_I32_33]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 136, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_34:%[0-9]+]]:sreg_32 = S_MOV_B32 140
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_34:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_34]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY67]], [[S_ADD_U32_34]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 140, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_34:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_34]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY67]], [[S_ADD_I32_34]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 140, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_35:%[0-9]+]]:sreg_32 = S_MOV_B32 144
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_35:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_35]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY68]], [[S_ADD_U32_35]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 144, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_35:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_35]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY68]], [[S_ADD_I32_35]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 144, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_36:%[0-9]+]]:sreg_32 = S_MOV_B32 148
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_36:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_36]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY69]], [[S_ADD_U32_36]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 148, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_36:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_36]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY69]], [[S_ADD_I32_36]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 148, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_37:%[0-9]+]]:sreg_32 = S_MOV_B32 152
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_37:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_37]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY70]], [[S_ADD_U32_37]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 152, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_37:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_37]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY70]], [[S_ADD_I32_37]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 152, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_38:%[0-9]+]]:sreg_32 = S_MOV_B32 156
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_38:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_38]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY71]], [[S_ADD_U32_38]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 156, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_38:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_38]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY71]], [[S_ADD_I32_38]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 156, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_39:%[0-9]+]]:sreg_32 = S_MOV_B32 160
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_39:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_39]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY72]], [[S_ADD_U32_39]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 160, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_39:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_39]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY72]], [[S_ADD_I32_39]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 160, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_40:%[0-9]+]]:sreg_32 = S_MOV_B32 164
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_40:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_40]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY73]], [[S_ADD_U32_40]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 164, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_40:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_40]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY73]], [[S_ADD_I32_40]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 164, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_41:%[0-9]+]]:sreg_32 = S_MOV_B32 168
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_41:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_41]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY74]], [[S_ADD_U32_41]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 168, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_41:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_41]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY74]], [[S_ADD_I32_41]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 168, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_42:%[0-9]+]]:sreg_32 = S_MOV_B32 172
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_42:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_42]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY75]], [[S_ADD_U32_42]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 172, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_42:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_42]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY75]], [[S_ADD_I32_42]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 172, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_43:%[0-9]+]]:sreg_32 = S_MOV_B32 176
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_43:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_43]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY76]], [[S_ADD_U32_43]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 176, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_43:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_43]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY76]], [[S_ADD_I32_43]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 176, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_44:%[0-9]+]]:sreg_32 = S_MOV_B32 180
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_44:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_44]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY77]], [[S_ADD_U32_44]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 180, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_44:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_44]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY77]], [[S_ADD_I32_44]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 180, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_45:%[0-9]+]]:sreg_32 = S_MOV_B32 184
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_45:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_45]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY78]], [[S_ADD_U32_45]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 184, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_45:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_45]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY78]], [[S_ADD_I32_45]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 184, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_46:%[0-9]+]]:sreg_32 = S_MOV_B32 188
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_46:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_46]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY79]], [[S_ADD_U32_46]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 188, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_46:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_46]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY79]], [[S_ADD_I32_46]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 188, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_47:%[0-9]+]]:sreg_32 = S_MOV_B32 192
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_47:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_47]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY80]], [[S_ADD_U32_47]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 192, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_47:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_47]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY80]], [[S_ADD_I32_47]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 192, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_48:%[0-9]+]]:sreg_32 = S_MOV_B32 196
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_48:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_48]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY81]], [[S_ADD_U32_48]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 196, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_48:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_48]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY81]], [[S_ADD_I32_48]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 196, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_49:%[0-9]+]]:sreg_32 = S_MOV_B32 200
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_49:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_49]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY82]], [[S_ADD_U32_49]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 200, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_49:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_49]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY82]], [[S_ADD_I32_49]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 200, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_50:%[0-9]+]]:sreg_32 = S_MOV_B32 204
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_50:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_50]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY83]], [[S_ADD_U32_50]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 204, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_50:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_50]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY83]], [[S_ADD_I32_50]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 204, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_51:%[0-9]+]]:sreg_32 = S_MOV_B32 208
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_51:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_51]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY84]], [[S_ADD_U32_51]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 208, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_51:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_51]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY84]], [[S_ADD_I32_51]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 208, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_52:%[0-9]+]]:sreg_32 = S_MOV_B32 212
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_52:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_52]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY85]], [[S_ADD_U32_52]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 212, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_52:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_52]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY85]], [[S_ADD_I32_52]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 212, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_53:%[0-9]+]]:sreg_32 = S_MOV_B32 216
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_53:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_53]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY86]], [[S_ADD_U32_53]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 216, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_53:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_53]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY86]], [[S_ADD_I32_53]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 216, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_54:%[0-9]+]]:sreg_32 = S_MOV_B32 220
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_54:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_54]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY87]], [[S_ADD_U32_54]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 220, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_54:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_54]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY87]], [[S_ADD_I32_54]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 220, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_55:%[0-9]+]]:sreg_32 = S_MOV_B32 224
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_55:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_55]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY88]], [[S_ADD_U32_55]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 224, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_55:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_55]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY88]], [[S_ADD_I32_55]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 224, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_56:%[0-9]+]]:sreg_32 = S_MOV_B32 228
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_56:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_56]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY89]], [[S_ADD_U32_56]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 228, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_56:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_56]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY89]], [[S_ADD_I32_56]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 228, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_57:%[0-9]+]]:sreg_32 = S_MOV_B32 232
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_57:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_57]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY90]], [[S_ADD_U32_57]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 232, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_57:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_57]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY90]], [[S_ADD_I32_57]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 232, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_58:%[0-9]+]]:sreg_32 = S_MOV_B32 236
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_58:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_58]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY91]], [[S_ADD_U32_58]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 236, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_58:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_58]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY91]], [[S_ADD_I32_58]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 236, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_59:%[0-9]+]]:sreg_32 = S_MOV_B32 240
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_59:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_59]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY92]], [[S_ADD_U32_59]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 240, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_59:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_59]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY92]], [[S_ADD_I32_59]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 240, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_60:%[0-9]+]]:sreg_32 = S_MOV_B32 244
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_60:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_60]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY93]], [[S_ADD_U32_60]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 244, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_60:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_60]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY93]], [[S_ADD_I32_60]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 244, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_61:%[0-9]+]]:sreg_32 = S_MOV_B32 248
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_61:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_61]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY94]], [[S_ADD_U32_61]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 248, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_61:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_61]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY94]], [[S_ADD_I32_61]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 248, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_62:%[0-9]+]]:sreg_32 = S_MOV_B32 252
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_62:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_62]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY95]], [[S_ADD_U32_62]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 252, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_62:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_62]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY95]], [[S_ADD_I32_62]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 252, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_63:%[0-9]+]]:sreg_32 = S_MOV_B32 256
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_63:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_63]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY96]], [[S_ADD_U32_63]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 256, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_63:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_63]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY96]], [[S_ADD_I32_63]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 256, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_64:%[0-9]+]]:sreg_32 = S_MOV_B32 260
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_64:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_64]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY97]], [[S_ADD_U32_64]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 260, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_64:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_64]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY97]], [[S_ADD_I32_64]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 260, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_65:%[0-9]+]]:sreg_32 = S_MOV_B32 264
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_65:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_65]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY98]], [[S_ADD_U32_65]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 264, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_65:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_65]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY98]], [[S_ADD_I32_65]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 264, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_66:%[0-9]+]]:sreg_32 = S_MOV_B32 268
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_66:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_66]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY99]], [[S_ADD_U32_66]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 268, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_66:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_66]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY99]], [[S_ADD_I32_66]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 268, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_67:%[0-9]+]]:sreg_32 = S_MOV_B32 272
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_67:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_67]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY100]], [[S_ADD_U32_67]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 272, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_67:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_67]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY100]], [[S_ADD_I32_67]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 272, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_68:%[0-9]+]]:sreg_32 = S_MOV_B32 276
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_68:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_68]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY101]], [[S_ADD_U32_68]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 276, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_68:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_68]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY101]], [[S_ADD_I32_68]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 276, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_69:%[0-9]+]]:sreg_32 = S_MOV_B32 280
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_69:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_69]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY102]], [[S_ADD_U32_69]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 280, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_69:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_69]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY102]], [[S_ADD_I32_69]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 280, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_70:%[0-9]+]]:sreg_32 = S_MOV_B32 284
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_70:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_70]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY103]], [[S_ADD_U32_70]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 284, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_70:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_70]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY103]], [[S_ADD_I32_70]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 284, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_71:%[0-9]+]]:sreg_32 = S_MOV_B32 288
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_71:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_71]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY104]], [[S_ADD_U32_71]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 288, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_71:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_71]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY104]], [[S_ADD_I32_71]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 288, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_72:%[0-9]+]]:sreg_32 = S_MOV_B32 292
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_72:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_72]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY105]], [[S_ADD_U32_72]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 292, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_72:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_72]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY105]], [[S_ADD_I32_72]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 292, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_73:%[0-9]+]]:sreg_32 = S_MOV_B32 296
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_73:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_73]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY106]], [[S_ADD_U32_73]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 296, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_73:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_73]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY106]], [[S_ADD_I32_73]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 296, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_74:%[0-9]+]]:sreg_32 = S_MOV_B32 300
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_74:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_74]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY107]], [[S_ADD_U32_74]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 300, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_74:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_74]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY107]], [[S_ADD_I32_74]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 300, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_75:%[0-9]+]]:sreg_32 = S_MOV_B32 304
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_75:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_75]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY108]], [[S_ADD_U32_75]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 304, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_75:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_75]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY108]], [[S_ADD_I32_75]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 304, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_76:%[0-9]+]]:sreg_32 = S_MOV_B32 308
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_76:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_76]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY109]], [[S_ADD_U32_76]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 308, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_76:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_76]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY109]], [[S_ADD_I32_76]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 308, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_77:%[0-9]+]]:sreg_32 = S_MOV_B32 312
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_77:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_77]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY110]], [[S_ADD_U32_77]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 312, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_77:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_77]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY110]], [[S_ADD_I32_77]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 312, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_78:%[0-9]+]]:sreg_32 = S_MOV_B32 316
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_78:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_78]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY111]], [[S_ADD_U32_78]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 316, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_78:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_78]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY111]], [[S_ADD_I32_78]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 316, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_79:%[0-9]+]]:sreg_32 = S_MOV_B32 320
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_79:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_79]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY112]], [[S_ADD_U32_79]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 320, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_79:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_79]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY112]], [[S_ADD_I32_79]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 320, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_80:%[0-9]+]]:sreg_32 = S_MOV_B32 324
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_80:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_80]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY113]], [[S_ADD_U32_80]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 324, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_80:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_80]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY113]], [[S_ADD_I32_80]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 324, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_81:%[0-9]+]]:sreg_32 = S_MOV_B32 328
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_81:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_81]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY114]], [[S_ADD_U32_81]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 328, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_81:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_81]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY114]], [[S_ADD_I32_81]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 328, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_82:%[0-9]+]]:sreg_32 = S_MOV_B32 332
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_82:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_82]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY115]], [[S_ADD_U32_82]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 332, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_82:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_82]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY115]], [[S_ADD_I32_82]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 332, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_83:%[0-9]+]]:sreg_32 = S_MOV_B32 336
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_83:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_83]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY116]], [[S_ADD_U32_83]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 336, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_83:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_83]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY116]], [[S_ADD_I32_83]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 336, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_84:%[0-9]+]]:sreg_32 = S_MOV_B32 340
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_84:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_84]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY117]], [[S_ADD_U32_84]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 340, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_84:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_84]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY117]], [[S_ADD_I32_84]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 340, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_85:%[0-9]+]]:sreg_32 = S_MOV_B32 344
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_85:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_85]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY118]], [[S_ADD_U32_85]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 344, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_85:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_85]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY118]], [[S_ADD_I32_85]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 344, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_86:%[0-9]+]]:sreg_32 = S_MOV_B32 348
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_86:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_86]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY119]], [[S_ADD_U32_86]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 348, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_86:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_86]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY119]], [[S_ADD_I32_86]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 348, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_87:%[0-9]+]]:sreg_32 = S_MOV_B32 352
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_87:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_87]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY120]], [[S_ADD_U32_87]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 352, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_87:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_87]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY120]], [[S_ADD_I32_87]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 352, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_88:%[0-9]+]]:sreg_32 = S_MOV_B32 356
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_88:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_88]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY121]], [[S_ADD_U32_88]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 356, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_88:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_88]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY121]], [[S_ADD_I32_88]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 356, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_89:%[0-9]+]]:sreg_32 = S_MOV_B32 360
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_89:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_89]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY122]], [[S_ADD_U32_89]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 360, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_89:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_89]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY122]], [[S_ADD_I32_89]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 360, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_90:%[0-9]+]]:sreg_32 = S_MOV_B32 364
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_90:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_90]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY123]], [[S_ADD_U32_90]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 364, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_90:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_90]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY123]], [[S_ADD_I32_90]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 364, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_91:%[0-9]+]]:sreg_32 = S_MOV_B32 368
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_91:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_91]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY124]], [[S_ADD_U32_91]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 368, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_91:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_91]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY124]], [[S_ADD_I32_91]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 368, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_92:%[0-9]+]]:sreg_32 = S_MOV_B32 372
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_92:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_92]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY125]], [[S_ADD_U32_92]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 372, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_92:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_92]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY125]], [[S_ADD_I32_92]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 372, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_93:%[0-9]+]]:sreg_32 = S_MOV_B32 376
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_93:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_93]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY126]], [[S_ADD_U32_93]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 376, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_93:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_93]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY126]], [[S_ADD_I32_93]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 376, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_94:%[0-9]+]]:sreg_32 = S_MOV_B32 380
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_94:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_94]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY127]], [[S_ADD_U32_94]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 380, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_94:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_94]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY127]], [[S_ADD_I32_94]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 380, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_95:%[0-9]+]]:sreg_32 = S_MOV_B32 384
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_95:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_95]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY128]], [[S_ADD_U32_95]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 384, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_95:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_95]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY128]], [[S_ADD_I32_95]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 384, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_96:%[0-9]+]]:sreg_32 = S_MOV_B32 388
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_96:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_96]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY129]], [[S_ADD_U32_96]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 388, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_96:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_96]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY129]], [[S_ADD_I32_96]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 388, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_97:%[0-9]+]]:sreg_32 = S_MOV_B32 392
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_97:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_97]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY130]], [[S_ADD_U32_97]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 392, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_97:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_97]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY130]], [[S_ADD_I32_97]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 392, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_98:%[0-9]+]]:sreg_32 = S_MOV_B32 396
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_98:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_98]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY131]], [[S_ADD_U32_98]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 396, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_98:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_98]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY131]], [[S_ADD_I32_98]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 396, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_99:%[0-9]+]]:sreg_32 = S_MOV_B32 400
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_99:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_99]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY132]], [[S_ADD_U32_99]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 400, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_99:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_99]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY132]], [[S_ADD_I32_99]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 400, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_100:%[0-9]+]]:sreg_32 = S_MOV_B32 404
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_100:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_100]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY133]], [[S_ADD_U32_100]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 404, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_100:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_100]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY133]], [[S_ADD_I32_100]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 404, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_101:%[0-9]+]]:sreg_32 = S_MOV_B32 408
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_101:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_101]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY134]], [[S_ADD_U32_101]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 408, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_101:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_101]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY134]], [[S_ADD_I32_101]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 408, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_102:%[0-9]+]]:sreg_32 = S_MOV_B32 412
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_102:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_102]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY135]], [[S_ADD_U32_102]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 412, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_102:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_102]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY135]], [[S_ADD_I32_102]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 412, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_103:%[0-9]+]]:sreg_32 = S_MOV_B32 416
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_103:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_103]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY136]], [[S_ADD_U32_103]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 416, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_103:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_103]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY136]], [[S_ADD_I32_103]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 416, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_104:%[0-9]+]]:sreg_32 = S_MOV_B32 420
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_104:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_104]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY137]], [[S_ADD_U32_104]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 420, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_104:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_104]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY137]], [[S_ADD_I32_104]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 420, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_105:%[0-9]+]]:sreg_32 = S_MOV_B32 424
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_105:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_105]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY138]], [[S_ADD_U32_105]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 424, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_105:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_105]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY138]], [[S_ADD_I32_105]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 424, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_106:%[0-9]+]]:sreg_32 = S_MOV_B32 428
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_106:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_106]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY139]], [[S_ADD_U32_106]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 428, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_106:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_106]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY139]], [[S_ADD_I32_106]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 428, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_107:%[0-9]+]]:sreg_32 = S_MOV_B32 432
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_107:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_107]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY140]], [[S_ADD_U32_107]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 432, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_107:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_107]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY140]], [[S_ADD_I32_107]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 432, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_108:%[0-9]+]]:sreg_32 = S_MOV_B32 436
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_108:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_108]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY141]], [[S_ADD_U32_108]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 436, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_108:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_108]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY141]], [[S_ADD_I32_108]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 436, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_109:%[0-9]+]]:sreg_32 = S_MOV_B32 440
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_109:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_109]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY142]], [[S_ADD_U32_109]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 440, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_109:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_109]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY142]], [[S_ADD_I32_109]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 440, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_110:%[0-9]+]]:sreg_32 = S_MOV_B32 444
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_110:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_110]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY143]], [[S_ADD_U32_110]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 444, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_110:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_110]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY143]], [[S_ADD_I32_110]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 444, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_111:%[0-9]+]]:sreg_32 = S_MOV_B32 448
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_111:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_111]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY144]], [[S_ADD_U32_111]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 448, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_111:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_111]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY144]], [[S_ADD_I32_111]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 448, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_112:%[0-9]+]]:sreg_32 = S_MOV_B32 452
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_112:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_112]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY145]], [[S_ADD_U32_112]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 452, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_112:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_112]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY145]], [[S_ADD_I32_112]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 452, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_113:%[0-9]+]]:sreg_32 = S_MOV_B32 456
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_113:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_113]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY146]], [[S_ADD_U32_113]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 456, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_113:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_113]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY146]], [[S_ADD_I32_113]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 456, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_114:%[0-9]+]]:sreg_32 = S_MOV_B32 460
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_114:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_114]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY147]], [[S_ADD_U32_114]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 460, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_114:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_114]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY147]], [[S_ADD_I32_114]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 460, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_115:%[0-9]+]]:sreg_32 = S_MOV_B32 464
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_115:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_115]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY148]], [[S_ADD_U32_115]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 464, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_115:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_115]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY148]], [[S_ADD_I32_115]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 464, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_116:%[0-9]+]]:sreg_32 = S_MOV_B32 468
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_116:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_116]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY149]], [[S_ADD_U32_116]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 468, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_116:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_116]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY149]], [[S_ADD_I32_116]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 468, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_117:%[0-9]+]]:sreg_32 = S_MOV_B32 472
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_117:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_117]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY150]], [[S_ADD_U32_117]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 472, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_117:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_117]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY150]], [[S_ADD_I32_117]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 472, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_118:%[0-9]+]]:sreg_32 = S_MOV_B32 476
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_118:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_118]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY151]], [[S_ADD_U32_118]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 476, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_118:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_118]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY151]], [[S_ADD_I32_118]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 476, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_119:%[0-9]+]]:sreg_32 = S_MOV_B32 480
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_119:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_119]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY152]], [[S_ADD_U32_119]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 480, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_119:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_119]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY152]], [[S_ADD_I32_119]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 480, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_120:%[0-9]+]]:sreg_32 = S_MOV_B32 484
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_120:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_120]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY153]], [[S_ADD_U32_120]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 484, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_120:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_120]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY153]], [[S_ADD_I32_120]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 484, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_121:%[0-9]+]]:sreg_32 = S_MOV_B32 488
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_121:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_121]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY154]], [[S_ADD_U32_121]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 488, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_121:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_121]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY154]], [[S_ADD_I32_121]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 488, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_122:%[0-9]+]]:sreg_32 = S_MOV_B32 492
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_122:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_122]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY155]], [[S_ADD_U32_122]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 492, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_122:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_122]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY155]], [[S_ADD_I32_122]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 492, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_123:%[0-9]+]]:sreg_32 = S_MOV_B32 496
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_123:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_123]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY156]], [[S_ADD_U32_123]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 496, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_123:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_123]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY156]], [[S_ADD_I32_123]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 496, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_124:%[0-9]+]]:sreg_32 = S_MOV_B32 500
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_124:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_124]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY157]], [[S_ADD_U32_124]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 500, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_124:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_124]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY157]], [[S_ADD_I32_124]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 500, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_125:%[0-9]+]]:sreg_32 = S_MOV_B32 504
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_125:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_125]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY158]], [[S_ADD_U32_125]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 504, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_125:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_125]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY158]], [[S_ADD_I32_125]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 504, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_126:%[0-9]+]]:sreg_32 = S_MOV_B32 508
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_126:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_126]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY159]], [[S_ADD_U32_126]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 508, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_126:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_126]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY159]], [[S_ADD_I32_126]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 508, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_127:%[0-9]+]]:sreg_32 = S_MOV_B32 512
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_127:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_127]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY160]], [[S_ADD_U32_127]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 512, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_127:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_127]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY160]], [[S_ADD_I32_127]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 512, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_128:%[0-9]+]]:sreg_32 = S_MOV_B32 516
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_128:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_128]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY161]], [[S_ADD_U32_128]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 516, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_128:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_128]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY161]], [[S_ADD_I32_128]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 516, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_129:%[0-9]+]]:sreg_32 = S_MOV_B32 520
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_129:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_129]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY162]], [[S_ADD_U32_129]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 520, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_129:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_129]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY162]], [[S_ADD_I32_129]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 520, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_130:%[0-9]+]]:sreg_32 = S_MOV_B32 524
- ; GISEL-GFX11-NEXT: [[S_ADD_U32_130:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_130]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY163]], [[S_ADD_U32_130]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 524, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_I32_130:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_130]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY163]], [[S_ADD_I32_130]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 524, addrspace 5)
; GISEL-GFX11-NEXT: $vgpr0 = COPY [[COPY]]
; GISEL-GFX11-NEXT: $vgpr1 = COPY [[COPY1]]
; GISEL-GFX11-NEXT: $vgpr2 = COPY [[COPY2]]
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll b/llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll
index 72260e0b99715..b5657b81d3192 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll
@@ -59,21 +59,14 @@ define amdgpu_ps void @prefetch_data_sgpr_min_offset(ptr addrspace(4) inreg %ptr
; NOSPREFETCH: ; %bb.0: ; %entry
; NOSPREFETCH-NEXT: s_endpgm
;
-; SPREFETCH-SDAG-LABEL: prefetch_data_sgpr_min_offset:
-; SPREFETCH-SDAG: ; %bb.0: ; %entry
-; SPREFETCH-SDAG-NEXT: s_mov_b32 s2, 0xff800000
-; SPREFETCH-SDAG-NEXT: s_mov_b32 s3, -1
-; SPREFETCH-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; SPREFETCH-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; SPREFETCH-SDAG-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
-; SPREFETCH-SDAG-NEXT: s_endpgm
-;
-; SPREFETCH-GISEL-LABEL: prefetch_data_sgpr_min_offset:
-; SPREFETCH-GISEL: ; %bb.0: ; %entry
-; SPREFETCH-GISEL-NEXT: s_add_co_u32 s0, s0, 0xff800000
-; SPREFETCH-GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
-; SPREFETCH-GISEL-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
-; SPREFETCH-GISEL-NEXT: s_endpgm
+; SPREFETCH-LABEL: prefetch_data_sgpr_min_offset:
+; SPREFETCH: ; %bb.0: ; %entry
+; SPREFETCH-NEXT: s_mov_b32 s2, 0xff800000
+; SPREFETCH-NEXT: s_mov_b32 s3, -1
+; SPREFETCH-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; SPREFETCH-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; SPREFETCH-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
+; SPREFETCH-NEXT: s_endpgm
entry:
%gep = getelementptr i8, ptr addrspace(4) %ptr, i32 -8388608
tail call void @llvm.prefetch.p4(ptr addrspace(4) %gep, i32 0, i32 0, i32 1)
@@ -85,18 +78,11 @@ define amdgpu_ps void @prefetch_data_sgpr_too_large_offset(ptr addrspace(4) inre
; NOSPREFETCH: ; %bb.0: ; %entry
; NOSPREFETCH-NEXT: s_endpgm
;
-; SPREFETCH-SDAG-LABEL: prefetch_data_sgpr_too_large_offset:
-; SPREFETCH-SDAG: ; %bb.0: ; %entry
-; SPREFETCH-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], 0x800000
-; SPREFETCH-SDAG-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
-; SPREFETCH-SDAG-NEXT: s_endpgm
-;
-; SPREFETCH-GISEL-LABEL: prefetch_data_sgpr_too_large_offset:
-; SPREFETCH-GISEL: ; %bb.0: ; %entry
-; SPREFETCH-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800000
-; SPREFETCH-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0
-; SPREFETCH-GISEL-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
-; SPREFETCH-GISEL-NEXT: s_endpgm
+; SPREFETCH-LABEL: prefetch_data_sgpr_too_large_offset:
+; SPREFETCH: ; %bb.0: ; %entry
+; SPREFETCH-NEXT: s_add_nc_u64 s[0:1], s[0:1], 0x800000
+; SPREFETCH-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
+; SPREFETCH-NEXT: s_endpgm
entry:
%gep = getelementptr i8, ptr addrspace(4) %ptr, i32 8388608
tail call void @llvm.prefetch.p4(ptr addrspace(4) %gep, i32 0, i32 0, i32 1)
@@ -232,21 +218,14 @@ define amdgpu_ps void @prefetch_inst_sgpr_min_offset(ptr addrspace(4) inreg %ptr
; NOSPREFETCH: ; %bb.0: ; %entry
; NOSPREFETCH-NEXT: s_endpgm
;
-; SPREFETCH-SDAG-LABEL: prefetch_inst_sgpr_min_offset:
-; SPREFETCH-SDAG: ; %bb.0: ; %entry
-; SPREFETCH-SDAG-NEXT: s_mov_b32 s2, 0xff800000
-; SPREFETCH-SDAG-NEXT: s_mov_b32 s3, -1
-; SPREFETCH-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; SPREFETCH-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; SPREFETCH-SDAG-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
-; SPREFETCH-SDAG-NEXT: s_endpgm
-;
-; SPREFETCH-GISEL-LABEL: prefetch_inst_sgpr_min_offset:
-; SPREFETCH-GISEL: ; %bb.0: ; %entry
-; SPREFETCH-GISEL-NEXT: s_add_co_u32 s0, s0, 0xff800000
-; SPREFETCH-GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
-; SPREFETCH-GISEL-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
-; SPREFETCH-GISEL-NEXT: s_endpgm
+; SPREFETCH-LABEL: prefetch_inst_sgpr_min_offset:
+; SPREFETCH: ; %bb.0: ; %entry
+; SPREFETCH-NEXT: s_mov_b32 s2, 0xff800000
+; SPREFETCH-NEXT: s_mov_b32 s3, -1
+; SPREFETCH-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; SPREFETCH-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; SPREFETCH-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
+; SPREFETCH-NEXT: s_endpgm
entry:
%gep = getelementptr i8, ptr addrspace(4) %ptr, i32 -8388608
tail call void @llvm.prefetch.p4(ptr addrspace(4) %gep, i32 0, i32 0, i32 0)
@@ -258,18 +237,11 @@ define amdgpu_ps void @prefetch_inst_sgpr_too_large_offset(ptr addrspace(4) inre
; NOSPREFETCH: ; %bb.0: ; %entry
; NOSPREFETCH-NEXT: s_endpgm
;
-; SPREFETCH-SDAG-LABEL: prefetch_inst_sgpr_too_large_offset:
-; SPREFETCH-SDAG: ; %bb.0: ; %entry
-; SPREFETCH-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], 0x800000
-; SPREFETCH-SDAG-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
-; SPREFETCH-SDAG-NEXT: s_endpgm
-;
-; SPREFETCH-GISEL-LABEL: prefetch_inst_sgpr_too_large_offset:
-; SPREFETCH-GISEL: ; %bb.0: ; %entry
-; SPREFETCH-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800000
-; SPREFETCH-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0
-; SPREFETCH-GISEL-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
-; SPREFETCH-GISEL-NEXT: s_endpgm
+; SPREFETCH-LABEL: prefetch_inst_sgpr_too_large_offset:
+; SPREFETCH: ; %bb.0: ; %entry
+; SPREFETCH-NEXT: s_add_nc_u64 s[0:1], s[0:1], 0x800000
+; SPREFETCH-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
+; SPREFETCH-NEXT: s_endpgm
entry:
%gep = getelementptr i8, ptr addrspace(4) %ptr, i32 8388608
tail call void @llvm.prefetch.p4(ptr addrspace(4) %gep, i32 0, i32 0, i32 0)
@@ -282,3 +254,6 @@ declare void @llvm.prefetch.p3(ptr addrspace(3) nocapture readonly, i32, i32, i3
declare void @llvm.prefetch.p4(ptr addrspace(4) nocapture readonly, i32, i32, i32)
declare void @llvm.prefetch.p5(ptr addrspace(5) nocapture readonly, i32, i32, i32)
declare void @llvm.prefetch.p6(ptr addrspace(6) nocapture readonly, i32, i32, i32)
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; SPREFETCH-GISEL: {{.*}}
+; SPREFETCH-SDAG: {{.*}}
diff --git a/llvm/test/CodeGen/AMDGPU/offset-split-flat.ll b/llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
index 61ac1fe92c278..203696d2c0e73 100644
--- a/llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
+++ b/llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
@@ -4223,9 +4223,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_11bit_split0(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_11bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x7ff
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x7ff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4342,9 +4343,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_11bit_split1(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_11bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x800
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4461,9 +4463,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_12bit_split0(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_12bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0xfff
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xfff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4581,9 +4584,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_12bit_split1(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_12bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1000
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4701,9 +4705,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_13bit_split0(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_13bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1fff
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1fff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4821,9 +4826,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_13bit_split1(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_13bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x2000
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x2000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4946,9 +4952,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_11bit_neg_high_split0(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_11bit_neg_high_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x7ff
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x7ff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5071,9 +5078,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_11bit_neg_high_split1(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_11bit_neg_high_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x800
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5196,9 +5204,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_12bit_neg_high_split0(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_12bit_neg_high_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0xfff
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xfff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5321,9 +5330,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_12bit_neg_high_split1(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_12bit_neg_high_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1000
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5446,9 +5456,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_13bit_neg_high_split0(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_13bit_neg_high_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1fff
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1fff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5571,9 +5582,10 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_13bit_neg_high_split1(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_13bit_neg_high_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x2000
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x2000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
diff --git a/llvm/test/CodeGen/AMDGPU/offset-split-global.ll b/llvm/test/CodeGen/AMDGPU/offset-split-global.ll
index de5f4f931070e..acb3b1911ff67 100644
--- a/llvm/test/CodeGen/AMDGPU/offset-split-global.ll
+++ b/llvm/test/CodeGen/AMDGPU/offset-split-global.ll
@@ -4105,9 +4105,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_11bit_split0(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_11bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x7ff
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x7ff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4222,9 +4223,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_11bit_split1(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_11bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x800
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4339,9 +4341,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_12bit_split0(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_12bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0xfff
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xfff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4456,9 +4459,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_12bit_split1(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_12bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1000
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4573,9 +4577,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_13bit_split0(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_13bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1fff
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1fff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4690,9 +4695,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_13bit_split1(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_13bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x2000
+; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x2000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4806,9 +4812,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_11bit_neg_high_split0(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x7ff
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x7ff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -4911,9 +4918,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_11bit_neg_high_split1(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x800
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -5016,9 +5024,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_12bit_neg_high_split0(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0xfff
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xfff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -5121,9 +5130,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_12bit_neg_high_split1(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1000
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -5226,9 +5236,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_13bit_neg_high_split0(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1fff
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1fff
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -5331,9 +5342,10 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_13bit_neg_high_split1(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
+; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x2000
+; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x2000
-; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
+; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
diff --git a/llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll b/llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll
new file mode 100644
index 0000000000000..618456a9978f5
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll
@@ -0,0 +1,1013 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -amdgpu-use-sdag-ptradd=1 < %s | FileCheck %s -check-prefixes=GFX8,GFX8_PTRADD
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=fiji -amdgpu-use-sdag-ptradd=0 < %s | FileCheck %s -check-prefixes=GFX8,GFX8_LEGACY
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -amdgpu-use-sdag-ptradd=1 < %s | FileCheck %s -check-prefixes=GFX942,GFX942_PTRADD
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -amdgpu-use-sdag-ptradd=0 < %s | FileCheck %s -check-prefixes=GFX942,GFX942_LEGACY
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 -amdgpu-use-sdag-ptradd=1 < %s | FileCheck %s -check-prefixes=GFX10,GFX10_PTRADD
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1030 -amdgpu-use-sdag-ptradd=0 < %s | FileCheck %s -check-prefixes=GFX10,GFX10_LEGACY
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -amdgpu-use-sdag-ptradd=1 < %s | FileCheck %s -check-prefixes=GFX11,GFX11_PTRADD
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -amdgpu-use-sdag-ptradd=0 < %s | FileCheck %s -check-prefixes=GFX11,GFX11_LEGACY
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -amdgpu-use-sdag-ptradd=1 < %s | FileCheck %s -check-prefixes=GFX12,GFX12_PTRADD
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -amdgpu-use-sdag-ptradd=0 < %s | FileCheck %s -check-prefixes=GFX12,GFX12_LEGACY
+
+; Tests for the ISD::PTRADD SelectionDAG opcode.
+
+; fiji is included because it does not have FeatureAddNoCarryInsts and
+; therefore exercises the carry-producing addition patterns.
+
+; Check that basic pointer arithmetic can be lowered.
+define ptr @gep_as0(ptr %p, i64 %offset) {
+; GFX8-LABEL: gep_as0:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: v_lshlrev_b64 v[2:3], 2, v[2:3]
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, v0, v2
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, v1, v3, vcc
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 5, v0
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942_PTRADD-LABEL: gep_as0:
+; GFX942_PTRADD: ; %bb.0: ; %entry
+; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_PTRADD-NEXT: v_lshlrev_b64 v[2:3], 2, v[2:3]
+; GFX942_PTRADD-NEXT: v_lshl_add_u64 v[0:1], v[0:1], 0, v[2:3]
+; GFX942_PTRADD-NEXT: v_lshl_add_u64 v[0:1], v[0:1], 0, 5
+; GFX942_PTRADD-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942_LEGACY-LABEL: gep_as0:
+; GFX942_LEGACY: ; %bb.0: ; %entry
+; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_LEGACY-NEXT: v_lshl_add_u64 v[0:1], v[2:3], 2, v[0:1]
+; GFX942_LEGACY-NEXT: v_lshl_add_u64 v[0:1], v[0:1], 0, 5
+; GFX942_LEGACY-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: gep_as0:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT: v_lshlrev_b64 v[2:3], 2, v[2:3]
+; GFX10-NEXT: v_add_co_u32 v0, vcc_lo, v0, v2
+; GFX10-NEXT: v_add_co_ci_u32_e64 v1, null, v1, v3, vcc_lo
+; GFX10-NEXT: v_add_co_u32 v0, vcc_lo, v0, 5
+; GFX10-NEXT: v_add_co_ci_u32_e64 v1, null, 0, v1, vcc_lo
+; GFX10-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: gep_as0:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_lshlrev_b64 v[2:3], 2, v[2:3]
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_add_co_u32 v0, vcc_lo, v0, v2
+; GFX11-NEXT: v_add_co_ci_u32_e64 v1, null, v1, v3, vcc_lo
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_add_co_u32 v0, vcc_lo, v0, 5
+; GFX11-NEXT: v_add_co_ci_u32_e64 v1, null, 0, v1, vcc_lo
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: gep_as0:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: s_wait_expcnt 0x0
+; GFX12-NEXT: s_wait_samplecnt 0x0
+; GFX12-NEXT: s_wait_bvhcnt 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_lshlrev_b64_e32 v[2:3], 2, v[2:3]
+; GFX12-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX12-NEXT: v_add_co_u32 v0, vcc_lo, v0, v2
+; GFX12-NEXT: s_wait_alu 0xfffd
+; GFX12-NEXT: v_add_co_ci_u32_e64 v1, null, v1, v3, vcc_lo
+; GFX12-NEXT: v_add_co_u32 v0, vcc_lo, v0, 5
+; GFX12-NEXT: s_wait_alu 0xfffd
+; GFX12-NEXT: v_add_co_ci_u32_e64 v1, null, 0, v1, vcc_lo
+; GFX12-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep1 = getelementptr inbounds i32, ptr %p, i64 %offset
+ %gep2 = getelementptr inbounds i8, ptr %gep1, i64 5
+ ret ptr %gep2
+}
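+;
+; With -amdgpu-use-sdag-ptradd=1, the address computation above is expected to
+; go through ISD::PTRADD nodes of roughly this shape (a sketch for reference,
+; not autogenerated output):
+;   t0 = shl i64 %offset, 2      ; scale the i32 index by 4 bytes
+;   t1 = ptradd ptr %p, t0
+;   t2 = ptradd ptr t1, 5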
+
+define ptr addrspace(3) @gep_as3(ptr addrspace(3) %p, i32 %offset) {
+; GFX8-LABEL: gep_as3:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, v0, v1
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942_PTRADD-LABEL: gep_as3:
+; GFX942_PTRADD: ; %bb.0: ; %entry
+; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_PTRADD-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX942_PTRADD-NEXT: v_add_u32_e32 v0, v0, v1
+; GFX942_PTRADD-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942_LEGACY-LABEL: gep_as3:
+; GFX942_LEGACY: ; %bb.0: ; %entry
+; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_LEGACY-NEXT: v_lshl_add_u32 v0, v1, 2, v0
+; GFX942_LEGACY-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10_PTRADD-LABEL: gep_as3:
+; GFX10_PTRADD: ; %bb.0: ; %entry
+; GFX10_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10_PTRADD-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX10_PTRADD-NEXT: v_add_nc_u32_e32 v0, v0, v1
+; GFX10_PTRADD-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10_LEGACY-LABEL: gep_as3:
+; GFX10_LEGACY: ; %bb.0: ; %entry
+; GFX10_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10_LEGACY-NEXT: v_lshl_add_u32 v0, v1, 2, v0
+; GFX10_LEGACY-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11_PTRADD-LABEL: gep_as3:
+; GFX11_PTRADD: ; %bb.0: ; %entry
+; GFX11_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11_PTRADD-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX11_PTRADD-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11_PTRADD-NEXT: v_add_nc_u32_e32 v0, v0, v1
+; GFX11_PTRADD-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11_LEGACY-LABEL: gep_as3:
+; GFX11_LEGACY: ; %bb.0: ; %entry
+; GFX11_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11_LEGACY-NEXT: v_lshl_add_u32 v0, v1, 2, v0
+; GFX11_LEGACY-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12_PTRADD-LABEL: gep_as3:
+; GFX12_PTRADD: ; %bb.0: ; %entry
+; GFX12_PTRADD-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12_PTRADD-NEXT: s_wait_expcnt 0x0
+; GFX12_PTRADD-NEXT: s_wait_samplecnt 0x0
+; GFX12_PTRADD-NEXT: s_wait_bvhcnt 0x0
+; GFX12_PTRADD-NEXT: s_wait_kmcnt 0x0
+; GFX12_PTRADD-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX12_PTRADD-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX12_PTRADD-NEXT: v_add_nc_u32_e32 v0, v0, v1
+; GFX12_PTRADD-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12_LEGACY-LABEL: gep_as3:
+; GFX12_LEGACY: ; %bb.0: ; %entry
+; GFX12_LEGACY-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12_LEGACY-NEXT: s_wait_expcnt 0x0
+; GFX12_LEGACY-NEXT: s_wait_samplecnt 0x0
+; GFX12_LEGACY-NEXT: s_wait_bvhcnt 0x0
+; GFX12_LEGACY-NEXT: s_wait_kmcnt 0x0
+; GFX12_LEGACY-NEXT: v_lshl_add_u32 v0, v1, 2, v0
+; GFX12_LEGACY-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep = getelementptr inbounds i32, ptr addrspace(3) %p, i32 %offset
+ ret ptr addrspace(3) %gep
+}
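+;
+; Note the difference visible in the checks above: with PTRADD enabled, the
+; shift and the add are currently selected separately (v_lshlrev_b32 followed
+; by a plain add), while the legacy path folds them into a single
+; v_lshl_add_u32.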
+
+define amdgpu_kernel void @gep_as0_uniform(ptr %p, i64 %offset, ptr %ret) {
+; GFX8-LABEL: gep_as0_uniform:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
+; GFX8-NEXT: s_load_dwordx2 s[4:5], s[8:9], 0x10
+; GFX8-NEXT: s_add_i32 s12, s12, s17
+; GFX8-NEXT: s_lshr_b32 flat_scratch_hi, s12, 8
+; GFX8-NEXT: s_mov_b32 flat_scratch_lo, s13
+; GFX8-NEXT: s_waitcnt lgkmcnt(0)
+; GFX8-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX8-NEXT: s_add_u32 s0, s0, s2
+; GFX8-NEXT: s_addc_u32 s1, s1, s3
+; GFX8-NEXT: v_mov_b32_e32 v4, s0
+; GFX8-NEXT: s_add_u32 s0, s4, 4
+; GFX8-NEXT: v_mov_b32_e32 v5, s1
+; GFX8-NEXT: s_addc_u32 s1, s5, 0
+; GFX8-NEXT: v_mov_b32_e32 v3, s1
+; GFX8-NEXT: v_mov_b32_e32 v0, s4
+; GFX8-NEXT: v_mov_b32_e32 v2, s0
+; GFX8-NEXT: v_mov_b32_e32 v1, s5
+; GFX8-NEXT: flat_store_dword v[2:3], v5
+; GFX8-NEXT: flat_store_dword v[0:1], v4
+; GFX8-NEXT: s_endpgm
+;
+; GFX942-LABEL: gep_as0_uniform:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_load_dwordx4 s[0:3], s[4:5], 0x0
+; GFX942-NEXT: s_load_dwordx2 s[6:7], s[4:5], 0x10
+; GFX942-NEXT: s_waitcnt lgkmcnt(0)
+; GFX942-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX942-NEXT: s_add_u32 s0, s0, s2
+; GFX942-NEXT: s_addc_u32 s1, s1, s3
+; GFX942-NEXT: v_mov_b64_e32 v[0:1], s[6:7]
+; GFX942-NEXT: v_mov_b64_e32 v[2:3], s[0:1]
+; GFX942-NEXT: flat_store_dwordx2 v[0:1], v[2:3]
+; GFX942-NEXT: s_endpgm
+;
+; GFX10-LABEL: gep_as0_uniform:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_add_u32 s12, s12, s17
+; GFX10-NEXT: s_addc_u32 s13, s13, 0
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s12
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s13
+; GFX10-NEXT: s_clause 0x1
+; GFX10-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
+; GFX10-NEXT: s_load_dwordx2 s[4:5], s[8:9], 0x10
+; GFX10-NEXT: s_waitcnt lgkmcnt(0)
+; GFX10-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX10-NEXT: v_mov_b32_e32 v0, s4
+; GFX10-NEXT: s_add_u32 s0, s0, s2
+; GFX10-NEXT: s_addc_u32 s1, s1, s3
+; GFX10-NEXT: v_mov_b32_e32 v1, s5
+; GFX10-NEXT: v_mov_b32_e32 v3, s1
+; GFX10-NEXT: v_mov_b32_e32 v2, s0
+; GFX10-NEXT: flat_store_dwordx2 v[0:1], v[2:3]
+; GFX10-NEXT: s_endpgm
+;
+; GFX11-LABEL: gep_as0_uniform:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_clause 0x1
+; GFX11-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
+; GFX11-NEXT: s_load_b64 s[4:5], s[4:5], 0x10
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX11-NEXT: v_dual_mov_b32 v0, s4 :: v_dual_mov_b32 v1, s5
+; GFX11-NEXT: s_add_u32 s0, s0, s2
+; GFX11-NEXT: s_addc_u32 s1, s1, s3
+; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX11-NEXT: v_dual_mov_b32 v3, s1 :: v_dual_mov_b32 v2, s0
+; GFX11-NEXT: flat_store_b64 v[0:1], v[2:3]
+; GFX11-NEXT: s_endpgm
+;
+; GFX12-LABEL: gep_as0_uniform:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_clause 0x1
+; GFX12-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
+; GFX12-NEXT: s_load_b64 s[4:5], s[4:5], 0x10
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX12-NEXT: v_dual_mov_b32 v0, s4 :: v_dual_mov_b32 v1, s5
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: v_dual_mov_b32 v3, s1 :: v_dual_mov_b32 v2, s0
+; GFX12-NEXT: flat_store_b64 v[0:1], v[2:3]
+; GFX12-NEXT: s_endpgm
+entry:
+ %gep = getelementptr inbounds i32, ptr %p, i64 %offset
+ store ptr %gep, ptr %ret
+ ret void
+}
+
+define amdgpu_kernel void @gep_as3_uniform(ptr addrspace(3) %p, i32 %offset, ptr %ret) {
+; GFX8-LABEL: gep_as3_uniform:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
+; GFX8-NEXT: s_add_i32 s12, s12, s17
+; GFX8-NEXT: s_mov_b32 flat_scratch_lo, s13
+; GFX8-NEXT: s_lshr_b32 flat_scratch_hi, s12, 8
+; GFX8-NEXT: s_waitcnt lgkmcnt(0)
+; GFX8-NEXT: s_lshl_b32 s1, s1, 2
+; GFX8-NEXT: s_add_i32 s0, s0, s1
+; GFX8-NEXT: v_mov_b32_e32 v0, s2
+; GFX8-NEXT: v_mov_b32_e32 v1, s3
+; GFX8-NEXT: v_mov_b32_e32 v2, s0
+; GFX8-NEXT: flat_store_dword v[0:1], v2
+; GFX8-NEXT: s_endpgm
+;
+; GFX942-LABEL: gep_as3_uniform:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_load_dwordx4 s[0:3], s[4:5], 0x0
+; GFX942-NEXT: s_waitcnt lgkmcnt(0)
+; GFX942-NEXT: s_lshl_b32 s1, s1, 2
+; GFX942-NEXT: s_add_i32 s0, s0, s1
+; GFX942-NEXT: v_mov_b64_e32 v[0:1], s[2:3]
+; GFX942-NEXT: v_mov_b32_e32 v2, s0
+; GFX942-NEXT: flat_store_dword v[0:1], v2
+; GFX942-NEXT: s_endpgm
+;
+; GFX10-LABEL: gep_as3_uniform:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_add_u32 s12, s12, s17
+; GFX10-NEXT: s_addc_u32 s13, s13, 0
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s12
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s13
+; GFX10-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
+; GFX10-NEXT: s_waitcnt lgkmcnt(0)
+; GFX10-NEXT: s_lshl_b32 s1, s1, 2
+; GFX10-NEXT: v_mov_b32_e32 v0, s2
+; GFX10-NEXT: s_add_i32 s0, s0, s1
+; GFX10-NEXT: v_mov_b32_e32 v1, s3
+; GFX10-NEXT: v_mov_b32_e32 v2, s0
+; GFX10-NEXT: flat_store_dword v[0:1], v2
+; GFX10-NEXT: s_endpgm
+;
+; GFX11-LABEL: gep_as3_uniform:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: s_lshl_b32 s1, s1, 2
+; GFX11-NEXT: v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3
+; GFX11-NEXT: s_add_i32 s0, s0, s1
+; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX11-NEXT: v_mov_b32_e32 v2, s0
+; GFX11-NEXT: flat_store_b32 v[0:1], v2
+; GFX11-NEXT: s_endpgm
+;
+; GFX12-LABEL: gep_as3_uniform:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: s_lshl_b32 s1, s1, 2
+; GFX12-NEXT: v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3
+; GFX12-NEXT: s_add_co_i32 s0, s0, s1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-NEXT: v_mov_b32_e32 v2, s0
+; GFX12-NEXT: flat_store_b32 v[0:1], v2
+; GFX12-NEXT: s_endpgm
+entry:
+ %gep = getelementptr inbounds i32, ptr addrspace(3) %p, i32 %offset
+ store ptr addrspace(3) %gep, ptr %ret
+ ret void
+}
+
+; Check that pointer arithmetic with multiple indexing steps can be lowered.
+define ptr @multi_gep_as0(ptr %p, i64 %offset) {
+; GFX8-LABEL: multi_gep_as0:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: v_lshlrev_b64 v[2:3], 2, v[2:3]
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, v0, v2
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, v1, v3, vcc
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 5, v0
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942_PTRADD-LABEL: multi_gep_as0:
+; GFX942_PTRADD: ; %bb.0: ; %entry
+; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_PTRADD-NEXT: v_lshlrev_b64 v[2:3], 2, v[2:3]
+; GFX942_PTRADD-NEXT: v_lshl_add_u64 v[0:1], v[0:1], 0, v[2:3]
+; GFX942_PTRADD-NEXT: v_lshl_add_u64 v[0:1], v[0:1], 0, 5
+; GFX942_PTRADD-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942_LEGACY-LABEL: multi_gep_as0:
+; GFX942_LEGACY: ; %bb.0: ; %entry
+; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_LEGACY-NEXT: v_lshl_add_u64 v[0:1], v[2:3], 2, v[0:1]
+; GFX942_LEGACY-NEXT: v_lshl_add_u64 v[0:1], v[0:1], 0, 5
+; GFX942_LEGACY-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: multi_gep_as0:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT: v_lshlrev_b64 v[2:3], 2, v[2:3]
+; GFX10-NEXT: v_add_co_u32 v0, vcc_lo, v0, v2
+; GFX10-NEXT: v_add_co_ci_u32_e64 v1, null, v1, v3, vcc_lo
+; GFX10-NEXT: v_add_co_u32 v0, vcc_lo, v0, 5
+; GFX10-NEXT: v_add_co_ci_u32_e64 v1, null, 0, v1, vcc_lo
+; GFX10-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: multi_gep_as0:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_lshlrev_b64 v[2:3], 2, v[2:3]
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_add_co_u32 v0, vcc_lo, v0, v2
+; GFX11-NEXT: v_add_co_ci_u32_e64 v1, null, v1, v3, vcc_lo
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_2) | instskip(NEXT) | instid1(VALU_DEP_1)
+; GFX11-NEXT: v_add_co_u32 v0, vcc_lo, v0, 5
+; GFX11-NEXT: v_add_co_ci_u32_e64 v1, null, 0, v1, vcc_lo
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: multi_gep_as0:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: s_wait_expcnt 0x0
+; GFX12-NEXT: s_wait_samplecnt 0x0
+; GFX12-NEXT: s_wait_bvhcnt 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_lshlrev_b64_e32 v[2:3], 2, v[2:3]
+; GFX12-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX12-NEXT: v_add_co_u32 v0, vcc_lo, v0, v2
+; GFX12-NEXT: s_wait_alu 0xfffd
+; GFX12-NEXT: v_add_co_ci_u32_e64 v1, null, v1, v3, vcc_lo
+; GFX12-NEXT: v_add_co_u32 v0, vcc_lo, v0, 5
+; GFX12-NEXT: s_wait_alu 0xfffd
+; GFX12-NEXT: v_add_co_ci_u32_e64 v1, null, 0, v1, vcc_lo
+; GFX12-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep1 = getelementptr inbounds i32, ptr %p, i64 %offset
+ %gep2 = getelementptr inbounds i8, ptr %gep1, i64 5
+ ret ptr %gep2
+}
+
+define ptr addrspace(3) @multi_gep_as3(ptr addrspace(3) %p, i32 %offset) {
+; GFX8-LABEL: multi_gep_as3:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, v0, v1
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 5, v0
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942-LABEL: multi_gep_as3:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX942-NEXT: v_add3_u32 v0, v0, v1, 5
+; GFX942-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: multi_gep_as3:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX10-NEXT: v_add3_u32 v0, v0, v1, 5
+; GFX10-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: multi_gep_as3:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX11-NEXT: v_add3_u32 v0, v0, v1, 5
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: multi_gep_as3:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: s_wait_expcnt 0x0
+; GFX12-NEXT: s_wait_samplecnt 0x0
+; GFX12-NEXT: s_wait_bvhcnt 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: v_lshlrev_b32_e32 v1, 2, v1
+; GFX12-NEXT: s_delay_alu instid0(VALU_DEP_1)
+; GFX12-NEXT: v_add3_u32 v0, v0, v1, 5
+; GFX12-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep1 = getelementptr inbounds i32, ptr addrspace(3) %p, i32 %offset
+ %gep2 = getelementptr inbounds i8, ptr addrspace(3) %gep1, i32 5
+ ret ptr addrspace(3) %gep2
+}
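+;
+; As the checks above show, targets with v_add3_u32 (everything but GFX8 here)
+; combine the shifted index and the constant 5 into a single v_add3_u32,
+; independently of whether PTRADD is enabled.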
+
+define amdgpu_kernel void @multi_gep_as0_uniform(ptr %p, i64 %offset, ptr %ret) {
+; GFX8-LABEL: multi_gep_as0_uniform:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
+; GFX8-NEXT: s_load_dwordx2 s[4:5], s[8:9], 0x10
+; GFX8-NEXT: s_add_i32 s12, s12, s17
+; GFX8-NEXT: s_lshr_b32 flat_scratch_hi, s12, 8
+; GFX8-NEXT: s_mov_b32 flat_scratch_lo, s13
+; GFX8-NEXT: s_waitcnt lgkmcnt(0)
+; GFX8-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX8-NEXT: s_add_u32 s0, s0, s2
+; GFX8-NEXT: s_addc_u32 s1, s1, s3
+; GFX8-NEXT: s_add_u32 s0, s0, 5
+; GFX8-NEXT: s_addc_u32 s1, s1, 0
+; GFX8-NEXT: v_mov_b32_e32 v4, s0
+; GFX8-NEXT: s_add_u32 s0, s4, 4
+; GFX8-NEXT: v_mov_b32_e32 v5, s1
+; GFX8-NEXT: s_addc_u32 s1, s5, 0
+; GFX8-NEXT: v_mov_b32_e32 v3, s1
+; GFX8-NEXT: v_mov_b32_e32 v0, s4
+; GFX8-NEXT: v_mov_b32_e32 v2, s0
+; GFX8-NEXT: v_mov_b32_e32 v1, s5
+; GFX8-NEXT: flat_store_dword v[2:3], v5
+; GFX8-NEXT: flat_store_dword v[0:1], v4
+; GFX8-NEXT: s_endpgm
+;
+; GFX942-LABEL: multi_gep_as0_uniform:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_load_dwordx4 s[0:3], s[4:5], 0x0
+; GFX942-NEXT: s_load_dwordx2 s[6:7], s[4:5], 0x10
+; GFX942-NEXT: s_waitcnt lgkmcnt(0)
+; GFX942-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX942-NEXT: s_add_u32 s0, s0, s2
+; GFX942-NEXT: s_addc_u32 s1, s1, s3
+; GFX942-NEXT: s_add_u32 s0, s0, 5
+; GFX942-NEXT: s_addc_u32 s1, s1, 0
+; GFX942-NEXT: v_mov_b64_e32 v[0:1], s[6:7]
+; GFX942-NEXT: v_mov_b64_e32 v[2:3], s[0:1]
+; GFX942-NEXT: flat_store_dwordx2 v[0:1], v[2:3]
+; GFX942-NEXT: s_endpgm
+;
+; GFX10-LABEL: multi_gep_as0_uniform:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_add_u32 s12, s12, s17
+; GFX10-NEXT: s_addc_u32 s13, s13, 0
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s12
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s13
+; GFX10-NEXT: s_clause 0x1
+; GFX10-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
+; GFX10-NEXT: s_load_dwordx2 s[4:5], s[8:9], 0x10
+; GFX10-NEXT: s_waitcnt lgkmcnt(0)
+; GFX10-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX10-NEXT: v_mov_b32_e32 v0, s4
+; GFX10-NEXT: s_add_u32 s0, s0, s2
+; GFX10-NEXT: s_addc_u32 s1, s1, s3
+; GFX10-NEXT: s_add_u32 s0, s0, 5
+; GFX10-NEXT: s_addc_u32 s1, s1, 0
+; GFX10-NEXT: v_mov_b32_e32 v1, s5
+; GFX10-NEXT: v_mov_b32_e32 v3, s1
+; GFX10-NEXT: v_mov_b32_e32 v2, s0
+; GFX10-NEXT: flat_store_dwordx2 v[0:1], v[2:3]
+; GFX10-NEXT: s_endpgm
+;
+; GFX11-LABEL: multi_gep_as0_uniform:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_clause 0x1
+; GFX11-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
+; GFX11-NEXT: s_load_b64 s[4:5], s[4:5], 0x10
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX11-NEXT: v_dual_mov_b32 v0, s4 :: v_dual_mov_b32 v1, s5
+; GFX11-NEXT: s_add_u32 s0, s0, s2
+; GFX11-NEXT: s_addc_u32 s1, s1, s3
+; GFX11-NEXT: s_add_u32 s0, s0, 5
+; GFX11-NEXT: s_addc_u32 s1, s1, 0
+; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX11-NEXT: v_dual_mov_b32 v3, s1 :: v_dual_mov_b32 v2, s0
+; GFX11-NEXT: flat_store_b64 v[0:1], v[2:3]
+; GFX11-NEXT: s_endpgm
+;
+; GFX12-LABEL: multi_gep_as0_uniform:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_clause 0x1
+; GFX12-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
+; GFX12-NEXT: s_load_b64 s[4:5], s[4:5], 0x10
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: s_lshl_b64 s[2:3], s[2:3], 2
+; GFX12-NEXT: v_dual_mov_b32 v0, s4 :: v_dual_mov_b32 v1, s5
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_nc_u64 s[0:1], s[0:1], 5
+; GFX12-NEXT: v_dual_mov_b32 v3, s1 :: v_dual_mov_b32 v2, s0
+; GFX12-NEXT: flat_store_b64 v[0:1], v[2:3]
+; GFX12-NEXT: s_endpgm
+entry:
+ %gep1 = getelementptr inbounds i32, ptr %p, i64 %offset
+ %gep2 = getelementptr inbounds i8, ptr %gep1, i64 5
+ store ptr %gep2, ptr %ret
+ ret void
+}
+
+define amdgpu_kernel void @multi_gep_as3_uniform(ptr addrspace(3) %p, i32 %offset, ptr %ret) {
+; GFX8-LABEL: multi_gep_as3_uniform:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
+; GFX8-NEXT: s_add_i32 s12, s12, s17
+; GFX8-NEXT: s_mov_b32 flat_scratch_lo, s13
+; GFX8-NEXT: s_lshr_b32 flat_scratch_hi, s12, 8
+; GFX8-NEXT: s_waitcnt lgkmcnt(0)
+; GFX8-NEXT: s_lshl_b32 s1, s1, 2
+; GFX8-NEXT: s_add_i32 s0, s0, s1
+; GFX8-NEXT: s_add_i32 s0, s0, 5
+; GFX8-NEXT: v_mov_b32_e32 v0, s2
+; GFX8-NEXT: v_mov_b32_e32 v1, s3
+; GFX8-NEXT: v_mov_b32_e32 v2, s0
+; GFX8-NEXT: flat_store_dword v[0:1], v2
+; GFX8-NEXT: s_endpgm
+;
+; GFX942-LABEL: multi_gep_as3_uniform:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_load_dwordx4 s[0:3], s[4:5], 0x0
+; GFX942-NEXT: s_waitcnt lgkmcnt(0)
+; GFX942-NEXT: s_lshl_b32 s1, s1, 2
+; GFX942-NEXT: s_add_i32 s0, s0, s1
+; GFX942-NEXT: s_add_i32 s0, s0, 5
+; GFX942-NEXT: v_mov_b64_e32 v[0:1], s[2:3]
+; GFX942-NEXT: v_mov_b32_e32 v2, s0
+; GFX942-NEXT: flat_store_dword v[0:1], v2
+; GFX942-NEXT: s_endpgm
+;
+; GFX10-LABEL: multi_gep_as3_uniform:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_add_u32 s12, s12, s17
+; GFX10-NEXT: s_addc_u32 s13, s13, 0
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s12
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s13
+; GFX10-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
+; GFX10-NEXT: s_waitcnt lgkmcnt(0)
+; GFX10-NEXT: s_lshl_b32 s1, s1, 2
+; GFX10-NEXT: v_mov_b32_e32 v0, s2
+; GFX10-NEXT: s_add_i32 s0, s0, s1
+; GFX10-NEXT: v_mov_b32_e32 v1, s3
+; GFX10-NEXT: s_add_i32 s0, s0, 5
+; GFX10-NEXT: v_mov_b32_e32 v2, s0
+; GFX10-NEXT: flat_store_dword v[0:1], v2
+; GFX10-NEXT: s_endpgm
+;
+; GFX11-LABEL: multi_gep_as3_uniform:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: s_lshl_b32 s1, s1, 2
+; GFX11-NEXT: v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3
+; GFX11-NEXT: s_add_i32 s0, s0, s1
+; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GFX11-NEXT: s_add_i32 s0, s0, 5
+; GFX11-NEXT: v_mov_b32_e32 v2, s0
+; GFX11-NEXT: flat_store_b32 v[0:1], v2
+; GFX11-NEXT: s_endpgm
+;
+; GFX12-LABEL: multi_gep_as3_uniform:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: s_lshl_b32 s1, s1, 2
+; GFX12-NEXT: v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3
+; GFX12-NEXT: s_add_co_i32 s0, s0, s1
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GFX12-NEXT: s_add_co_i32 s0, s0, 5
+; GFX12-NEXT: v_mov_b32_e32 v2, s0
+; GFX12-NEXT: flat_store_b32 v[0:1], v2
+; GFX12-NEXT: s_endpgm
+entry:
+ %gep1 = getelementptr inbounds i32, ptr addrspace(3) %p, i32 %offset
+ %gep2 = getelementptr inbounds i8, ptr addrspace(3) %gep1, i32 5
+ store ptr addrspace(3) %gep2, ptr %ret
+ ret void
+}
+
+; Check that SIShrinkInstructions triggers and generates s_addk_i32.
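+; The byte offset is 300 i32 elements * 4 bytes = 1200 = 0x4b0, which fits in
+; the signed 16-bit immediate of s_addk_i32.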
+define amdgpu_kernel void @use_s_addk_i32(ptr addrspace(3) %p, i32 %offset, ptr %ret) {
+; GFX8-LABEL: use_s_addk_i32:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_load_dword s2, s[8:9], 0x0
+; GFX8-NEXT: s_load_dwordx2 s[0:1], s[8:9], 0x8
+; GFX8-NEXT: s_add_i32 s12, s12, s17
+; GFX8-NEXT: s_mov_b32 flat_scratch_lo, s13
+; GFX8-NEXT: s_lshr_b32 flat_scratch_hi, s12, 8
+; GFX8-NEXT: s_waitcnt lgkmcnt(0)
+; GFX8-NEXT: s_addk_i32 s2, 0x4b0
+; GFX8-NEXT: v_mov_b32_e32 v0, s0
+; GFX8-NEXT: v_mov_b32_e32 v1, s1
+; GFX8-NEXT: v_mov_b32_e32 v2, s2
+; GFX8-NEXT: flat_store_dword v[0:1], v2
+; GFX8-NEXT: s_endpgm
+;
+; GFX942-LABEL: use_s_addk_i32:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_load_dword s2, s[4:5], 0x0
+; GFX942-NEXT: s_load_dwordx2 s[0:1], s[4:5], 0x8
+; GFX942-NEXT: s_waitcnt lgkmcnt(0)
+; GFX942-NEXT: s_addk_i32 s2, 0x4b0
+; GFX942-NEXT: v_mov_b64_e32 v[0:1], s[0:1]
+; GFX942-NEXT: v_mov_b32_e32 v2, s2
+; GFX942-NEXT: flat_store_dword v[0:1], v2
+; GFX942-NEXT: s_endpgm
+;
+; GFX10-LABEL: use_s_addk_i32:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_add_u32 s12, s12, s17
+; GFX10-NEXT: s_addc_u32 s13, s13, 0
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s12
+; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s13
+; GFX10-NEXT: s_clause 0x1
+; GFX10-NEXT: s_load_dword s2, s[8:9], 0x0
+; GFX10-NEXT: s_load_dwordx2 s[0:1], s[8:9], 0x8
+; GFX10-NEXT: s_waitcnt lgkmcnt(0)
+; GFX10-NEXT: s_addk_i32 s2, 0x4b0
+; GFX10-NEXT: v_mov_b32_e32 v0, s0
+; GFX10-NEXT: v_mov_b32_e32 v1, s1
+; GFX10-NEXT: v_mov_b32_e32 v2, s2
+; GFX10-NEXT: flat_store_dword v[0:1], v2
+; GFX10-NEXT: s_endpgm
+;
+; GFX11-LABEL: use_s_addk_i32:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_clause 0x1
+; GFX11-NEXT: s_load_b32 s2, s[4:5], 0x0
+; GFX11-NEXT: s_load_b64 s[0:1], s[4:5], 0x8
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: s_addk_i32 s2, 0x4b0
+; GFX11-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
+; GFX11-NEXT: v_mov_b32_e32 v2, s2
+; GFX11-NEXT: flat_store_b32 v[0:1], v2
+; GFX11-NEXT: s_endpgm
+;
+; GFX12-LABEL: use_s_addk_i32:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_clause 0x1
+; GFX12-NEXT: s_load_b32 s2, s[4:5], 0x0
+; GFX12-NEXT: s_load_b64 s[0:1], s[4:5], 0x8
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: s_addk_co_i32 s2, 0x4b0
+; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
+; GFX12-NEXT: v_mov_b32_e32 v2, s2
+; GFX12-NEXT: flat_store_b32 v[0:1], v2
+; GFX12-NEXT: s_endpgm
+entry:
+ %gep = getelementptr inbounds i32, ptr addrspace(3) %p, i32 300
+ store ptr addrspace(3) %gep, ptr %ret
+ ret void
+}
+
+; Check that constant offsets are folded into memory instructions.
+
+define void @fold_as0(ptr %from, ptr %to) {
+; GFX8-LABEL: fold_as0:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 8, v0
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; GFX8-NEXT: flat_load_dword v4, v[0:1]
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 8, v2
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, 0, v3, vcc
+; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX8-NEXT: flat_store_dword v[0:1], v4
+; GFX8-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942-LABEL: fold_as0:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT: flat_load_dword v0, v[0:1] offset:8
+; GFX942-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX942-NEXT: flat_store_dword v[2:3], v0 offset:8
+; GFX942-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX942-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: fold_as0:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT: flat_load_dword v0, v[0:1] offset:8
+; GFX10-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX10-NEXT: flat_store_dword v[2:3], v0 offset:8
+; GFX10-NEXT: s_waitcnt lgkmcnt(0)
+; GFX10-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: fold_as0:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: flat_load_b32 v0, v[0:1] offset:8
+; GFX11-NEXT: s_waitcnt vmcnt(0) lgkmcnt(0)
+; GFX11-NEXT: flat_store_b32 v[2:3], v0 offset:8
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: fold_as0:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: s_wait_expcnt 0x0
+; GFX12-NEXT: s_wait_samplecnt 0x0
+; GFX12-NEXT: s_wait_bvhcnt 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: flat_load_b32 v0, v[0:1] offset:8
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: flat_store_b32 v[2:3], v0 offset:8
+; GFX12-NEXT: s_wait_dscnt 0x0
+; GFX12-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep.from = getelementptr inbounds i8, ptr %from, i64 8
+ %val = load i32, ptr %gep.from, align 4
+ %gep.to = getelementptr inbounds i8, ptr %to, i64 8
+ store i32 %val, ptr %gep.to, align 4
+ ret void
+}
+
+define void @fold_as1(ptr addrspace(1) %from, ptr addrspace(1) %to) {
+; GFX8-LABEL: fold_as1:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 8, v0
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; GFX8-NEXT: flat_load_dword v4, v[0:1]
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 8, v2
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, 0, v3, vcc
+; GFX8-NEXT: s_waitcnt vmcnt(0)
+; GFX8-NEXT: flat_store_dword v[0:1], v4
+; GFX8-NEXT: s_waitcnt vmcnt(0)
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942-LABEL: fold_as1:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT: global_load_dword v0, v[0:1], off offset:8
+; GFX942-NEXT: s_waitcnt vmcnt(0)
+; GFX942-NEXT: global_store_dword v[2:3], v0, off offset:8
+; GFX942-NEXT: s_waitcnt vmcnt(0)
+; GFX942-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: fold_as1:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT: global_load_dword v0, v[0:1], off offset:8
+; GFX10-NEXT: s_waitcnt vmcnt(0)
+; GFX10-NEXT: global_store_dword v[2:3], v0, off offset:8
+; GFX10-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: fold_as1:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: global_load_b32 v0, v[0:1], off offset:8
+; GFX11-NEXT: s_waitcnt vmcnt(0)
+; GFX11-NEXT: global_store_b32 v[2:3], v0, off offset:8
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: fold_as1:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: s_wait_expcnt 0x0
+; GFX12-NEXT: s_wait_samplecnt 0x0
+; GFX12-NEXT: s_wait_bvhcnt 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: global_load_b32 v0, v[0:1], off offset:8
+; GFX12-NEXT: s_wait_loadcnt 0x0
+; GFX12-NEXT: global_store_b32 v[2:3], v0, off offset:8
+; GFX12-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep.from = getelementptr inbounds i8, ptr addrspace(1) %from, i64 8
+ %val = load i32, ptr addrspace(1) %gep.from, align 4
+ %gep.to = getelementptr inbounds i8, ptr addrspace(1) %to, i64 8
+ store i32 %val, ptr addrspace(1) %gep.to, align 4
+ ret void
+}
+
+define void @fold_as3(ptr addrspace(3) %from, ptr addrspace(3) %to) {
+; GFX8-LABEL: fold_as3:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: s_mov_b32 m0, -1
+; GFX8-NEXT: ds_read_b32 v0, v0 offset:8
+; GFX8-NEXT: s_waitcnt lgkmcnt(0)
+; GFX8-NEXT: ds_write_b32 v1, v0 offset:8
+; GFX8-NEXT: s_waitcnt lgkmcnt(0)
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942-LABEL: fold_as3:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT: ds_read_b32 v0, v0 offset:8
+; GFX942-NEXT: s_waitcnt lgkmcnt(0)
+; GFX942-NEXT: ds_write_b32 v1, v0 offset:8
+; GFX942-NEXT: s_waitcnt lgkmcnt(0)
+; GFX942-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: fold_as3:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT: ds_read_b32 v0, v0 offset:8
+; GFX10-NEXT: s_waitcnt lgkmcnt(0)
+; GFX10-NEXT: ds_write_b32 v1, v0 offset:8
+; GFX10-NEXT: s_waitcnt lgkmcnt(0)
+; GFX10-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: fold_as3:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: ds_load_b32 v0, v0 offset:8
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: ds_store_b32 v1, v0 offset:8
+; GFX11-NEXT: s_waitcnt lgkmcnt(0)
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: fold_as3:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: s_wait_expcnt 0x0
+; GFX12-NEXT: s_wait_samplecnt 0x0
+; GFX12-NEXT: s_wait_bvhcnt 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: ds_load_b32 v0, v0 offset:8
+; GFX12-NEXT: s_wait_dscnt 0x0
+; GFX12-NEXT: ds_store_b32 v1, v0 offset:8
+; GFX12-NEXT: s_wait_dscnt 0x0
+; GFX12-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep.from = getelementptr inbounds i8, ptr addrspace(3) %from, i32 8
+ %val = load i32, ptr addrspace(3) %gep.from, align 4
+ %gep.to = getelementptr inbounds i8, ptr addrspace(3) %to, i32 8
+ store i32 %val, ptr addrspace(3) %gep.to, align 4
+ ret void
+}
+
+define void @fold_as4(ptr addrspace(4) %from, ptr addrspace(1) %to) {
+; GFX8-LABEL: fold_as4:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 8, v0
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, 0, v1, vcc
+; GFX8-NEXT: flat_load_dword v4, v[0:1]
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 8, v2
+; GFX8-NEXT: v_addc_u32_e32 v1, vcc, 0, v3, vcc
+; GFX8-NEXT: s_waitcnt vmcnt(0)
+; GFX8-NEXT: flat_store_dword v[0:1], v4
+; GFX8-NEXT: s_waitcnt vmcnt(0)
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942-LABEL: fold_as4:
+; GFX942: ; %bb.0: ; %entry
+; GFX942-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942-NEXT: global_load_dword v0, v[0:1], off offset:8
+; GFX942-NEXT: s_waitcnt vmcnt(0)
+; GFX942-NEXT: global_store_dword v[2:3], v0, off offset:8
+; GFX942-NEXT: s_waitcnt vmcnt(0)
+; GFX942-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: fold_as4:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT: global_load_dword v0, v[0:1], off offset:8
+; GFX10-NEXT: s_waitcnt vmcnt(0)
+; GFX10-NEXT: global_store_dword v[2:3], v0, off offset:8
+; GFX10-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11-LABEL: fold_as4:
+; GFX11: ; %bb.0: ; %entry
+; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-NEXT: global_load_b32 v0, v[0:1], off offset:8
+; GFX11-NEXT: s_waitcnt vmcnt(0)
+; GFX11-NEXT: global_store_b32 v[2:3], v0, off offset:8
+; GFX11-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: fold_as4:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: s_wait_expcnt 0x0
+; GFX12-NEXT: s_wait_samplecnt 0x0
+; GFX12-NEXT: s_wait_bvhcnt 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: global_load_b32 v0, v[0:1], off offset:8
+; GFX12-NEXT: s_wait_loadcnt 0x0
+; GFX12-NEXT: global_store_b32 v[2:3], v0, off offset:8
+; GFX12-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep.from = getelementptr inbounds i8, ptr addrspace(4) %from, i64 8
+ %val = load i32, ptr addrspace(4) %gep.from, align 4
+ %gep.to = getelementptr inbounds i8, ptr addrspace(1) %to, i64 8
+ store i32 %val, ptr addrspace(1) %gep.to, align 4
+ ret void
+}
+
+define void @fold_as5(ptr addrspace(5) %from, ptr addrspace(5) %to) {
+; GFX8-LABEL: fold_as5:
+; GFX8: ; %bb.0: ; %entry
+; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX8-NEXT: v_add_u32_e32 v0, vcc, 8, v0
+; GFX8-NEXT: buffer_load_dword v0, v0, s[0:3], 0 offen
+; GFX8-NEXT: v_add_u32_e32 v1, vcc, 8, v1
+; GFX8-NEXT: s_waitcnt vmcnt(0)
+; GFX8-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
+; GFX8-NEXT: s_waitcnt vmcnt(0)
+; GFX8-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942_PTRADD-LABEL: fold_as5:
+; GFX942_PTRADD: ; %bb.0: ; %entry
+; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_PTRADD-NEXT: v_add_u32_e32 v0, 8, v0
+; GFX942_PTRADD-NEXT: scratch_load_dword v0, v0, off
+; GFX942_PTRADD-NEXT: v_add_u32_e32 v1, 8, v1
+; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0)
+; GFX942_PTRADD-NEXT: scratch_store_dword v1, v0, off
+; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0)
+; GFX942_PTRADD-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX942_LEGACY-LABEL: fold_as5:
+; GFX942_LEGACY: ; %bb.0: ; %entry
+; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX942_LEGACY-NEXT: scratch_load_dword v0, v0, off offset:8
+; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0)
+; GFX942_LEGACY-NEXT: scratch_store_dword v1, v0, off offset:8
+; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0)
+; GFX942_LEGACY-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX10-LABEL: fold_as5:
+; GFX10: ; %bb.0: ; %entry
+; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX10-NEXT: buffer_load_dword v0, v0, s[0:3], 0 offen offset:8
+; GFX10-NEXT: s_waitcnt vmcnt(0)
+; GFX10-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen offset:8
+; GFX10-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11_PTRADD-LABEL: fold_as5:
+; GFX11_PTRADD: ; %bb.0: ; %entry
+; GFX11_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11_PTRADD-NEXT: v_add_nc_u32_e32 v0, 8, v0
+; GFX11_PTRADD-NEXT: v_add_nc_u32_e32 v1, 8, v1
+; GFX11_PTRADD-NEXT: scratch_load_b32 v0, v0, off
+; GFX11_PTRADD-NEXT: s_waitcnt vmcnt(0)
+; GFX11_PTRADD-NEXT: scratch_store_b32 v1, v0, off
+; GFX11_PTRADD-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX11_LEGACY-LABEL: fold_as5:
+; GFX11_LEGACY: ; %bb.0: ; %entry
+; GFX11_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11_LEGACY-NEXT: scratch_load_b32 v0, v0, off offset:8
+; GFX11_LEGACY-NEXT: s_waitcnt vmcnt(0)
+; GFX11_LEGACY-NEXT: scratch_store_b32 v1, v0, off offset:8
+; GFX11_LEGACY-NEXT: s_setpc_b64 s[30:31]
+;
+; GFX12-LABEL: fold_as5:
+; GFX12: ; %bb.0: ; %entry
+; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
+; GFX12-NEXT: s_wait_expcnt 0x0
+; GFX12-NEXT: s_wait_samplecnt 0x0
+; GFX12-NEXT: s_wait_bvhcnt 0x0
+; GFX12-NEXT: s_wait_kmcnt 0x0
+; GFX12-NEXT: scratch_load_b32 v0, v0, off offset:8
+; GFX12-NEXT: s_wait_loadcnt 0x0
+; GFX12-NEXT: scratch_store_b32 v1, v0, off offset:8
+; GFX12-NEXT: s_setpc_b64 s[30:31]
+entry:
+ %gep.from = getelementptr inbounds i8, ptr addrspace(5) %from, i32 8
+ %val = load i32, ptr addrspace(5) %gep.from, align 4
+ %gep.to = getelementptr inbounds i8, ptr addrspace(5) %to, i32 8
+ store i32 %val, ptr addrspace(5) %gep.to, align 4
+ ret void
+}
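+;
+; As the GFX942_PTRADD and GFX11_PTRADD checks above show, the offset fold for
+; private (scratch) accesses does not fire yet when PTRADD is enabled; the
+; 8-byte offset is added explicitly instead of being folded into the scratch
+; instruction.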
+
+;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; GFX8_LEGACY: {{.*}}
+; GFX8_PTRADD: {{.*}}
diff --git a/llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll b/llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll
index bfadfd860edf6..3fc4cf1ab3e51 100644
--- a/llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll
+++ b/llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll
@@ -57,10 +57,9 @@ define amdgpu_kernel void @entry_func(i32 %x) {
; GISEL-NEXT: ;;#ASMSTART
; GISEL-NEXT: s_nop
; GISEL-NEXT: ;;#ASMEND
-; GISEL-NEXT: s_add_co_u32 s8, s4, 4
; GISEL-NEXT: s_mov_b32 s0, non_entry_func at abs32@lo
; GISEL-NEXT: s_mov_b32 s1, non_entry_func at abs32@hi
-; GISEL-NEXT: s_add_co_ci_u32 s9, s5, 0
+; GISEL-NEXT: s_add_nc_u64 s[8:9], s[4:5], 4
; GISEL-NEXT: s_mov_b64 s[4:5], s[12:13]
; GISEL-NEXT: s_mov_b32 s32, 0
; GISEL-NEXT: s_wait_kmcnt 0x0
From 1495ed979cb00cc1c40e86e7b0de8491a2adbca2 Mon Sep 17 00:00:00 2001
From: Fabian Ritter <fabian.ritter at amd.com>
Date: Wed, 28 May 2025 05:42:47 -0400
Subject: [PATCH 2/4] Set GISelShouldIgnore for the 64-bit PTRADD patterns.
---
llvm/lib/Target/AMDGPU/SIInstructions.td | 10 +-
...inst-select-amdgpu-atomic-cmpxchg-flat.mir | 80 +-
...st-select-amdgpu-atomic-cmpxchg-global.mir | 124 ++-
.../inst-select-atomicrmw-add-flat.mir | 240 ++++-
.../inst-select-atomicrmw-add-global.mir | 220 ++++-
.../inst-select-load-atomic-flat.mir | 60 +-
.../inst-select-load-atomic-global.mir | 96 +-
.../GlobalISel/inst-select-load-constant.mir | 100 +-
.../GlobalISel/inst-select-load-flat.mir | 760 ++++++++++++---
...st-select-load-global-old-legalization.mir | 874 ++++++++++++++----
.../inst-select-load-global-saddr.mir | 312 +++++--
.../GlobalISel/inst-select-load-global.mir | 874 ++++++++++++++----
.../GlobalISel/inst-select-load-smrd.mir | 24 +-
.../AMDGPU/GlobalISel/inst-select-ptr-add.mir | 250 ++++-
.../GlobalISel/inst-select-store-flat.mir | 250 ++++-
.../GlobalISel/inst-select-store-global.mir | 256 +++--
.../CodeGen/AMDGPU/GlobalISel/mubuf-global.ll | 53 +-
.../AMDGPU/gfx12_scalar_subword_loads.ll | 134 ++-
llvm/test/CodeGen/AMDGPU/global-saddr-load.ll | 372 +++++---
llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll | 83 +-
llvm/test/CodeGen/AMDGPU/offset-split-flat.ll | 60 +-
.../CodeGen/AMDGPU/offset-split-global.ll | 60 +-
llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll | 3 +-
23 files changed, 4014 insertions(+), 1281 deletions(-)
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index c93584bc8b5ce..26b194dffa07d 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -1380,9 +1380,15 @@ def : GCNPat <
// PTRADD Patterns
//===----------------------------------------------------------------------===//
+// GlobalISel shouldn't select the 64-bit addition pseudos through these
+// imported patterns.
+let GISelShouldIgnore = 1 in {
def : GCNPat<
(DivergentBinFrag<ptradd> i64:$src0, i64:$src1),
(V_ADD_U64_PSEUDO $src0, $src1)>;
+def : GCNPat<
+ (UniformBinFrag<ptradd> i64:$src0, i64:$src1),
+ (S_ADD_U64_PSEUDO $src0, $src1)>;
+}
def : GCNPat<
(DivergentBinFrag<ptradd> i32:$src0, i32:$src1),
@@ -1396,10 +1402,6 @@ def : GCNPat<
let SubtargetPredicate = NotHasAddNoCarryInsts;
}
-def : GCNPat<
- (UniformBinFrag<ptradd> i64:$src0, i64:$src1),
- (S_ADD_U64_PSEUDO $src0, $src1)>;
-
// Whether we select S_ADD_I32 or S_ADD_U32 does not make much of a
// difference. Most notably, S_ADD_I32 instructions can be transformed
// to S_ADDK_I32, so we select that.
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir
index e9db317ec7817..85d852fc779b2 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-flat.mir
@@ -91,8 +91,14 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gep4
@@ -113,8 +119,14 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX11-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gep4
@@ -231,8 +243,14 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr4_vgpr5
; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[COPY1]], %subreg.sub0_sub1, [[COPY2]], %subreg.sub2_sub3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_CMPSWAP_X2_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s64_flat_gep4
@@ -253,8 +271,14 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr4_vgpr5
; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[COPY1]], %subreg.sub0_sub1, [[COPY2]], %subreg.sub2_sub3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
; GFX10-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_CMPSWAP_X2_RTN]]
;
; GFX11-LABEL: name: amdgpu_atomic_cmpxchg_s64_flat_gep4
@@ -304,8 +328,14 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gepm4
@@ -316,8 +346,14 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX10-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gepm4
@@ -328,8 +364,14 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX11-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gepm4
@@ -340,8 +382,14 @@ body: |
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX12-LABEL: name: amdgpu_atomic_cmpxchg_s32_flat_gepm4
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir
index 7545ddc8a436c..09eb77fcbdd9d 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgpu-atomic-cmpxchg-global.mir
@@ -146,8 +146,14 @@ body: |
; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX8-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gep4
@@ -158,8 +164,14 @@ body: |
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gep4
@@ -330,8 +342,14 @@ body: |
; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr4_vgpr5
; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[COPY1]], %subreg.sub0_sub1, [[COPY2]], %subreg.sub2_sub3
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_CMPSWAP_X2_RTN]]
;
; GFX8-LABEL: name: amdgpu_atomic_cmpxchg_s64_global_gep4
@@ -342,8 +360,14 @@ body: |
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr4_vgpr5
; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_128 = REG_SEQUENCE [[COPY1]], %subreg.sub0_sub1, [[COPY2]], %subreg.sub2_sub3
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_CMPSWAP_X2_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
; GFX8-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_CMPSWAP_X2_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s64_global_gep4
@@ -393,15 +417,21 @@ body: |
; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN:%[0-9]+]]:vreg_64 = BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN [[REG_SEQUENCE]], [[V_ADD_U]], [[REG_SEQUENCE2]], 0, 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
- ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN]].sub0
- ; GFX6-NEXT: $vgpr0 = COPY [[COPY3]]
+ ; GFX6-NEXT: [[REG_SEQUENCE3:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE2]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN:%[0-9]+]]:vreg_64 = BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN [[REG_SEQUENCE]], [[REG_SEQUENCE1]], [[REG_SEQUENCE3]], 0, 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX6-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN]].sub0
+ ; GFX6-NEXT: $vgpr0 = COPY [[COPY7]]
;
; GFX7-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gepm4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2, $vgpr3
@@ -411,15 +441,21 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN:%[0-9]+]]:vreg_64 = BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN [[REG_SEQUENCE]], [[V_ADD_U]], [[REG_SEQUENCE2]], 0, 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
- ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN]].sub0
- ; GFX7-NEXT: $vgpr0 = COPY [[COPY3]]
+ ; GFX7-NEXT: [[REG_SEQUENCE3:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE2]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN:%[0-9]+]]:vreg_64 = BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN [[REG_SEQUENCE]], [[REG_SEQUENCE1]], [[REG_SEQUENCE3]], 0, 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[BUFFER_ATOMIC_CMPSWAP_ADDR64_RTN]].sub0
+ ; GFX7-NEXT: $vgpr0 = COPY [[COPY7]]
;
; GFX7-FLAT-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gepm4
; GFX7-FLAT: liveins: $vgpr0_vgpr1, $vgpr2, $vgpr3
@@ -429,8 +465,14 @@ body: |
; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX8-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gepm4
@@ -441,8 +483,14 @@ body: |
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[V_ADD_U]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[REG_SEQUENCE1]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_gepm4
@@ -772,10 +820,16 @@ body: |
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX7-FLAT-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4095
- ; GFX7-FLAT-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
- ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[COPY3]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY6:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY3]], [[COPY4]], implicit-def $scc
+ ; GFX7-FLAT-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY5]], [[COPY6]], implicit-def dead $scc, implicit $scc
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[COPY7:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX7-FLAT-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[COPY7]], [[REG_SEQUENCE1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX8-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_sgpr_ptr_offset_4095
@@ -785,10 +839,16 @@ body: |
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr3
; GFX8-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4095
- ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
- ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[COPY3]], [[REG_SEQUENCE]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY6:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY3]], [[COPY4]], implicit-def $scc
+ ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY5]], [[COPY6]], implicit-def dead $scc, implicit $scc
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX8-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, [[COPY2]], %subreg.sub1
+ ; GFX8-NEXT: [[COPY7:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX8-NEXT: [[FLAT_ATOMIC_CMPSWAP_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_CMPSWAP_RTN [[COPY7]], [[REG_SEQUENCE1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_CMPSWAP_RTN]]
;
; GFX9-LABEL: name: amdgpu_atomic_cmpxchg_s32_global_sgpr_ptr_offset_4095
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir
index 305a05a5f1950..dc317a8413cd5 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-flat.mir
@@ -65,8 +65,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset2047
@@ -83,8 +89,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
@@ -110,8 +122,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset2047_nortn
; LARGE_IOFFSET: liveins: $vgpr0_vgpr1, $vgpr2
@@ -126,8 +144,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 2047
@@ -151,8 +175,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset2048
@@ -169,8 +199,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
@@ -196,8 +232,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset2048_nortn
; LARGE_IOFFSET: liveins: $vgpr0_vgpr1, $vgpr2
@@ -212,8 +254,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 2048
@@ -237,8 +285,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset4095
@@ -255,8 +309,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
@@ -282,8 +342,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s32_offset4095_nortn
; LARGE_IOFFSET: liveins: $vgpr0_vgpr1, $vgpr2
@@ -298,8 +364,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 4095
@@ -323,8 +395,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: flat_atomicrmw_add_s32_offset4097
@@ -333,8 +411,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX10-LABEL: name: flat_atomicrmw_add_s32_offset4097
@@ -343,8 +427,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX11-LABEL: name: flat_atomicrmw_add_s32_offset4097
@@ -353,8 +443,14 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX12-LABEL: name: flat_atomicrmw_add_s32_offset4097
@@ -388,8 +484,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; GFX9-LABEL: name: flat_atomicrmw_add_s32_offset4097_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -397,8 +499,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; GFX10-LABEL: name: flat_atomicrmw_add_s32_offset4097_nortn
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -406,8 +514,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; GFX11-LABEL: name: flat_atomicrmw_add_s32_offset4097_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -415,8 +529,14 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32))
;
; GFX12-LABEL: name: flat_atomicrmw_add_s32_offset4097_nortn
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -491,8 +611,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_ADD_X2_RTN]]
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s64_offset4095
@@ -509,8 +635,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
; GFX10-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_ADD_X2_RTN]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = COPY $vgpr2_vgpr3
@@ -536,8 +668,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD_X2 [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD_X2 [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
;
; LARGE_IOFFSET-LABEL: name: flat_atomicrmw_add_s64_offset4095_nortn
; LARGE_IOFFSET: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
@@ -552,8 +690,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_ATOMIC_ADD_X2 [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_ATOMIC_ADD_X2 [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64))
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = COPY $vgpr2_vgpr3
%2:vgpr(s64) = G_CONSTANT i64 4095
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir
index 44c1a43e1ba18..0d1508900a98a 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-add-global.mir
@@ -164,8 +164,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset2047
@@ -235,8 +241,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset2047_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -301,8 +313,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset2048
@@ -319,8 +337,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset2048
@@ -374,8 +398,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset2048_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -390,8 +420,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset2048_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -442,8 +478,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset4095
@@ -460,8 +502,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset4095
@@ -515,8 +563,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset4095_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -531,8 +585,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset4095_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -584,8 +644,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = FLAT_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_ATOMIC_ADD_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset4097
@@ -594,8 +660,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX10-LABEL: name: global_atomicrmw_add_s32_offset4097
@@ -604,8 +676,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset4097
@@ -614,8 +692,14 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_ATOMIC_ADD_RTN:%[0-9]+]]:vgpr_32 = GLOBAL_ATOMIC_ADD_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_ATOMIC_ADD_RTN]]
;
; GFX12-LABEL: name: global_atomicrmw_add_s32_offset4097
@@ -662,8 +746,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s32), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s32_offset4097_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -671,8 +761,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX10-LABEL: name: global_atomicrmw_add_s32_offset4097_nortn
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -680,8 +776,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX11-LABEL: name: global_atomicrmw_add_s32_offset4097_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -689,8 +791,14 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4097, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: GLOBAL_ATOMIC_ADD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: GLOBAL_ATOMIC_ADD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 1)
;
; GFX12-LABEL: name: global_atomicrmw_add_s32_offset4097_nortn
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -863,8 +971,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = FLAT_ATOMIC_ADD_X2_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_ATOMIC_ADD_X2_RTN]]
;
; GFX9-LABEL: name: global_atomicrmw_add_s64_offset4095
@@ -881,8 +995,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = GLOBAL_ATOMIC_ADD_X2_RTN [[V_ADD_U]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s64), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_ATOMIC_ADD_X2_RTN:%[0-9]+]]:vreg_64 = GLOBAL_ATOMIC_ADD_X2_RTN [[REG_SEQUENCE]], [[COPY1]], 0, 1, implicit $exec :: (load store seq_cst (s64), addrspace 1)
; GFX10-NEXT: $vgpr0_vgpr1 = COPY [[GLOBAL_ATOMIC_ADD_X2_RTN]]
;
; GFX11-LABEL: name: global_atomicrmw_add_s64_offset4095
@@ -936,8 +1056,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_ATOMIC_ADD_X2 [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_ATOMIC_ADD_X2 [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (load store seq_cst (s64), addrspace 1)
;
; GFX9-LABEL: name: global_atomicrmw_add_s64_offset4095_nortn
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
@@ -952,8 +1078,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: GLOBAL_ATOMIC_ADD_X2 [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s64), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: GLOBAL_ATOMIC_ADD_X2 [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (load store seq_cst (s64), addrspace 1)
;
; GFX11-LABEL: name: global_atomicrmw_add_s64_offset4095_nortn
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir
index 05f399a88e9e4..cebdffc74847c 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-flat.mir
@@ -429,8 +429,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX9-LABEL: name: load_atomic_flat_s32_seq_cst_gep_m2048
@@ -438,8 +444,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_atomic_flat_s32_seq_cst_gep_m2048
@@ -447,8 +459,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_atomic_flat_s32_seq_cst_gep_m2048
@@ -456,8 +474,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_atomic_flat_s32_seq_cst_gep_m2048
@@ -490,8 +514,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX9-LABEL: name: load_atomic_flat_s32_seq_cst_gep_4095
@@ -506,8 +536,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_atomic_flat_s32_seq_cst_gep_4095
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir
index 70fda045cef7e..eafc96dd32bdd 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-global.mir
@@ -451,13 +451,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_DWORD_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_DWORD_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_DWORD_ADDR64_]]
;
; GFX7-LABEL: name: load_atomic_global_s32_seq_cst_gep_m2048
@@ -465,13 +471,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_DWORD_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_DWORD_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_DWORD_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_DWORD_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_atomic_global_s32_seq_cst_gep_m2048
@@ -479,8 +491,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX9-LABEL: name: load_atomic_global_s32_seq_cst_gep_m2048
@@ -544,8 +562,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORD:%[0-9]+]]:vgpr_32 = FLAT_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s32), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_DWORD]]
;
; GFX9-LABEL: name: load_atomic_global_s32_seq_cst_gep_4095
@@ -560,8 +584,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load seq_cst (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 4095
@@ -587,13 +617,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_DWORDX2_ADDR64_:%[0-9]+]]:vreg_64 = BUFFER_LOAD_DWORDX2_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s64), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_DWORDX2_ADDR64_:%[0-9]+]]:vreg_64 = BUFFER_LOAD_DWORDX2_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s64), addrspace 1)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[BUFFER_LOAD_DWORDX2_ADDR64_]]
;
; GFX7-LABEL: name: load_atomic_global_s64_seq_cst_gep_m2048
@@ -601,13 +637,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_DWORDX2_ADDR64_:%[0-9]+]]:vreg_64 = BUFFER_LOAD_DWORDX2_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s64), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_DWORDX2_ADDR64_:%[0-9]+]]:vreg_64 = BUFFER_LOAD_DWORDX2_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load seq_cst (s64), addrspace 1)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[BUFFER_LOAD_DWORDX2_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_atomic_global_s64_seq_cst_gep_m2048
@@ -615,8 +657,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORDX2_:%[0-9]+]]:vreg_64 = FLAT_LOAD_DWORDX2 [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s64), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_DWORDX2_:%[0-9]+]]:vreg_64 = FLAT_LOAD_DWORDX2 [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load seq_cst (s64), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0_vgpr1 = COPY [[FLAT_LOAD_DWORDX2_]]
;
; GFX9-LABEL: name: load_atomic_global_s64_seq_cst_gep_m2048
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir
index 442bdb2a3b044..ada80da490fc5 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-constant.mir
@@ -1375,8 +1375,14 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX6-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
- ; GFX6-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
+ ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX6-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX6-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX6-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX7-LABEL: name: load_constant_s32_from_4_gep_negative_1
@@ -1384,8 +1390,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX7-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
- ; GFX7-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
+ ; GFX7-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX7-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX7-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX7-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX8-LABEL: name: load_constant_s32_from_4_gep_negative_1
@@ -1393,8 +1405,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
- ; GFX8-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
+ ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX8-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX8-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX10-LABEL: name: load_constant_s32_from_4_gep_negative_1
@@ -1402,8 +1420,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
- ; GFX10-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
+ ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX10-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX11-LABEL: name: load_constant_s32_from_4_gep_negative_1
@@ -1411,8 +1435,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 -1
- ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B64_]], implicit-def dead $scc
- ; GFX11-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B64_]].sub1
+ ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX11-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX11-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
%0:sgpr(p4) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 -1
@@ -1438,8 +1468,14 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX6-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX6-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX6-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX6-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX6-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX7-LABEL: name: load_constant_s32_from_4_gep_negative_524288
@@ -1447,8 +1483,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX7-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX7-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX7-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX7-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX7-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX7-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX7-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX8-LABEL: name: load_constant_s32_from_4_gep_negative_524288
@@ -1456,8 +1498,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX8-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX8-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX8-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX10-LABEL: name: load_constant_s32_from_4_gep_negative_524288
@@ -1465,8 +1513,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX10-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX10-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
;
; GFX11-LABEL: name: load_constant_s32_from_4_gep_negative_524288
@@ -1474,8 +1528,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -524288
- ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX11-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[S_ADD_U]], 0, 0 :: (load (s32), addrspace 4)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX11-NEXT: [[S_LOAD_DWORD_IMM:%[0-9]+]]:sreg_32_xm0_xexec = S_LOAD_DWORD_IMM [[REG_SEQUENCE]], 0, 0 :: (load (s32), addrspace 4)
; GFX11-NEXT: $sgpr0 = COPY [[S_LOAD_DWORD_IMM]]
%0:sgpr(p4) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 -524288
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir
index 9bd5700e1e81c..e1325a0a0bc50 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-flat.mir
@@ -1129,8 +1129,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_2047
@@ -1138,8 +1144,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_2047
@@ -1154,8 +1166,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_2047
@@ -1195,8 +1213,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_2048
@@ -1204,8 +1228,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_2048
@@ -1220,8 +1250,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_2048
@@ -1261,8 +1297,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1270,8 +1312,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1279,8 +1327,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1288,8 +1342,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1297,8 +1357,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m2047
@@ -1331,8 +1397,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1340,8 +1412,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1349,8 +1427,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1358,8 +1442,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1367,8 +1457,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m2048
@@ -1401,8 +1497,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_4095
@@ -1410,8 +1512,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_4095
@@ -1426,8 +1534,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_4095
@@ -1467,8 +1581,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1476,8 +1596,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1485,8 +1611,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1494,8 +1626,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1503,8 +1641,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_4096
@@ -1537,8 +1681,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1546,8 +1696,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1555,8 +1711,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1564,8 +1726,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1573,8 +1741,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m4095
@@ -1607,8 +1781,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1616,8 +1796,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1625,8 +1811,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1634,8 +1826,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1643,8 +1841,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m4096
@@ -1677,8 +1881,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1686,8 +1896,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1695,8 +1911,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1704,8 +1926,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1713,8 +1941,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_8191
@@ -1747,8 +1981,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -1756,8 +1996,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -1765,8 +2011,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -1774,8 +2026,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -1783,8 +2041,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_8192
@@ -1817,8 +2081,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -1826,8 +2096,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -1835,8 +2111,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -1844,8 +2126,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -1853,8 +2141,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m8191
@@ -1887,8 +2181,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -1896,8 +2196,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -1905,8 +2211,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -1914,8 +2226,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -1923,8 +2241,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_m8192
@@ -1957,8 +2281,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -1966,8 +2296,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -1975,8 +2311,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -1984,8 +2326,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -1993,8 +2341,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_24bit_max
@@ -2027,8 +2381,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2036,8 +2396,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2045,8 +2411,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2054,8 +2426,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2063,8 +2441,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_max
@@ -2072,8 +2456,14 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX12-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 16777214
@@ -2099,8 +2489,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2108,8 +2504,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2117,8 +2519,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2126,8 +2534,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2135,8 +2549,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_24bit_min
@@ -2169,8 +2589,14 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX7-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2178,8 +2604,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2187,8 +2619,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX9-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2196,8 +2634,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX10-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2205,8 +2649,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX11-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_flat_s32_from_1_gep_2x_24bit_min
@@ -2214,8 +2664,14 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8))
; GFX12-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir
index aabcab6c9e40e..864fa2ab438df 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-old-legalization.mir
@@ -1360,8 +1360,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2047
@@ -1369,8 +1375,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2047
@@ -1448,8 +1460,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1457,8 +1475,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1473,8 +1497,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1514,13 +1544,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1528,13 +1564,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1542,8 +1584,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1551,8 +1599,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1606,13 +1660,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1620,13 +1680,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1634,8 +1700,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1643,8 +1715,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1722,8 +1800,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1731,8 +1815,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1747,8 +1837,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1814,8 +1910,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1823,8 +1925,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1832,8 +1940,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1841,8 +1955,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1850,8 +1970,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1884,13 +2010,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1898,13 +2030,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1912,8 +2050,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1921,8 +2065,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1937,8 +2087,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1978,13 +2134,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -1992,13 +2154,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2006,8 +2174,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2015,8 +2189,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2031,8 +2211,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2098,8 +2284,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2107,8 +2299,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2116,8 +2314,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2125,8 +2329,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2134,8 +2344,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2194,8 +2410,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2203,8 +2425,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2212,8 +2440,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2221,8 +2455,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2230,8 +2470,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2264,13 +2510,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2278,13 +2530,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2292,8 +2550,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2301,8 +2565,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2310,8 +2580,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2319,8 +2595,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2328,8 +2610,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2362,13 +2650,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2376,13 +2670,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2390,8 +2690,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2399,8 +2705,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2408,8 +2720,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2417,8 +2735,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2426,8 +2750,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2486,8 +2816,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2495,8 +2831,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2504,8 +2846,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2513,8 +2861,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2522,8 +2876,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2582,8 +2942,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2591,8 +2957,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2600,8 +2972,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2609,8 +2987,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2618,8 +3002,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2627,8 +3017,14 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 16777214
@@ -2654,13 +3050,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2668,13 +3070,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2682,8 +3090,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2691,8 +3105,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2700,8 +3120,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2709,8 +3135,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2718,8 +3150,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2752,13 +3190,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2766,13 +3210,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2780,8 +3230,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2789,8 +3245,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2798,8 +3260,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2807,8 +3275,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2816,8 +3290,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2825,8 +3305,14 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir
index f4de883d67900..65f6b8879e16f 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir
@@ -178,8 +178,14 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX9-NEXT: %notzero:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %notzero, %subreg.sub1
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], [[REG_SEQUENCE]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
+ ; GFX9-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_merge_not_0_vgpr
@@ -190,8 +196,14 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX10-NEXT: %notzero:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %notzero, %subreg.sub1
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], [[REG_SEQUENCE]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
+ ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_merge_not_0_vgpr
@@ -202,8 +214,14 @@ body: |
; GFX11-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX11-NEXT: %notzero:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %notzero, %subreg.sub1
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], [[REG_SEQUENCE]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
+ ; GFX11-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_merge_not_0_vgpr
@@ -214,8 +232,14 @@ body: |
; GFX12-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX12-NEXT: %notzero:vgpr_32 = V_MOV_B32_e32 1, implicit $exec
; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %notzero, %subreg.sub1
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], [[REG_SEQUENCE]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
+ ; GFX12-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:vgpr(s32) = COPY $vgpr0
@@ -255,10 +279,22 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX10-NEXT: %zero:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX10-NEXT: %zext:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %zero, %subreg.sub1
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], %zext, implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY %zext.sub0
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
+ ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY %zext.sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U1:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[V_ADD_U]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
+ ; GFX10-NEXT: [[COPY8:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY9:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
+ ; GFX10-NEXT: [[COPY10:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY7]], [[COPY8]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_3:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY9]], [[COPY10]], killed [[V_ADD_CO_U32_e64_3]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_2]], %subreg.sub0, [[V_ADDC_U32_e64_2]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_zext_vgpr_offset4095
@@ -316,10 +352,22 @@ body: |
; GFX10-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY [[COPY]]
; GFX10-NEXT: %zero:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX10-NEXT: %zext:vreg_64 = REG_SEQUENCE [[COPY1]], %subreg.sub0, %zero, %subreg.sub1
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY2]], %zext, implicit-def dead $vcc_lo, implicit $exec
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY %zext.sub0
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY2]].sub1
+ ; GFX10-NEXT: [[COPY6:%[0-9]+]]:vgpr_32 = COPY %zext.sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY3]], [[COPY4]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY5]], [[COPY6]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U1:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[V_ADD_U]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[V_ADD_U1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY7:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub0
+ ; GFX10-NEXT: [[COPY8:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY9:%[0-9]+]]:vgpr_32 = COPY [[REG_SEQUENCE]].sub1
+ ; GFX10-NEXT: [[COPY10:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY7]], [[COPY8]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_3:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY9]], [[COPY10]], killed [[V_ADD_CO_U32_e64_3]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_2]], %subreg.sub0, [[V_ADDC_U32_e64_2]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[REG_SEQUENCE1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_zext_vgpr_offset_neg4096
@@ -464,9 +512,15 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4097
- ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64_xexec_xnull = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec_xnull = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
; GFX9-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD_SADDR:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR [[S_ADD_U]], [[V_MOV_B32_e32_]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD_SADDR:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR [[REG_SEQUENCE]], [[V_MOV_B32_e32_]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD_SADDR]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4097
@@ -474,9 +528,15 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4097
- ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4097
@@ -484,9 +544,15 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4097
- ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4097
@@ -578,9 +644,15 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -2049
- ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_neg2049
@@ -670,9 +742,15 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
- ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_4294967296
@@ -680,9 +758,15 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
- ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_4294967296
@@ -690,9 +774,15 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
- ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_4294967296
@@ -700,9 +790,15 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX12-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
- ; GFX12-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX12-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX12-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 4294967296
@@ -728,9 +824,15 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294971390
- ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64_xexec_xnull = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec_xnull = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
; GFX9-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD_SADDR:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR [[S_ADD_U]], [[V_MOV_B32_e32_]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD_SADDR:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD_SADDR [[REG_SEQUENCE]], [[V_MOV_B32_e32_]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD_SADDR]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_4294971390
@@ -738,9 +840,15 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294971390
- ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_4294971390
@@ -748,9 +856,15 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294971390
- ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_4294971390
@@ -758,9 +872,15 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX12-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294971390
- ; GFX12-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX12-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX12-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 4294971390
@@ -786,9 +906,15 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967295
- ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967295
@@ -796,9 +922,15 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967295
- ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967295
@@ -806,9 +938,15 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967295
- ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967295
@@ -816,9 +954,15 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX12-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967295
- ; GFX12-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX12-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX12-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 -4294967295
@@ -843,9 +987,15 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967296
- ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX9-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX10-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967296
@@ -853,9 +1003,15 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967296
- ; GFX10-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX10-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX10-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX10-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX11-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967296
@@ -863,9 +1019,15 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX11-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967296
- ; GFX11-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX11-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX11-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
;
; GFX12-LABEL: name: load_global_s32_from_sgpr_base_offset_neg4294967296
@@ -873,9 +1035,15 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX12-NEXT: [[S_MOV_B:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO -4294967296
- ; GFX12-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[S_MOV_B]], implicit-def dead $scc
- ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY [[S_ADD_U]]
- ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY1]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[S_MOV_B]].sub1
+ ; GFX12-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY1]], [[COPY2]], implicit-def $scc
+ ; GFX12-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY3]], [[COPY4]], implicit-def dead $scc, implicit $scc
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vreg_64 = COPY [[REG_SEQUENCE]]
+ ; GFX12-NEXT: [[GLOBAL_LOAD_DWORD:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_DWORD [[COPY5]], 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_DWORD]]
%0:sgpr(p1) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = G_CONSTANT i64 -4294967296
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir
index 6fef1da521ff9..d67c2a694c1d0 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global.mir
@@ -1326,8 +1326,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2047
@@ -1335,8 +1341,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2047
@@ -1414,8 +1426,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1423,8 +1441,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1439,8 +1463,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2048, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2048
@@ -1480,13 +1510,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1494,13 +1530,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1508,8 +1550,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1517,8 +1565,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2047, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m2047
@@ -1572,13 +1626,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1586,13 +1646,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1600,8 +1666,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1609,8 +1681,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -2048, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m2048
@@ -1688,8 +1766,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1697,8 +1781,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1713,8 +1803,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_4095
@@ -1780,8 +1876,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1789,8 +1891,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1798,8 +1906,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1807,8 +1921,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1816,8 +1936,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 4096, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_4096
@@ -1850,13 +1976,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1864,13 +1996,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1878,8 +2016,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1887,8 +2031,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1903,8 +2053,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4095, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m4095
@@ -1944,13 +2100,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -1958,13 +2120,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -1972,8 +2140,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -1981,8 +2155,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -1997,8 +2177,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -4096, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m4096
@@ -2064,8 +2250,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2073,8 +2265,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2082,8 +2280,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2091,8 +2295,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2100,8 +2310,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8191, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_8191
@@ -2160,8 +2376,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2169,8 +2391,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2178,8 +2406,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2187,8 +2421,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2196,8 +2436,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8192, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_8192
@@ -2230,13 +2476,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2244,13 +2496,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2258,8 +2516,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2267,8 +2531,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2276,8 +2546,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2285,8 +2561,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2294,8 +2576,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8191, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_m8191
@@ -2328,13 +2616,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2342,13 +2636,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2356,8 +2656,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2365,8 +2671,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2374,8 +2686,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2383,8 +2701,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2392,8 +2716,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8192, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_m8192
@@ -2452,8 +2782,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2461,8 +2797,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2470,8 +2812,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2479,8 +2827,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2488,8 +2842,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_24bit_max
@@ -2548,8 +2908,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2557,8 +2923,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2566,8 +2938,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2575,8 +2953,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2584,8 +2968,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_2x_24bit_max
@@ -2593,8 +2983,14 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 16777214
@@ -2620,13 +3016,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2634,13 +3036,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2648,8 +3056,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2657,8 +3071,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2666,8 +3086,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2675,8 +3101,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2684,8 +3116,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_24bit_min
@@ -2718,13 +3156,19 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2732,13 +3176,19 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: [[BUFFER_LOAD_UBYTE_ADDR64_:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_ADDR64 [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX7-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_ADDR64_]]
;
; GFX7-FLAT-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2746,8 +3196,14 @@ body: |
; GFX7-FLAT-NEXT: {{ $}}
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX7-FLAT-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX8-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2755,8 +3211,14 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
+ ; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: [[FLAT_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = FLAT_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec, implicit $flat_scr :: (load (s8), addrspace 1)
; GFX8-NEXT: $vgpr0 = COPY [[FLAT_LOAD_UBYTE]]
;
; GFX9-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2764,8 +3226,14 @@ body: |
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX9-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX10-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2773,8 +3241,14 @@ body: |
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX10-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX11-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2782,8 +3256,14 @@ body: |
; GFX11-NEXT: {{ $}}
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX11-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
;
; GFX12-LABEL: name: load_global_s32_from_1_gep_2x_24bit_min
@@ -2791,8 +3271,14 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[V_ADD_U]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
+ ; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[COPY2]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY3]], [[COPY4]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: [[GLOBAL_LOAD_UBYTE:%[0-9]+]]:vgpr_32 = GLOBAL_LOAD_UBYTE [[REG_SEQUENCE]], 0, 0, implicit $exec :: (load (s8), addrspace 1)
; GFX12-NEXT: $vgpr0 = COPY [[GLOBAL_LOAD_UBYTE]]
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir
index 99fa052415a12..6f971788727b2 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-smrd.mir
@@ -46,13 +46,25 @@ regBankSelected: true
# Max immediate for CI
# SIVI: [[K:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 17179869180
-# SIVI-DAG: [[ADD_PTR:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[PTR]], [[K]]
+# SIVI-DAG: [[K_SUB0:%[0-9]+]]:sreg_32 = COPY [[K]].sub0
+# SIVI-DAG: [[PTR_LO:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub0
+# SIVI-DAG: [[ADD_PTR_LO:%[0-9]+]]:sreg_32 = S_ADD_U32 [[PTR_LO]], [[K_SUB0]]
+# SIVI-DAG: [[K_SUB1:%[0-9]+]]:sreg_32 = COPY [[K]].sub1
+# SIVI-DAG: [[PTR_HI:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub1
+# SIVI: [[ADD_PTR_HI:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[PTR_HI]], [[K_SUB1]]
+# SIVI: [[ADD_PTR:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[ADD_PTR_LO]], %subreg.sub0, [[ADD_PTR_HI]], %subreg.sub1
# SIVI: S_LOAD_DWORD_IMM [[ADD_PTR]], 0, 0
# CI: S_LOAD_DWORD_IMM_ci [[PTR]], 4294967295, 0
# Immediate overflow for CI
# GCN: [[K:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 17179869184
-# GCN-DAG: [[ADD_PTR:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[PTR]], [[K]]
+# GCN-DAG: [[K_SUB0:%[0-9]+]]:sreg_32 = COPY [[K]].sub0
+# GCN-DAG: [[PTR_LO:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub0
+# GCN-DAG: [[ADD_PTR_LO:%[0-9]+]]:sreg_32 = S_ADD_U32 [[PTR_LO]], [[K_SUB0]]
+# GCN-DAG: [[K_SUB1:%[0-9]+]]:sreg_32 = COPY [[K]].sub1
+# GCN-DAG: [[PTR_HI:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub1
+# GCN: [[ADD_PTR_HI:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[PTR_HI]], [[K_SUB1]]
+# GCN: [[ADD_PTR:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[ADD_PTR_LO]], %subreg.sub0, [[ADD_PTR_HI]], %subreg.sub1
# GCN: S_LOAD_DWORD_IMM [[ADD_PTR]], 0, 0
# Max 32-bit byte offset
@@ -62,7 +74,13 @@ regBankSelected: true
# Overflow 32-bit byte offset
# SIVI: [[K:%[0-9]+]]:sreg_64 = S_MOV_B64_IMM_PSEUDO 4294967296
-# SIVI-DAG: [[ADD_PTR:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[PTR]], [[K]]
+# SIVI-DAG: [[K_SUB0:%[0-9]+]]:sreg_32 = COPY [[K]].sub0
+# SIVI-DAG: [[PTR_LO:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub0
+# SIVI-DAG: [[ADD_PTR_LO:%[0-9]+]]:sreg_32 = S_ADD_U32 [[PTR_LO]], [[K_SUB0]]
+# SIVI-DAG: [[K_SUB1:%[0-9]+]]:sreg_32 = COPY [[K]].sub1
+# SIVI-DAG: [[PTR_HI:%[0-9]+]]:sreg_32 = COPY [[PTR]].sub1
+# SIVI: [[ADD_PTR_HI:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[PTR_HI]], [[K_SUB1]]
+# SIVI: [[ADD_PTR:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[ADD_PTR_LO]], %subreg.sub0, [[ADD_PTR_HI]], %subreg.sub1
# SIVI: S_LOAD_DWORD_IMM [[ADD_PTR]], 0, 0
# CI: S_LOAD_DWORD_IMM_ci [[PTR]], 1073741824, 0
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
index 12703b7b35f68..d1cf3bfc29ff0 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
@@ -20,40 +20,70 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX6-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX6-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX8-LABEL: name: gep_p0_sgpr_sgpr
; GFX8: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX9-LABEL: name: gep_p0_sgpr_sgpr
; GFX9: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE64-LABEL: name: gep_p0_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX10-WAVE64-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX10-WAVE64-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE32-LABEL: name: gep_p0_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX10-WAVE32-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX10-WAVE32-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
%0:sgpr(p0) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = COPY $sgpr2_sgpr3
%2:sgpr(p0) = G_PTR_ADD %0, %1
@@ -74,40 +104,70 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX8-LABEL: name: gep_p0_vgpr_vgpr
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX9-LABEL: name: gep_p0_vgpr_vgpr
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE64-LABEL: name: gep_p0_vgpr_vgpr
; GFX10-WAVE64: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX10-WAVE64-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE64-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-WAVE64-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE32-LABEL: name: gep_p0_vgpr_vgpr
; GFX10-WAVE32: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX10-WAVE32-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE32-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-WAVE32-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
%0:vgpr(p0) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = COPY $vgpr2_vgpr3
%2:vgpr(p0) = G_PTR_ADD %0, %1
@@ -128,40 +188,70 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX8-LABEL: name: gep_p0_sgpr_vgpr
; GFX8: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX9-LABEL: name: gep_p0_sgpr_vgpr
; GFX9: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE64-LABEL: name: gep_p0_sgpr_vgpr
; GFX10-WAVE64: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX10-WAVE64-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE64-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-WAVE64-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE32-LABEL: name: gep_p0_sgpr_vgpr
; GFX10-WAVE32: liveins: $sgpr0_sgpr1, $vgpr0_vgpr1
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
- ; GFX10-WAVE32-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE32-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-WAVE32-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
%0:sgpr(p0) = COPY $sgpr0_sgpr1
%1:vgpr(s64) = COPY $vgpr0_vgpr1
%2:vgpr(p0) = G_PTR_ADD %0, %1
@@ -452,40 +542,70 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX6-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX6-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX8-LABEL: name: gep_p999_sgpr_sgpr
; GFX8: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX8-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX8-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX9-LABEL: name: gep_p999_sgpr_sgpr
; GFX9: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX9-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX9-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE64-LABEL: name: gep_p999_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX10-WAVE64-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX10-WAVE64-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE32-LABEL: name: gep_p999_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0_sgpr1, $sgpr2_sgpr3
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr0_sgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_64 = COPY $sgpr2_sgpr3
- ; GFX10-WAVE32-NEXT: [[S_ADD_U:%[0-9]+]]:sreg_64 = S_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U]]
+ ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:sreg_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:sreg_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY2]], [[COPY3]], implicit-def $scc
+ ; GFX10-WAVE32-NEXT: [[S_ADDC_U32_:%[0-9]+]]:sreg_32 = S_ADDC_U32 [[COPY4]], [[COPY5]], implicit-def dead $scc, implicit $scc
+ ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64_xexec = REG_SEQUENCE [[S_ADD_U32_]], %subreg.sub0, [[S_ADDC_U32_]], %subreg.sub1
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
%0:sgpr(p999) = COPY $sgpr0_sgpr1
%1:sgpr(s64) = COPY $sgpr2_sgpr3
%2:sgpr(p999) = G_PTR_ADD %0, %1
@@ -506,40 +626,70 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX8-LABEL: name: gep_p999_vgpr_vgpr
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX9-LABEL: name: gep_p999_vgpr_vgpr
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE64-LABEL: name: gep_p999_vgpr_vgpr
; GFX10-WAVE64: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX10-WAVE64-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc, implicit $exec
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX10-WAVE64-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE64-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE64-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE64-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-WAVE64-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-WAVE64-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
;
; GFX10-WAVE32-LABEL: name: gep_p999_vgpr_vgpr
; GFX10-WAVE32: liveins: $vgpr0_vgpr1, $vgpr2_vgpr3
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:vreg_64 = COPY $vgpr2_vgpr3
- ; GFX10-WAVE32-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[COPY1]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[V_ADD_U]]
+ ; GFX10-WAVE32-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub0
+ ; GFX10-WAVE32-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-WAVE32-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[COPY1]].sub1
+ ; GFX10-WAVE32-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-WAVE32-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-WAVE32-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[REG_SEQUENCE]]
%0:vgpr(p999) = COPY $vgpr0_vgpr1
%1:vgpr(s64) = COPY $vgpr2_vgpr3
%2:vgpr(p999) = G_PTR_ADD %0, %1
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir
index 92b5d6da6597c..6e92d851dee2e 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-flat.mir
@@ -1175,8 +1175,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_gep_2047
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1184,8 +1190,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_gep_2047
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1200,8 +1212,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_gep_2047
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1241,8 +1259,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1250,8 +1274,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1259,8 +1289,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX10-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1268,8 +1304,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1277,8 +1319,14 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX12-LABEL: name: store_flat_s32_to_1_gep_24bit_max
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1311,8 +1359,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1320,8 +1374,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1329,8 +1389,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX10-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1338,8 +1404,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1347,8 +1419,14 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX12-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_max
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1356,8 +1434,14 @@ body: |
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 16777214
@@ -1383,8 +1467,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1392,8 +1482,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1401,8 +1497,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX10-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1410,8 +1512,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1419,8 +1527,14 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX12-LABEL: name: store_flat_s32_to_1_gep_24bit_min
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1453,8 +1567,14 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX8-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1462,8 +1582,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX9-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1471,8 +1597,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX10-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1480,8 +1612,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX11-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX11: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1489,8 +1627,14 @@ body: |
; GFX11-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX11-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX11-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX11-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX11-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX11-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX11-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX11-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX11-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX11-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX11-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX11-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX11-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
;
; GFX12-LABEL: name: store_flat_s32_to_1_gep_2x_24bit_min
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1498,8 +1642,14 @@ body: |
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32))
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir
index d704dee43c3e1..9136f44dfc227 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-global.mir
@@ -1289,8 +1289,14 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_gep_2047
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1298,8 +1304,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 2047, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_gep_2047
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1372,8 +1384,14 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_to_1_gep_24bit_max
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1381,8 +1399,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_to_1_gep_24bit_max
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1390,8 +1414,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX10-LABEL: name: store_global_s32_to_1_gep_24bit_max
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1399,8 +1429,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 8388607, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX12-LABEL: name: store_global_s32_to_1_gep_24bit_max
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1432,13 +1468,19 @@ body: |
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX7-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1446,13 +1488,19 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX7-FLAT-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX7-FLAT: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1460,8 +1508,14 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1469,8 +1523,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1478,8 +1538,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX10-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1487,8 +1553,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -8388608, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX12-LABEL: name: store_global_s32_to_1_gep_24bit_min
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1547,8 +1619,14 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_to_1_gep_2x_24bit_max
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1556,8 +1634,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_to_1_gep_2x_24bit_max
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1565,8 +1649,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX10-LABEL: name: store_global_s32_to_1_gep_2x_24bit_max
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1574,8 +1664,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX12-LABEL: name: store_global_s32_to_1_gep_2x_24bit_max
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1583,8 +1679,14 @@ body: |
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO 16777214, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 16777214
@@ -1610,13 +1712,19 @@ body: |
; GFX6-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX6-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX6-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX6-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX6-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX6-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX6-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX6-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX6-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX6-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX6-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX6-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX7-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1624,13 +1732,19 @@ body: |
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
+ ; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 0
; GFX7-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 61440
- ; GFX7-NEXT: [[REG_SEQUENCE:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
+ ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sreg_64 = REG_SEQUENCE [[S_MOV_B32_]], %subreg.sub0, [[S_MOV_B32_1]], %subreg.sub1
; GFX7-NEXT: [[S_MOV_B64_:%[0-9]+]]:sreg_64 = S_MOV_B64 0
- ; GFX7-NEXT: [[REG_SEQUENCE1:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE]], %subreg.sub2_sub3
- ; GFX7-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[V_ADD_U]], [[REG_SEQUENCE1]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX7-NEXT: [[REG_SEQUENCE2:%[0-9]+]]:sgpr_128 = REG_SEQUENCE [[S_MOV_B64_]], %subreg.sub0_sub1, [[REG_SEQUENCE1]], %subreg.sub2_sub3
+ ; GFX7-NEXT: BUFFER_STORE_DWORD_ADDR64 [[COPY1]], [[REG_SEQUENCE]], [[REG_SEQUENCE2]], 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX7-FLAT-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX7-FLAT: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1638,8 +1752,14 @@ body: |
; GFX7-FLAT-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-FLAT-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-FLAT-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX7-FLAT-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX7-FLAT-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX7-FLAT-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX7-FLAT-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX7-FLAT-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX7-FLAT-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX7-FLAT-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX8-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX8: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1647,8 +1767,14 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX8-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX8-NEXT: FLAT_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
+ ; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX8-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX8-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX8-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX8-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX8-NEXT: FLAT_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32), addrspace 1)
;
; GFX9-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1656,8 +1782,14 @@ body: |
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX9-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc, implicit $exec
- ; GFX9-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX9-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX9-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX9-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX9-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX9-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX9-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX10-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1665,8 +1797,14 @@ body: |
; GFX10-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX10-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX10-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX10-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX10-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX10-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX10-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX10-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX10-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX10-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX10-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX10-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX10-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
;
; GFX12-LABEL: name: store_global_s32_to_1_gep_2x_24bit_min
; GFX12: liveins: $vgpr0_vgpr1, $vgpr2
@@ -1674,8 +1812,14 @@ body: |
; GFX12-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX12-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX12-NEXT: [[V_MOV_B:%[0-9]+]]:vreg_64 = V_MOV_B64_PSEUDO -16777215, implicit $exec
- ; GFX12-NEXT: [[V_ADD_U:%[0-9]+]]:vreg_64 = V_ADD_U64_PSEUDO [[COPY]], [[V_MOV_B]], implicit-def dead $vcc_lo, implicit $exec
- ; GFX12-NEXT: GLOBAL_STORE_DWORD [[V_ADD_U]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
+ ; GFX12-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
+ ; GFX12-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub0
+ ; GFX12-NEXT: [[COPY4:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
+ ; GFX12-NEXT: [[COPY5:%[0-9]+]]:vgpr_32 = COPY [[V_MOV_B]].sub1
+ ; GFX12-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[COPY3]], 0, implicit $exec
+ ; GFX12-NEXT: [[V_ADDC_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADDC_U32_e64_1:%[0-9]+]]:sreg_32_xm0_xexec = V_ADDC_U32_e64 [[COPY4]], [[COPY5]], killed [[V_ADD_CO_U32_e64_1]], 0, implicit $exec
+ ; GFX12-NEXT: [[REG_SEQUENCE:%[0-9]+]]:vreg_64 = REG_SEQUENCE [[V_ADD_CO_U32_e64_]], %subreg.sub0, [[V_ADDC_U32_e64_]], %subreg.sub1
+ ; GFX12-NEXT: GLOBAL_STORE_DWORD [[REG_SEQUENCE]], [[COPY1]], 0, 0, implicit $exec :: (store (s32), addrspace 1)
%0:vgpr(p1) = COPY $vgpr0_vgpr1
%1:vgpr(s32) = COPY $vgpr2
%2:vgpr(s64) = G_CONSTANT i64 -16777215
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
index 8167f071be05d..07d5ff2036d93 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/mubuf-global.ll
@@ -96,12 +96,10 @@ define amdgpu_ps void @mubuf_store_sgpr_ptr_offset4294967296(ptr addrspace(1) in
;
; GFX12-LABEL: mubuf_store_sgpr_ptr_offset4294967296:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, 0
-; GFX12-NEXT: s_mov_b32 s1, 4
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: v_mov_b32_e32 v2, 0
-; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
+; GFX12-NEXT: s_add_co_u32 s0, s2, 0
+; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: v_dual_mov_b32 v2, 0 :: v_dual_mov_b32 v1, s1
; GFX12-NEXT: global_store_b32 v[0:1], v2, off
; GFX12-NEXT: s_endpgm
%gep = getelementptr i32, ptr addrspace(1) %ptr, i64 4294967296
@@ -136,12 +134,10 @@ define amdgpu_ps void @mubuf_store_sgpr_ptr_offset4294967297(ptr addrspace(1) in
;
; GFX12-LABEL: mubuf_store_sgpr_ptr_offset4294967297:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, 4
-; GFX12-NEXT: s_mov_b32 s1, 4
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: v_mov_b32_e32 v2, 0
-; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
+; GFX12-NEXT: s_add_co_u32 s0, s2, 4
+; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
+; GFX12-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-NEXT: v_dual_mov_b32 v2, 0 :: v_dual_mov_b32 v1, s1
; GFX12-NEXT: global_store_b32 v[0:1], v2, off
; GFX12-NEXT: s_endpgm
%gep = getelementptr i32, ptr addrspace(1) %ptr, i64 4294967297
@@ -347,7 +343,8 @@ define amdgpu_ps void @mubuf_store_sgpr_ptr_sgpr_offset(ptr addrspace(1) inreg %
; GFX12-NEXT: v_mov_b32_e32 v0, 0
; GFX12-NEXT: s_lshl_b64 s[0:1], s[4:5], 2
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_add_co_u32 s0, s2, s0
+; GFX12-NEXT: s_add_co_ci_u32 s1, s3, s1
; GFX12-NEXT: global_store_b32 v0, v0, s[0:1]
; GFX12-NEXT: s_endpgm
%gep = getelementptr i32, ptr addrspace(1) %ptr, i32 %soffset
@@ -694,10 +691,9 @@ define amdgpu_ps float @mubuf_load_sgpr_ptr_offset4294967296(ptr addrspace(1) in
;
; GFX12-LABEL: mubuf_load_sgpr_ptr_offset4294967296:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, 0
-; GFX12-NEXT: s_mov_b32 s1, 4
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_add_co_u32 s0, s2, 0
+; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-NEXT: global_load_b32 v0, v[0:1], off scope:SCOPE_SYS
; GFX12-NEXT: s_wait_loadcnt 0x0
@@ -734,10 +730,9 @@ define amdgpu_ps float @mubuf_load_sgpr_ptr_offset4294967297(ptr addrspace(1) in
;
; GFX12-LABEL: mubuf_load_sgpr_ptr_offset4294967297:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, 4
-; GFX12-NEXT: s_mov_b32 s1, 4
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_add_co_u32 s0, s2, 4
+; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
+; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-NEXT: global_load_b32 v0, v[0:1], off scope:SCOPE_SYS
; GFX12-NEXT: s_wait_loadcnt 0x0
@@ -946,7 +941,8 @@ define amdgpu_ps float @mubuf_load_sgpr_ptr_sgpr_offset(ptr addrspace(1) inreg %
; GFX12-NEXT: v_mov_b32_e32 v0, 0
; GFX12-NEXT: s_lshl_b64 s[0:1], s[4:5], 2
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_add_co_u32 s0, s2, s0
+; GFX12-NEXT: s_add_co_ci_u32 s1, s3, s1
; GFX12-NEXT: global_load_b32 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-NEXT: s_wait_loadcnt 0x0
; GFX12-NEXT: ; return to shader part epilog
@@ -1276,12 +1272,11 @@ define amdgpu_ps float @mubuf_atomicrmw_sgpr_ptr_offset4294967296(ptr addrspace(
;
; GFX12-LABEL: mubuf_atomicrmw_sgpr_ptr_offset4294967296:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, 0
-; GFX12-NEXT: s_mov_b32 s1, 4
-; GFX12-NEXT: v_mov_b32_e32 v2, 2
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_add_co_u32 s0, s2, 0
+; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
+; GFX12-NEXT: v_mov_b32_e32 v2, 2
; GFX12-NEXT: global_atomic_add_u32 v0, v[0:1], v2, off th:TH_ATOMIC_RETURN scope:SCOPE_DEV
; GFX12-NEXT: s_wait_loadcnt 0x0
; GFX12-NEXT: global_inv scope:SCOPE_DEV
@@ -1513,10 +1508,8 @@ define amdgpu_ps float @mubuf_cmpxchg_sgpr_ptr_offset4294967296(ptr addrspace(1)
;
; GFX12-LABEL: mubuf_cmpxchg_sgpr_ptr_offset4294967296:
; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, 0
-; GFX12-NEXT: s_mov_b32 s1, 4
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-NEXT: s_add_co_u32 s0, s2, 0
+; GFX12-NEXT: s_add_co_ci_u32 s1, s3, 4
; GFX12-NEXT: v_mov_b32_e32 v2, v0
; GFX12-NEXT: v_dual_mov_b32 v4, s1 :: v_dual_mov_b32 v3, s0
; GFX12-NEXT: global_atomic_cmpswap_b32 v0, v[3:4], v[1:2], off th:TH_ATOMIC_RETURN scope:SCOPE_DEV
diff --git a/llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll b/llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
index 11bd0536307d7..737985c27c5d3 100644
--- a/llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
+++ b/llvm/test/CodeGen/AMDGPU/gfx12_scalar_subword_loads.ll
@@ -17,17 +17,27 @@ define amdgpu_ps void @test_s_load_i8(ptr addrspace(4) inreg %in, ptr addrspace(
}
define amdgpu_ps void @test_s_load_i8_imm(ptr addrspace(4) inreg %in, ptr addrspace(1) %out) {
-; GCN-LABEL: test_s_load_i8_imm:
-; GCN: ; %bb.0:
-; GCN-NEXT: s_movk_i32 s2, 0xff9c
-; GCN-NEXT: s_mov_b32 s3, -1
-; GCN-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GCN-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; GCN-NEXT: s_load_i8 s0, s[0:1], 0x0
-; GCN-NEXT: s_wait_kmcnt 0x0
-; GCN-NEXT: v_mov_b32_e32 v2, s0
-; GCN-NEXT: global_store_b32 v[0:1], v2, off
-; GCN-NEXT: s_endpgm
+; DAG-LABEL: test_s_load_i8_imm:
+; DAG: ; %bb.0:
+; DAG-NEXT: s_movk_i32 s2, 0xff9c
+; DAG-NEXT: s_mov_b32 s3, -1
+; DAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; DAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; DAG-NEXT: s_load_i8 s0, s[0:1], 0x0
+; DAG-NEXT: s_wait_kmcnt 0x0
+; DAG-NEXT: v_mov_b32_e32 v2, s0
+; DAG-NEXT: global_store_b32 v[0:1], v2, off
+; DAG-NEXT: s_endpgm
+;
+; GISEL-LABEL: test_s_load_i8_imm:
+; GISEL: ; %bb.0:
+; GISEL-NEXT: s_add_co_u32 s0, s0, 0xffffff9c
+; GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
+; GISEL-NEXT: s_load_i8 s0, s[0:1], 0x0
+; GISEL-NEXT: s_wait_kmcnt 0x0
+; GISEL-NEXT: v_mov_b32_e32 v2, s0
+; GISEL-NEXT: global_store_b32 v[0:1], v2, off
+; GISEL-NEXT: s_endpgm
%gep = getelementptr i8, ptr addrspace(4) %in, i64 -100
%ld = load i8, ptr addrspace(4) %gep
%sext = sext i8 %ld to i32
@@ -177,17 +187,27 @@ define amdgpu_ps void @test_s_load_i16(ptr addrspace(4) inreg %in, ptr addrspace
}
define amdgpu_ps void @test_s_load_i16_imm(ptr addrspace(4) inreg %in, ptr addrspace(1) %out) {
-; GCN-LABEL: test_s_load_i16_imm:
-; GCN: ; %bb.0:
-; GCN-NEXT: s_movk_i32 s2, 0xff38
-; GCN-NEXT: s_mov_b32 s3, -1
-; GCN-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GCN-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; GCN-NEXT: s_load_i16 s0, s[0:1], 0x0
-; GCN-NEXT: s_wait_kmcnt 0x0
-; GCN-NEXT: v_mov_b32_e32 v2, s0
-; GCN-NEXT: global_store_b32 v[0:1], v2, off
-; GCN-NEXT: s_endpgm
+; DAG-LABEL: test_s_load_i16_imm:
+; DAG: ; %bb.0:
+; DAG-NEXT: s_movk_i32 s2, 0xff38
+; DAG-NEXT: s_mov_b32 s3, -1
+; DAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; DAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; DAG-NEXT: s_load_i16 s0, s[0:1], 0x0
+; DAG-NEXT: s_wait_kmcnt 0x0
+; DAG-NEXT: v_mov_b32_e32 v2, s0
+; DAG-NEXT: global_store_b32 v[0:1], v2, off
+; DAG-NEXT: s_endpgm
+;
+; GISEL-LABEL: test_s_load_i16_imm:
+; GISEL: ; %bb.0:
+; GISEL-NEXT: s_add_co_u32 s0, s0, 0xffffff38
+; GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
+; GISEL-NEXT: s_load_i16 s0, s[0:1], 0x0
+; GISEL-NEXT: s_wait_kmcnt 0x0
+; GISEL-NEXT: v_mov_b32_e32 v2, s0
+; GISEL-NEXT: global_store_b32 v[0:1], v2, off
+; GISEL-NEXT: s_endpgm
%gep = getelementptr i16, ptr addrspace(4) %in, i64 -100
%ld = load i16, ptr addrspace(4) %gep
%sext = sext i16 %ld to i32
@@ -212,17 +232,30 @@ define amdgpu_ps void @test_s_load_i16_sgpr(ptr addrspace(4) inreg %in, i32 inre
}
define amdgpu_ps void @test_s_load_i16_sgpr_imm(ptr addrspace(4) inreg %in, i32 inreg %offset, ptr addrspace(1) %out) {
-; GCN-LABEL: test_s_load_i16_sgpr_imm:
-; GCN: ; %bb.0:
-; GCN-NEXT: s_mov_b32 s3, 0
-; GCN-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; GCN-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
-; GCN-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; GCN-NEXT: s_load_i16 s0, s[0:1], 0x20
-; GCN-NEXT: s_wait_kmcnt 0x0
-; GCN-NEXT: v_mov_b32_e32 v2, s0
-; GCN-NEXT: global_store_b32 v[0:1], v2, off
-; GCN-NEXT: s_endpgm
+; DAG-LABEL: test_s_load_i16_sgpr_imm:
+; DAG: ; %bb.0:
+; DAG-NEXT: s_mov_b32 s3, 0
+; DAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; DAG-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
+; DAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; DAG-NEXT: s_load_i16 s0, s[0:1], 0x20
+; DAG-NEXT: s_wait_kmcnt 0x0
+; DAG-NEXT: v_mov_b32_e32 v2, s0
+; DAG-NEXT: global_store_b32 v[0:1], v2, off
+; DAG-NEXT: s_endpgm
+;
+; GISEL-LABEL: test_s_load_i16_sgpr_imm:
+; GISEL: ; %bb.0:
+; GISEL-NEXT: s_mov_b32 s3, 0
+; GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GISEL-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
+; GISEL-NEXT: s_add_co_u32 s0, s0, s2
+; GISEL-NEXT: s_add_co_ci_u32 s1, s1, s3
+; GISEL-NEXT: s_load_i16 s0, s[0:1], 0x20
+; GISEL-NEXT: s_wait_kmcnt 0x0
+; GISEL-NEXT: v_mov_b32_e32 v2, s0
+; GISEL-NEXT: global_store_b32 v[0:1], v2, off
+; GISEL-NEXT: s_endpgm
%gep1 = getelementptr i16, ptr addrspace(4) %in, i64 16
%zext = zext i32 %offset to i64
%gep2 = getelementptr i16, ptr addrspace(4) %gep1, i64 %zext
@@ -315,17 +348,30 @@ define amdgpu_ps void @test_s_load_u16_sgpr(ptr addrspace(4) inreg %in, i32 inre
}
define amdgpu_ps void @test_s_load_u16_sgpr_imm(ptr addrspace(4) inreg %in, i32 inreg %offset, ptr addrspace(1) %out) {
-; GCN-LABEL: test_s_load_u16_sgpr_imm:
-; GCN: ; %bb.0:
-; GCN-NEXT: s_mov_b32 s3, 0
-; GCN-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; GCN-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
-; GCN-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; GCN-NEXT: s_load_u16 s0, s[0:1], 0x20
-; GCN-NEXT: s_wait_kmcnt 0x0
-; GCN-NEXT: v_mov_b32_e32 v2, s0
-; GCN-NEXT: global_store_b32 v[0:1], v2, off
-; GCN-NEXT: s_endpgm
+; DAG-LABEL: test_s_load_u16_sgpr_imm:
+; DAG: ; %bb.0:
+; DAG-NEXT: s_mov_b32 s3, 0
+; DAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; DAG-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
+; DAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; DAG-NEXT: s_load_u16 s0, s[0:1], 0x20
+; DAG-NEXT: s_wait_kmcnt 0x0
+; DAG-NEXT: v_mov_b32_e32 v2, s0
+; DAG-NEXT: global_store_b32 v[0:1], v2, off
+; DAG-NEXT: s_endpgm
+;
+; GISEL-LABEL: test_s_load_u16_sgpr_imm:
+; GISEL: ; %bb.0:
+; GISEL-NEXT: s_mov_b32 s3, 0
+; GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
+; GISEL-NEXT: s_lshl_b64 s[2:3], s[2:3], 1
+; GISEL-NEXT: s_add_co_u32 s0, s0, s2
+; GISEL-NEXT: s_add_co_ci_u32 s1, s1, s3
+; GISEL-NEXT: s_load_u16 s0, s[0:1], 0x20
+; GISEL-NEXT: s_wait_kmcnt 0x0
+; GISEL-NEXT: v_mov_b32_e32 v2, s0
+; GISEL-NEXT: global_store_b32 v[0:1], v2, off
+; GISEL-NEXT: s_endpgm
%gep1 = getelementptr i16, ptr addrspace(4) %in, i64 16
%zext1 = zext i32 %offset to i64
%gep2 = getelementptr i16, ptr addrspace(4) %gep1, i64 %zext1
diff --git a/llvm/test/CodeGen/AMDGPU/global-saddr-load.ll b/llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
index fbbb242a467b4..d588f0e0897b7 100644
--- a/llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
+++ b/llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
@@ -160,16 +160,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg4096(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_neg4096:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_movk_i32 s0, 0xf000
-; GFX12-NEXT: s_mov_b32 s1, -1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg4096:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xf000
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg4096:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfffff000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4096
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -205,16 +214,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg4097(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_neg4097:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_movk_i32 s0, 0xefff
-; GFX12-NEXT: s_mov_b32 s1, -1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg4097:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xefff
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg4097:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xffffefff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4097
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -250,16 +268,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg4098(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_neg4098:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_movk_i32 s0, 0xeffe
-; GFX12-NEXT: s_mov_b32 s1, -1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg4098:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xeffe
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg4098:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xffffeffe
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4098
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -391,16 +418,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg2048(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_neg2048:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_movk_i32 s0, 0xf800
-; GFX12-NEXT: s_mov_b32 s1, -1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg2048:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xf800
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg2048:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfffff800
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -2048
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -432,16 +468,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg2049(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_neg2049:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_movk_i32 s0, 0xf7ff
-; GFX12-NEXT: s_mov_b32 s1, -1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg2049:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xf7ff
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg2049:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfffff7ff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -2049
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -473,16 +518,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg2050(ptr addrspace(1) inr
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_neg2050:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_movk_i32 s0, 0xf7fe
-; GFX12-NEXT: s_mov_b32 s1, -1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg2050:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xf7fe
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg2050:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfffff7fe
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -2050
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -552,16 +606,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0xFFFFFF(ptr addrspace(1) in
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_0xFFFFFF:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, 0xff800000
-; GFX12-NEXT: s_mov_b32 s1, -1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_0xFFFFFF:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_mov_b32 s0, 0xff800000
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0xFFFFFF:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xff800000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -8388608
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -640,10 +703,8 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0x100000000(ptr addrspace(1)
;
; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0x100000000:
; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_mov_b32 s0, 0
-; GFX12-GISEL-NEXT: s_mov_b32 s1, 1
-; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 1
; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
@@ -695,10 +756,8 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0x100000001(ptr addrspace(1)
;
; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0x100000001:
; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_mov_b32 s0, 1
-; GFX12-GISEL-NEXT: s_mov_b32 s1, 1
-; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 1
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 1
; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
@@ -737,16 +796,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0x100000FFF(ptr addrspace(1)
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_0x100000FFF:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_movk_i32 s0, 0xfff
-; GFX12-NEXT: s_mov_b32 s1, 1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_0x100000FFF:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_movk_i32 s0, 0xfff
+; GFX12-SDAG-NEXT: s_mov_b32 s1, 1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0x100000FFF:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0xfff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 4294971391
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -781,16 +849,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_0x100001000(ptr addrspace(1)
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_0x100001000:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_movk_i32 s0, 0x1000
-; GFX12-NEXT: s_mov_b32 s1, 1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_0x100001000:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_movk_i32 s0, 0x1000
+; GFX12-SDAG-NEXT: s_mov_b32 s1, 1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_0x100001000:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0x1000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 4294971392
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -826,16 +903,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg0xFFFFFFFF(ptr addrspace(
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_neg0xFFFFFFFF:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, 1
-; GFX12-NEXT: s_mov_b32 s1, -1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg0xFFFFFFFF:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_mov_b32 s0, 1
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -1
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg0xFFFFFFFF:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 1
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4294967295
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -878,10 +964,8 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg0x100000000(ptr addrspace
;
; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg0x100000000:
; GFX12-GISEL: ; %bb.0:
-; GFX12-GISEL-NEXT: s_mov_b32 s0, 0
-; GFX12-GISEL-NEXT: s_mov_b32 s1, -1
-; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, 0
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -1
; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
@@ -920,16 +1004,25 @@ define amdgpu_ps float @global_load_saddr_i8_offset_neg0x100000001(ptr addrspace
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_offset_neg0x100000001:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s0, -1
-; GFX12-NEXT: s_mov_b32 s1, -2
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_offset_neg0x100000001:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_mov_b32 s0, -1
+; GFX12-SDAG-NEXT: s_mov_b32 s1, -2
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[0:1]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_offset_neg0x100000001:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, -1
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, -2
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 -4294967297
%load = load i8, ptr addrspace(1) %gep0
%zext = zext i8 %load to i32
@@ -1714,18 +1807,29 @@ define amdgpu_ps float @global_load_saddr_i8_zext_uniform_offset_immoffset(ptr a
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: ; return to shader part epilog
;
-; GFX12-LABEL: global_load_saddr_i8_zext_uniform_offset_immoffset:
-; GFX12: ; %bb.0:
-; GFX12-NEXT: s_mov_b32 s5, 0
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(SKIP_2) | instid1(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[4:5]
-; GFX12-NEXT: s_movk_i32 s2, 0xffe8
-; GFX12-NEXT: s_mov_b32 s3, -1
-; GFX12-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; GFX12-NEXT: s_load_u8 s0, s[0:1], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v0, s0
-; GFX12-NEXT: ; return to shader part epilog
+; GFX12-SDAG-LABEL: global_load_saddr_i8_zext_uniform_offset_immoffset:
+; GFX12-SDAG: ; %bb.0:
+; GFX12-SDAG-NEXT: s_mov_b32 s5, 0
+; GFX12-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(SKIP_2) | instid1(SALU_CYCLE_1)
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[2:3], s[4:5]
+; GFX12-SDAG-NEXT: s_movk_i32 s2, 0xffe8
+; GFX12-SDAG-NEXT: s_mov_b32 s3, -1
+; GFX12-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-SDAG-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-SDAG-NEXT: s_wait_kmcnt 0x0
+; GFX12-SDAG-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX12-GISEL-LABEL: global_load_saddr_i8_zext_uniform_offset_immoffset:
+; GFX12-GISEL: ; %bb.0:
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s2, s4
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s3, 0
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xffffffe8
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
+; GFX12-GISEL-NEXT: s_load_u8 s0, s[0:1], 0x0
+; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
+; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, s0
+; GFX12-GISEL-NEXT: ; return to shader part epilog
%zext.offset = zext i32 %soffset to i64
%gep0 = getelementptr inbounds i8, ptr addrspace(1) %sbase, i64 %zext.offset
%gep1 = getelementptr inbounds i8, ptr addrspace(1) %gep0, i64 -24
diff --git a/llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll b/llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll
index b5657b81d3192..72260e0b99715 100644
--- a/llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll
+++ b/llvm/test/CodeGen/AMDGPU/llvm.prefetch.ll
@@ -59,14 +59,21 @@ define amdgpu_ps void @prefetch_data_sgpr_min_offset(ptr addrspace(4) inreg %ptr
; NOSPREFETCH: ; %bb.0: ; %entry
; NOSPREFETCH-NEXT: s_endpgm
;
-; SPREFETCH-LABEL: prefetch_data_sgpr_min_offset:
-; SPREFETCH: ; %bb.0: ; %entry
-; SPREFETCH-NEXT: s_mov_b32 s2, 0xff800000
-; SPREFETCH-NEXT: s_mov_b32 s3, -1
-; SPREFETCH-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; SPREFETCH-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; SPREFETCH-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
-; SPREFETCH-NEXT: s_endpgm
+; SPREFETCH-SDAG-LABEL: prefetch_data_sgpr_min_offset:
+; SPREFETCH-SDAG: ; %bb.0: ; %entry
+; SPREFETCH-SDAG-NEXT: s_mov_b32 s2, 0xff800000
+; SPREFETCH-SDAG-NEXT: s_mov_b32 s3, -1
+; SPREFETCH-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; SPREFETCH-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; SPREFETCH-SDAG-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
+; SPREFETCH-SDAG-NEXT: s_endpgm
+;
+; SPREFETCH-GISEL-LABEL: prefetch_data_sgpr_min_offset:
+; SPREFETCH-GISEL: ; %bb.0: ; %entry
+; SPREFETCH-GISEL-NEXT: s_add_co_u32 s0, s0, 0xff800000
+; SPREFETCH-GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
+; SPREFETCH-GISEL-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
+; SPREFETCH-GISEL-NEXT: s_endpgm
entry:
%gep = getelementptr i8, ptr addrspace(4) %ptr, i32 -8388608
tail call void @llvm.prefetch.p4(ptr addrspace(4) %gep, i32 0, i32 0, i32 1)
@@ -78,11 +85,18 @@ define amdgpu_ps void @prefetch_data_sgpr_too_large_offset(ptr addrspace(4) inre
; NOSPREFETCH: ; %bb.0: ; %entry
; NOSPREFETCH-NEXT: s_endpgm
;
-; SPREFETCH-LABEL: prefetch_data_sgpr_too_large_offset:
-; SPREFETCH: ; %bb.0: ; %entry
-; SPREFETCH-NEXT: s_add_nc_u64 s[0:1], s[0:1], 0x800000
-; SPREFETCH-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
-; SPREFETCH-NEXT: s_endpgm
+; SPREFETCH-SDAG-LABEL: prefetch_data_sgpr_too_large_offset:
+; SPREFETCH-SDAG: ; %bb.0: ; %entry
+; SPREFETCH-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], 0x800000
+; SPREFETCH-SDAG-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
+; SPREFETCH-SDAG-NEXT: s_endpgm
+;
+; SPREFETCH-GISEL-LABEL: prefetch_data_sgpr_too_large_offset:
+; SPREFETCH-GISEL: ; %bb.0: ; %entry
+; SPREFETCH-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800000
+; SPREFETCH-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0
+; SPREFETCH-GISEL-NEXT: s_prefetch_data s[0:1], 0x0, null, 0
+; SPREFETCH-GISEL-NEXT: s_endpgm
entry:
%gep = getelementptr i8, ptr addrspace(4) %ptr, i32 8388608
tail call void @llvm.prefetch.p4(ptr addrspace(4) %gep, i32 0, i32 0, i32 1)
@@ -218,14 +232,21 @@ define amdgpu_ps void @prefetch_inst_sgpr_min_offset(ptr addrspace(4) inreg %ptr
; NOSPREFETCH: ; %bb.0: ; %entry
; NOSPREFETCH-NEXT: s_endpgm
;
-; SPREFETCH-LABEL: prefetch_inst_sgpr_min_offset:
-; SPREFETCH: ; %bb.0: ; %entry
-; SPREFETCH-NEXT: s_mov_b32 s2, 0xff800000
-; SPREFETCH-NEXT: s_mov_b32 s3, -1
-; SPREFETCH-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; SPREFETCH-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
-; SPREFETCH-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
-; SPREFETCH-NEXT: s_endpgm
+; SPREFETCH-SDAG-LABEL: prefetch_inst_sgpr_min_offset:
+; SPREFETCH-SDAG: ; %bb.0: ; %entry
+; SPREFETCH-SDAG-NEXT: s_mov_b32 s2, 0xff800000
+; SPREFETCH-SDAG-NEXT: s_mov_b32 s3, -1
+; SPREFETCH-SDAG-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
+; SPREFETCH-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; SPREFETCH-SDAG-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
+; SPREFETCH-SDAG-NEXT: s_endpgm
+;
+; SPREFETCH-GISEL-LABEL: prefetch_inst_sgpr_min_offset:
+; SPREFETCH-GISEL: ; %bb.0: ; %entry
+; SPREFETCH-GISEL-NEXT: s_add_co_u32 s0, s0, 0xff800000
+; SPREFETCH-GISEL-NEXT: s_add_co_ci_u32 s1, s1, -1
+; SPREFETCH-GISEL-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
+; SPREFETCH-GISEL-NEXT: s_endpgm
entry:
%gep = getelementptr i8, ptr addrspace(4) %ptr, i32 -8388608
tail call void @llvm.prefetch.p4(ptr addrspace(4) %gep, i32 0, i32 0, i32 0)
@@ -237,11 +258,18 @@ define amdgpu_ps void @prefetch_inst_sgpr_too_large_offset(ptr addrspace(4) inre
; NOSPREFETCH: ; %bb.0: ; %entry
; NOSPREFETCH-NEXT: s_endpgm
;
-; SPREFETCH-LABEL: prefetch_inst_sgpr_too_large_offset:
-; SPREFETCH: ; %bb.0: ; %entry
-; SPREFETCH-NEXT: s_add_nc_u64 s[0:1], s[0:1], 0x800000
-; SPREFETCH-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
-; SPREFETCH-NEXT: s_endpgm
+; SPREFETCH-SDAG-LABEL: prefetch_inst_sgpr_too_large_offset:
+; SPREFETCH-SDAG: ; %bb.0: ; %entry
+; SPREFETCH-SDAG-NEXT: s_add_nc_u64 s[0:1], s[0:1], 0x800000
+; SPREFETCH-SDAG-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
+; SPREFETCH-SDAG-NEXT: s_endpgm
+;
+; SPREFETCH-GISEL-LABEL: prefetch_inst_sgpr_too_large_offset:
+; SPREFETCH-GISEL: ; %bb.0: ; %entry
+; SPREFETCH-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800000
+; SPREFETCH-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0
+; SPREFETCH-GISEL-NEXT: s_prefetch_inst s[0:1], 0x0, null, 0
+; SPREFETCH-GISEL-NEXT: s_endpgm
entry:
%gep = getelementptr i8, ptr addrspace(4) %ptr, i32 8388608
tail call void @llvm.prefetch.p4(ptr addrspace(4) %gep, i32 0, i32 0, i32 0)
@@ -254,6 +282,3 @@ declare void @llvm.prefetch.p3(ptr addrspace(3) nocapture readonly, i32, i32, i3
declare void @llvm.prefetch.p4(ptr addrspace(4) nocapture readonly, i32, i32, i32)
declare void @llvm.prefetch.p5(ptr addrspace(5) nocapture readonly, i32, i32, i32)
declare void @llvm.prefetch.p6(ptr addrspace(6) nocapture readonly, i32, i32, i32)
-;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
-; SPREFETCH-GISEL: {{.*}}
-; SPREFETCH-SDAG: {{.*}}
diff --git a/llvm/test/CodeGen/AMDGPU/offset-split-flat.ll b/llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
index 203696d2c0e73..61ac1fe92c278 100644
--- a/llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
+++ b/llvm/test/CodeGen/AMDGPU/offset-split-flat.ll
@@ -4223,10 +4223,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_11bit_split0(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_11bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x7ff
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x7ff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4343,10 +4342,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_11bit_split1(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_11bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x800
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4463,10 +4461,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_12bit_split0(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_12bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0xfff
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xfff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4584,10 +4581,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_12bit_split1(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_12bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1000
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4705,10 +4701,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_13bit_split0(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_13bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1fff
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1fff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4826,10 +4821,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_13bit_split1(ptr %p) {
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_13bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x2000
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x2000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -4952,10 +4946,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_11bit_neg_high_split0(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_11bit_neg_high_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x7ff
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x7ff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5078,10 +5071,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_11bit_neg_high_split1(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_11bit_neg_high_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x800
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5204,10 +5196,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_12bit_neg_high_split0(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_12bit_neg_high_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0xfff
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xfff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5330,10 +5321,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_12bit_neg_high_split1(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_12bit_neg_high_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1000
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5456,10 +5446,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_13bit_neg_high_split0(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_13bit_neg_high_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1fff
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1fff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
@@ -5582,10 +5571,9 @@ define amdgpu_kernel void @flat_inst_salu_offset_64bit_13bit_neg_high_split1(ptr
; GFX12-GISEL-LABEL: flat_inst_salu_offset_64bit_13bit_neg_high_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x2000
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x2000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: flat_load_u8 v0, v[0:1] scope:SCOPE_SYS
diff --git a/llvm/test/CodeGen/AMDGPU/offset-split-global.ll b/llvm/test/CodeGen/AMDGPU/offset-split-global.ll
index acb3b1911ff67..de5f4f931070e 100644
--- a/llvm/test/CodeGen/AMDGPU/offset-split-global.ll
+++ b/llvm/test/CodeGen/AMDGPU/offset-split-global.ll
@@ -4105,10 +4105,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_11bit_split0(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_11bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x7ff
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x7ff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4223,10 +4222,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_11bit_split1(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_11bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x800
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4341,10 +4339,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_12bit_split0(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_12bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0xfff
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xfff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4459,10 +4456,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_12bit_split1(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_12bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1000
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4577,10 +4573,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_13bit_split0(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_13bit_split0:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1fff
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1fff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4695,10 +4690,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_13bit_split1(ptr addrsp
; GFX12-GISEL-LABEL: global_inst_salu_offset_64bit_13bit_split1:
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x2000
-; GFX12-GISEL-NEXT: s_mov_b32 s3, 2
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x2000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 2
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
; GFX12-GISEL-NEXT: global_load_u8 v0, v[0:1], off scope:SCOPE_SYS
@@ -4812,10 +4806,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_11bit_neg_high_split0(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x7ff
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x7ff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -4918,10 +4911,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_11bit_neg_high_split1(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x800
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x800
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -5024,10 +5016,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_12bit_neg_high_split0(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0xfff
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0xfff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -5130,10 +5121,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_12bit_neg_high_split1(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1000
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -5236,10 +5226,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_13bit_neg_high_split0(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x1fff
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x1fff
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
@@ -5342,10 +5331,9 @@ define amdgpu_kernel void @global_inst_salu_offset_64bit_13bit_neg_high_split1(p
; GFX12-GISEL: ; %bb.0:
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: v_mov_b32_e32 v0, 0
-; GFX12-GISEL-NEXT: s_movk_i32 s2, 0x2000
-; GFX12-GISEL-NEXT: s_brev_b32 s3, 1
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_nc_u64 s[0:1], s[0:1], s[2:3]
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, 0x2000
+; GFX12-GISEL-NEXT: s_add_co_ci_u32 s1, s1, 0x80000000
; GFX12-GISEL-NEXT: global_load_u8 v0, v0, s[0:1] scope:SCOPE_SYS
; GFX12-GISEL-NEXT: s_wait_loadcnt 0x0
; GFX12-GISEL-NEXT: global_store_b8 v[0:1], v0, off
diff --git a/llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll b/llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll
index 3fc4cf1ab3e51..bfadfd860edf6 100644
--- a/llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll
+++ b/llvm/test/CodeGen/AMDGPU/spill-vgpr-block.ll
@@ -57,9 +57,10 @@ define amdgpu_kernel void @entry_func(i32 %x) {
; GISEL-NEXT: ;;#ASMSTART
; GISEL-NEXT: s_nop
; GISEL-NEXT: ;;#ASMEND
+; GISEL-NEXT: s_add_co_u32 s8, s4, 4
; GISEL-NEXT: s_mov_b32 s0, non_entry_func@abs32@lo
; GISEL-NEXT: s_mov_b32 s1, non_entry_func@abs32@hi
-; GISEL-NEXT: s_add_nc_u64 s[8:9], s[4:5], 4
+; GISEL-NEXT: s_add_co_ci_u32 s9, s5, 0
; GISEL-NEXT: s_mov_b64 s[4:5], s[12:13]
; GISEL-NEXT: s_mov_b32 s32, 0
; GISEL-NEXT: s_wait_kmcnt 0x0
From 58ac8e070928fc7741b92ee9af774151020a7749 Mon Sep 17 00:00:00 2001
From: Fabian Ritter <fabian.ritter at amd.com>
Date: Wed, 28 May 2025 06:12:42 -0400
Subject: [PATCH 3/4] Disable PTRADD for 32-bit pointers and revert the
 related pattern and test changes
---
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 5 +-
llvm/lib/Target/AMDGPU/SIInstructions.td | 19 -
.../GlobalISel/call-outgoing-stack-args.ll | 44 +-
.../GlobalISel/dynamic-alloca-uniform.ll | 48 +-
.../CodeGen/AMDGPU/GlobalISel/flat-scratch.ll | 96 ++--
.../GlobalISel/insertelement-stack-lower.ll | 6 +-
.../inst-select-atomic-cmpxchg-local.mir | 16 +-
.../inst-select-atomic-cmpxchg-region.mir | 16 +-
.../inst-select-atomicrmw-xchg-local.mir | 8 +-
.../inst-select-atomicrmw-xchg-region.mir | 8 +-
.../inst-select-load-atomic-local.mir | 20 +-
.../GlobalISel/inst-select-load-local-128.mir | 26 +-
.../GlobalISel/inst-select-load-local.mir | 12 +-
.../GlobalISel/inst-select-load-private.mir | 34 +-
.../GlobalISel/inst-select-pattern-add3.mir | 12 +-
.../AMDGPU/GlobalISel/inst-select-ptr-add.mir | 68 +--
.../GlobalISel/inst-select-sextload-local.mir | 2 +-
.../GlobalISel/inst-select-store-local.mir | 53 +-
.../GlobalISel/inst-select-store-private.mir | 2 +-
.../GlobalISel/inst-select-zextload-local.mir | 2 +-
.../CodeGen/AMDGPU/GlobalISel/lds-relocs.ll | 2 +-
.../AMDGPU/GlobalISel/lds-zero-initializer.ll | 4 +-
.../AMDGPU/GlobalISel/non-entry-alloca.ll | 16 +-
.../test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll | 30 +-
llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll | 4 +-
.../test/CodeGen/AMDGPU/dynamic_stackalloc.ll | 128 ++---
.../CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll | 524 +++++++++---------
llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll | 492 +---------------
28 files changed, 539 insertions(+), 1158 deletions(-)
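As a minimal sketch of the effect (hypothetical IR, not taken from the patch's
test files; the function name and constants are made up for illustration): on
AMDGPU, global pointers (addrspace(1)) are 64 bits wide and remain eligible for
ISD::PTRADD under the flag, while scratch pointers (addrspace(5)) are 32 bits
wide and go back to the ordinary ISD::ADD path.

; Hypothetical example, not from the patch's tests.
define void @ptr_widths(ptr addrspace(1) %global, ptr addrspace(5) %scratch) {
  ; 64-bit pointer arithmetic: still eligible for ISD::PTRADD under the flag.
  %g = getelementptr inbounds i8, ptr addrspace(1) %global, i64 16
  ; 32-bit pointer arithmetic: no longer produces ISD::PTRADD after this patch.
  %s = getelementptr inbounds i8, ptr addrspace(5) %scratch, i32 16
  store i8 0, ptr addrspace(1) %g
  store i8 1, ptr addrspace(5) %s
  ret void
}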
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 5164d27fcf003..62573c2439bba 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -65,7 +65,8 @@ static cl::opt<bool> UseDivergentRegisterIndexing(
// the SelectionDAG.
static cl::opt<bool> UseSelectionDAGPTRADD(
"amdgpu-use-sdag-ptradd", cl::Hidden,
- cl::desc("Generate ISD::PTRADD nodes in the SelectionDAG ISel"),
+ cl::desc("Generate ISD::PTRADD nodes for 64-bit pointer arithmetic in the "
+ "SelectionDAG ISel"),
cl::init(false));
static bool denormalModeIsFlushAllF32(const MachineFunction &MF) {
@@ -10466,7 +10467,7 @@ SDValue SITargetLowering::LowerINTRINSIC_VOID(SDValue Op,
bool SITargetLowering::shouldPreservePtrArith(const Function &F,
EVT PtrVT) const {
- return UseSelectionDAGPTRADD;
+ return UseSelectionDAGPTRADD && PtrVT == MVT::i64;
}
// The raw.(t)buffer and struct.(t)buffer intrinsics have two offset args:
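Since shouldPreservePtrArith now also requires PtrVT == MVT::i64, enabling the
flag changes codegen only for 64-bit address spaces. As a usage sketch (these
RUN lines are hypothetical; the real coverage is in
test/CodeGen/AMDGPU/ptradd-sdag.ll, per the diffstat above), a lit test would
exercise both settings of the hidden option like so:

; Hypothetical RUN lines showing how the hidden flag is toggled in a lit test:
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -amdgpu-use-sdag-ptradd=1 -o - %s
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx942 -amdgpu-use-sdag-ptradd=0 -o - %s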
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 26b194dffa07d..1419f63202a7c 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -1390,25 +1390,6 @@ def : GCNPat<
(S_ADD_U64_PSEUDO $src0, $src1)>;
}
-def : GCNPat<
- (DivergentBinFrag<ptradd> i32:$src0, i32:$src1),
- (V_ADD_U32_e64 $src0, $src1, 0)> {
- let SubtargetPredicate = HasAddNoCarryInsts;
-}
-
-def : GCNPat<
- (DivergentBinFrag<ptradd> i32:$src0, i32:$src1),
- (V_ADD_CO_U32_e64 $src0, $src1)> {
- let SubtargetPredicate = NotHasAddNoCarryInsts;
-}
-
-// Whether we select S_ADD_I32 or S_ADD_U32 does not make much of a
-// difference. Most notably, S_ADD_I32 instructions can be transformed
-// to S_ADDK_I32, so we select that.
-def : GCNPat<
- (UniformBinFrag<ptradd> i32:$src0, i32:$src1),
- (S_ADD_I32 $src0, $src1)>;
-
/********** ============================================ **********/
/********** Extraction, Insertion, Building and Casting **********/
/********** ============================================ **********/
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
index 5a3b36fc1ada2..7adaddf2fc8ba 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/call-outgoing-stack-args.ll
@@ -36,15 +36,15 @@ define amdgpu_kernel void @kernel_caller_stack() {
; FLATSCR-NEXT: s_mov_b32 s32, 0
; FLATSCR-NEXT: s_add_u32 flat_scratch_lo, s8, s13
; FLATSCR-NEXT: s_addc_u32 flat_scratch_hi, s9, 0
-; FLATSCR-NEXT: s_add_i32 s0, s32, 4
+; FLATSCR-NEXT: s_add_u32 s0, s32, 4
; FLATSCR-NEXT: v_mov_b32_e32 v0, 9
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_i32 s0, s32, 8
+; FLATSCR-NEXT: s_add_u32 s0, s32, 8
; FLATSCR-NEXT: v_mov_b32_e32 v0, 10
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_i32 s0, s32, 12
+; FLATSCR-NEXT: s_add_u32 s0, s32, 12
; FLATSCR-NEXT: v_mov_b32_e32 v0, 11
-; FLATSCR-NEXT: s_add_i32 s2, s32, 16
+; FLATSCR-NEXT: s_add_u32 s2, s32, 16
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
; FLATSCR-NEXT: v_mov_b32_e32 v0, 12
; FLATSCR-NEXT: s_getpc_b64 s[0:1]
@@ -189,13 +189,13 @@ define amdgpu_kernel void @kernel_caller_byval() {
; FLATSCR-NEXT: s_getpc_b64 s[0:1]
; FLATSCR-NEXT: s_add_u32 s0, s0, external_void_func_byval@rel32@lo+4
; FLATSCR-NEXT: s_addc_u32 s1, s1, external_void_func_byval@rel32@hi+12
-; FLATSCR-NEXT: s_add_i32 s2, s32, 8
-; FLATSCR-NEXT: s_add_i32 s3, s32, 16
-; FLATSCR-NEXT: s_add_i32 s4, s32, 24
-; FLATSCR-NEXT: s_add_i32 s5, s32, 32
-; FLATSCR-NEXT: s_add_i32 s6, s32, 40
-; FLATSCR-NEXT: s_add_i32 s7, s32, 48
-; FLATSCR-NEXT: s_add_i32 s8, s32, 56
+; FLATSCR-NEXT: s_add_u32 s2, s32, 8
+; FLATSCR-NEXT: s_add_u32 s3, s32, 16
+; FLATSCR-NEXT: s_add_u32 s4, s32, 24
+; FLATSCR-NEXT: s_add_u32 s5, s32, 32
+; FLATSCR-NEXT: s_add_u32 s6, s32, 40
+; FLATSCR-NEXT: s_add_u32 s7, s32, 48
+; FLATSCR-NEXT: s_add_u32 s8, s32, 56
; FLATSCR-NEXT: s_waitcnt vmcnt(7)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[0:1], s32
; FLATSCR-NEXT: s_waitcnt vmcnt(7)
@@ -266,16 +266,16 @@ define void @func_caller_stack() {
; FLATSCR-NEXT: s_mov_b64 exec, s[2:3]
; FLATSCR-NEXT: s_add_i32 s32, s32, 16
; FLATSCR-NEXT: v_writelane_b32 v40, s0, 2
-; FLATSCR-NEXT: s_add_i32 s0, s32, 4
+; FLATSCR-NEXT: s_add_u32 s0, s32, 4
; FLATSCR-NEXT: v_mov_b32_e32 v0, 9
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_i32 s0, s32, 8
+; FLATSCR-NEXT: s_add_u32 s0, s32, 8
; FLATSCR-NEXT: v_mov_b32_e32 v0, 10
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_i32 s0, s32, 12
+; FLATSCR-NEXT: s_add_u32 s0, s32, 12
; FLATSCR-NEXT: v_mov_b32_e32 v0, 11
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
-; FLATSCR-NEXT: s_add_i32 s0, s32, 16
+; FLATSCR-NEXT: s_add_u32 s0, s32, 16
; FLATSCR-NEXT: v_mov_b32_e32 v0, 12
; FLATSCR-NEXT: v_writelane_b32 v40, s30, 0
; FLATSCR-NEXT: scratch_store_dword off, v0, s0
@@ -393,8 +393,8 @@ define void @func_caller_byval(ptr addrspace(5) %argptr) {
; FLATSCR-NEXT: s_add_i32 s32, s32, 16
; FLATSCR-NEXT: v_add_u32_e32 v3, 8, v0
; FLATSCR-NEXT: v_writelane_b32 v40, s0, 2
-; FLATSCR-NEXT: s_add_i32 s0, s32, 8
-; FLATSCR-NEXT: s_add_i32 s2, s32, 56
+; FLATSCR-NEXT: s_add_u32 s0, s32, 8
+; FLATSCR-NEXT: s_add_u32 s2, s32, 56
; FLATSCR-NEXT: v_writelane_b32 v40, s30, 0
; FLATSCR-NEXT: v_writelane_b32 v40, s31, 1
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
@@ -404,28 +404,28 @@ define void @func_caller_byval(ptr addrspace(5) %argptr) {
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_i32 s0, s32, 16
+; FLATSCR-NEXT: s_add_u32 s0, s32, 16
; FLATSCR-NEXT: v_add_u32_e32 v3, 24, v0
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_i32 s0, s32, 24
+; FLATSCR-NEXT: s_add_u32 s0, s32, 24
; FLATSCR-NEXT: v_add_u32_e32 v3, 32, v0
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_i32 s0, s32, 32
+; FLATSCR-NEXT: s_add_u32 s0, s32, 32
; FLATSCR-NEXT: v_add_u32_e32 v3, 40, v0
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_i32 s0, s32, 40
+; FLATSCR-NEXT: s_add_u32 s0, s32, 40
; FLATSCR-NEXT: v_add_u32_e32 v3, 48, v0
; FLATSCR-NEXT: v_add_u32_e32 v0, 56, v0
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[1:2], v3, off
-; FLATSCR-NEXT: s_add_i32 s0, s32, 48
+; FLATSCR-NEXT: s_add_u32 s0, s32, 48
; FLATSCR-NEXT: s_waitcnt vmcnt(0)
; FLATSCR-NEXT: scratch_store_dwordx2 off, v[1:2], s0
; FLATSCR-NEXT: scratch_load_dwordx2 v[0:1], v0, off
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll
index a1bb8b390847f..6b767d9e754be 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/dynamic-alloca-uniform.ll
@@ -20,7 +20,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align4(i32 %n) {
; GFX9-NEXT: v_mov_b32_e32 v1, s4
; GFX9-NEXT: s_lshl_b32 s5, s5, 6
; GFX9-NEXT: s_mov_b32 s33, 0
-; GFX9-NEXT: s_add_i32 s32, s4, s5
+; GFX9-NEXT: s_add_u32 s32, s4, s5
; GFX9-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-NEXT: s_endpgm
;
@@ -39,7 +39,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align4(i32 %n) {
; GFX10-NEXT: s_lshl2_add_u32 s5, s5, 15
; GFX10-NEXT: s_and_b32 s5, s5, -16
; GFX10-NEXT: s_lshl_b32 s5, s5, 5
-; GFX10-NEXT: s_add_i32 s32, s4, s5
+; GFX10-NEXT: s_add_u32 s32, s4, s5
; GFX10-NEXT: s_endpgm
;
; GFX11-LABEL: kernel_dynamic_stackalloc_sgpr_align4:
@@ -56,7 +56,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align4(i32 %n) {
; GFX11-NEXT: s_and_b32 s1, s1, -16
; GFX11-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_i32 s32, s0, s1
+; GFX11-NEXT: s_add_u32 s32, s0, s1
; GFX11-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 4, addrspace(5)
store i32 0, ptr addrspace(5) %alloca
@@ -84,7 +84,7 @@ define void @func_dynamic_stackalloc_sgpr_align4() {
; GFX9-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-NEXT: s_and_b32 s4, s4, -16
; GFX9-NEXT: s_lshl_b32 s4, s4, 6
-; GFX9-NEXT: s_add_i32 s32, s6, s4
+; GFX9-NEXT: s_add_u32 s32, s6, s4
; GFX9-NEXT: s_mov_b32 s32, s33
; GFX9-NEXT: s_mov_b32 s33, s7
; GFX9-NEXT: s_waitcnt vmcnt(0)
@@ -110,7 +110,7 @@ define void @func_dynamic_stackalloc_sgpr_align4() {
; GFX10-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX10-NEXT: s_and_b32 s4, s4, -16
; GFX10-NEXT: s_lshl_b32 s4, s4, 5
-; GFX10-NEXT: s_add_i32 s32, s6, s4
+; GFX10-NEXT: s_add_u32 s32, s6, s4
; GFX10-NEXT: s_mov_b32 s32, s33
; GFX10-NEXT: s_mov_b32 s33, s7
; GFX10-NEXT: s_setpc_b64 s[30:31]
@@ -136,7 +136,7 @@ define void @func_dynamic_stackalloc_sgpr_align4() {
; GFX11-NEXT: s_and_b32 s0, s0, -16
; GFX11-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_i32 s32, s2, s0
+; GFX11-NEXT: s_add_u32 s32, s2, s0
; GFX11-NEXT: s_mov_b32 s32, s33
; GFX11-NEXT: s_mov_b32 s33, s3
; GFX11-NEXT: s_setpc_b64 s[30:31]
@@ -161,7 +161,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align16(i32 %n) {
; GFX9-NEXT: v_mov_b32_e32 v1, s4
; GFX9-NEXT: s_lshl_b32 s5, s5, 6
; GFX9-NEXT: s_mov_b32 s33, 0
-; GFX9-NEXT: s_add_i32 s32, s4, s5
+; GFX9-NEXT: s_add_u32 s32, s4, s5
; GFX9-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-NEXT: s_endpgm
;
@@ -180,7 +180,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align16(i32 %n) {
; GFX10-NEXT: s_lshl2_add_u32 s5, s5, 15
; GFX10-NEXT: s_and_b32 s5, s5, -16
; GFX10-NEXT: s_lshl_b32 s5, s5, 5
-; GFX10-NEXT: s_add_i32 s32, s4, s5
+; GFX10-NEXT: s_add_u32 s32, s4, s5
; GFX10-NEXT: s_endpgm
;
; GFX11-LABEL: kernel_dynamic_stackalloc_sgpr_align16:
@@ -197,7 +197,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align16(i32 %n) {
; GFX11-NEXT: s_and_b32 s1, s1, -16
; GFX11-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_i32 s32, s0, s1
+; GFX11-NEXT: s_add_u32 s32, s0, s1
; GFX11-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 16, addrspace(5)
store i32 0, ptr addrspace(5) %alloca
@@ -225,7 +225,7 @@ define void @func_dynamic_stackalloc_sgpr_align16() {
; GFX9-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-NEXT: s_and_b32 s4, s4, -16
; GFX9-NEXT: s_lshl_b32 s4, s4, 6
-; GFX9-NEXT: s_add_i32 s32, s6, s4
+; GFX9-NEXT: s_add_u32 s32, s6, s4
; GFX9-NEXT: s_mov_b32 s32, s33
; GFX9-NEXT: s_mov_b32 s33, s7
; GFX9-NEXT: s_waitcnt vmcnt(0)
@@ -251,7 +251,7 @@ define void @func_dynamic_stackalloc_sgpr_align16() {
; GFX10-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX10-NEXT: s_and_b32 s4, s4, -16
; GFX10-NEXT: s_lshl_b32 s4, s4, 5
-; GFX10-NEXT: s_add_i32 s32, s6, s4
+; GFX10-NEXT: s_add_u32 s32, s6, s4
; GFX10-NEXT: s_mov_b32 s32, s33
; GFX10-NEXT: s_mov_b32 s33, s7
; GFX10-NEXT: s_setpc_b64 s[30:31]
@@ -277,7 +277,7 @@ define void @func_dynamic_stackalloc_sgpr_align16() {
; GFX11-NEXT: s_and_b32 s0, s0, -16
; GFX11-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_i32 s32, s2, s0
+; GFX11-NEXT: s_add_u32 s32, s2, s0
; GFX11-NEXT: s_mov_b32 s32, s33
; GFX11-NEXT: s_mov_b32 s33, s3
; GFX11-NEXT: s_setpc_b64 s[30:31]
@@ -294,7 +294,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX9-NEXT: s_movk_i32 s32, 0x800
; GFX9-NEXT: s_add_u32 s0, s0, s17
; GFX9-NEXT: s_addc_u32 s1, s1, 0
-; GFX9-NEXT: s_add_i32 s5, s32, 0x7ff
+; GFX9-NEXT: s_add_u32 s5, s32, 0x7ff
; GFX9-NEXT: s_waitcnt lgkmcnt(0)
; GFX9-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-NEXT: s_and_b32 s5, s5, 0xfffff800
@@ -303,7 +303,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX9-NEXT: v_mov_b32_e32 v1, s5
; GFX9-NEXT: s_lshl_b32 s4, s4, 6
; GFX9-NEXT: s_mov_b32 s33, 0
-; GFX9-NEXT: s_add_i32 s32, s5, s4
+; GFX9-NEXT: s_add_u32 s32, s5, s4
; GFX9-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-NEXT: s_endpgm
;
@@ -313,7 +313,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX10-NEXT: s_movk_i32 s32, 0x400
; GFX10-NEXT: s_add_u32 s0, s0, s17
; GFX10-NEXT: s_addc_u32 s1, s1, 0
-; GFX10-NEXT: s_add_i32 s5, s32, 0x3ff
+; GFX10-NEXT: s_add_u32 s5, s32, 0x3ff
; GFX10-NEXT: v_mov_b32_e32 v0, 0
; GFX10-NEXT: s_and_b32 s5, s5, 0xfffffc00
; GFX10-NEXT: s_mov_b32 s33, 0
@@ -323,7 +323,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX10-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX10-NEXT: s_and_b32 s4, s4, -16
; GFX10-NEXT: s_lshl_b32 s4, s4, 5
-; GFX10-NEXT: s_add_i32 s32, s5, s4
+; GFX10-NEXT: s_add_u32 s32, s5, s4
; GFX10-NEXT: s_endpgm
;
; GFX11-LABEL: kernel_dynamic_stackalloc_sgpr_align32:
@@ -331,7 +331,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX11-NEXT: s_load_b32 s0, s[4:5], 0x0
; GFX11-NEXT: s_mov_b32 s32, 32
; GFX11-NEXT: v_mov_b32_e32 v0, 0
-; GFX11-NEXT: s_add_i32 s1, s32, 0x3ff
+; GFX11-NEXT: s_add_u32 s1, s32, 0x3ff
; GFX11-NEXT: s_mov_b32 s33, 0
; GFX11-NEXT: s_and_b32 s1, s1, 0xfffffc00
; GFX11-NEXT: scratch_store_b32 off, v0, s1
@@ -341,7 +341,7 @@ define amdgpu_kernel void @kernel_dynamic_stackalloc_sgpr_align32(i32 %n) {
; GFX11-NEXT: s_and_b32 s0, s0, -16
; GFX11-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_i32 s32, s1, s0
+; GFX11-NEXT: s_add_u32 s32, s1, s0
; GFX11-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 32, addrspace(5)
store i32 0, ptr addrspace(5) %alloca
@@ -366,7 +366,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX9-NEXT: s_mov_b32 s33, s6
; GFX9-NEXT: s_waitcnt lgkmcnt(0)
; GFX9-NEXT: s_load_dword s4, s[4:5], 0x0
-; GFX9-NEXT: s_add_i32 s5, s32, 0x7ff
+; GFX9-NEXT: s_add_u32 s5, s32, 0x7ff
; GFX9-NEXT: s_and_b32 s5, s5, 0xfffff800
; GFX9-NEXT: v_mov_b32_e32 v1, s5
; GFX9-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -374,7 +374,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX9-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-NEXT: s_and_b32 s4, s4, -16
; GFX9-NEXT: s_lshl_b32 s4, s4, 6
-; GFX9-NEXT: s_add_i32 s32, s5, s4
+; GFX9-NEXT: s_add_u32 s32, s5, s4
; GFX9-NEXT: s_mov_b32 s32, s34
; GFX9-NEXT: s_mov_b32 s34, s7
; GFX9-NEXT: s_waitcnt vmcnt(0)
@@ -397,7 +397,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX10-NEXT: s_mov_b32 s33, s6
; GFX10-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-NEXT: s_load_dword s4, s[4:5], 0x0
-; GFX10-NEXT: s_add_i32 s5, s32, 0x3ff
+; GFX10-NEXT: s_add_u32 s5, s32, 0x3ff
; GFX10-NEXT: s_and_b32 s5, s5, 0xfffffc00
; GFX10-NEXT: v_mov_b32_e32 v1, s5
; GFX10-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -405,7 +405,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX10-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX10-NEXT: s_and_b32 s4, s4, -16
; GFX10-NEXT: s_lshl_b32 s4, s4, 5
-; GFX10-NEXT: s_add_i32 s32, s5, s4
+; GFX10-NEXT: s_add_u32 s32, s5, s4
; GFX10-NEXT: s_mov_b32 s32, s34
; GFX10-NEXT: s_mov_b32 s34, s7
; GFX10-NEXT: s_setpc_b64 s[30:31]
@@ -427,7 +427,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX11-NEXT: s_mov_b32 s33, s2
; GFX11-NEXT: s_waitcnt lgkmcnt(0)
; GFX11-NEXT: s_load_b32 s0, s[0:1], 0x0
-; GFX11-NEXT: s_add_i32 s1, s32, 0x3ff
+; GFX11-NEXT: s_add_u32 s1, s32, 0x3ff
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(SKIP_3) | instid1(SALU_CYCLE_1)
; GFX11-NEXT: s_and_b32 s1, s1, 0xfffffc00
; GFX11-NEXT: scratch_store_b32 off, v0, s1
@@ -436,7 +436,7 @@ define void @func_dynamic_stackalloc_sgpr_align32(ptr addrspace(1) %out) {
; GFX11-NEXT: s_and_b32 s0, s0, -16
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
; GFX11-NEXT: s_lshl_b32 s0, s0, 5
-; GFX11-NEXT: s_add_i32 s32, s1, s0
+; GFX11-NEXT: s_add_u32 s32, s1, s0
; GFX11-NEXT: s_mov_b32 s32, s34
; GFX11-NEXT: s_mov_b32 s34, s3
; GFX11-NEXT: s_setpc_b64 s[30:31]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll
index d1083588e8ac0..8a80afd4a768f 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/flat-scratch.ll
@@ -855,7 +855,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX9-NEXT: s_lshl_b32 s0, s0, 7
; GFX9-NEXT: v_sub_u32_e32 v0, 0, v0
; GFX9-NEXT: v_add_u32_e32 v1, 0x100, v1
-; GFX9-NEXT: s_addk_i32 s0, 0x100
+; GFX9-NEXT: s_add_u32 s0, 0x100, s0
; GFX9-NEXT: v_mov_b32_e32 v2, 15
; GFX9-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; GFX9-NEXT: scratch_store_dword v1, v2, off offset:128
@@ -883,7 +883,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0
; GFX10-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-NEXT: s_lshl_b32 s0, s0, 7
-; GFX10-NEXT: s_addk_i32 s0, 0x100
+; GFX10-NEXT: s_add_u32 s0, 0x100, s0
; GFX10-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX10-NEXT: scratch_load_dword v0, v1, off offset:124 glc dlc
; GFX10-NEXT: s_waitcnt vmcnt(0)
@@ -899,7 +899,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX942-NEXT: v_sub_u32_e32 v0, 0, v0
; GFX942-NEXT: s_waitcnt lgkmcnt(0)
; GFX942-NEXT: s_lshl_b32 s0, s0, 7
-; GFX942-NEXT: s_addk_i32 s0, 0x100
+; GFX942-NEXT: s_add_u32 s0, 0x100, s0
; GFX942-NEXT: v_mov_b32_e32 v2, 15
; GFX942-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; GFX942-NEXT: scratch_store_dword v1, v2, off offset:384 sc0 sc1
@@ -924,7 +924,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX11-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-NEXT: s_waitcnt lgkmcnt(0)
; GFX11-NEXT: s_lshl_b32 s0, s0, 7
-; GFX11-NEXT: s_addk_i32 s0, 0x100
+; GFX11-NEXT: s_add_u32 s0, 0x100, s0
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX11-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX11-NEXT: scratch_load_b32 v0, v1, off offset:124 glc dlc
@@ -946,7 +946,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; GFX12-NEXT: scratch_store_b32 v0, v2, off offset:384 scope:SCOPE_SYS
; GFX12-NEXT: s_wait_storecnt 0x0
; GFX12-NEXT: s_lshl_b32 s0, s0, 7
-; GFX12-NEXT: s_addk_co_i32 s0, 0x100
+; GFX12-NEXT: s_add_co_u32 s0, 0x100, s0
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX12-NEXT: scratch_load_b32 v0, v1, off offset:124 scope:SCOPE_SYS
@@ -966,7 +966,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX9-NEXT: s_lshl_b32 s0, s0, 7
; UNALIGNED_GFX9-NEXT: v_sub_u32_e32 v0, 0, v0
; UNALIGNED_GFX9-NEXT: v_add_u32_e32 v1, 0x100, v1
-; UNALIGNED_GFX9-NEXT: s_addk_i32 s0, 0x100
+; UNALIGNED_GFX9-NEXT: s_add_u32 s0, 0x100, s0
; UNALIGNED_GFX9-NEXT: v_mov_b32_e32 v2, 15
; UNALIGNED_GFX9-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; UNALIGNED_GFX9-NEXT: scratch_store_dword v1, v2, off offset:128
@@ -994,7 +994,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX10-NEXT: s_waitcnt_vscnt null, 0x0
; UNALIGNED_GFX10-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX10-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX10-NEXT: s_addk_i32 s0, 0x100
+; UNALIGNED_GFX10-NEXT: s_add_u32 s0, 0x100, s0
; UNALIGNED_GFX10-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX10-NEXT: scratch_load_dword v0, v1, off offset:124 glc dlc
; UNALIGNED_GFX10-NEXT: s_waitcnt vmcnt(0)
@@ -1010,7 +1010,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX942-NEXT: v_sub_u32_e32 v0, 0, v0
; UNALIGNED_GFX942-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX942-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX942-NEXT: s_addk_i32 s0, 0x100
+; UNALIGNED_GFX942-NEXT: s_add_u32 s0, 0x100, s0
; UNALIGNED_GFX942-NEXT: v_mov_b32_e32 v2, 15
; UNALIGNED_GFX942-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; UNALIGNED_GFX942-NEXT: scratch_store_dword v1, v2, off offset:384 sc0 sc1
@@ -1035,7 +1035,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX11-NEXT: s_waitcnt_vscnt null, 0x0
; UNALIGNED_GFX11-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX11-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX11-NEXT: s_addk_i32 s0, 0x100
+; UNALIGNED_GFX11-NEXT: s_add_u32 s0, 0x100, s0
; UNALIGNED_GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; UNALIGNED_GFX11-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX11-NEXT: scratch_load_b32 v0, v1, off offset:124 glc dlc
@@ -1057,7 +1057,7 @@ define amdgpu_kernel void @store_load_vindex_small_offset_kernel(i32 %n) {
; UNALIGNED_GFX12-NEXT: scratch_store_b32 v0, v2, off offset:384 scope:SCOPE_SYS
; UNALIGNED_GFX12-NEXT: s_wait_storecnt 0x0
; UNALIGNED_GFX12-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX12-NEXT: s_addk_co_i32 s0, 0x100
+; UNALIGNED_GFX12-NEXT: s_add_co_u32 s0, 0x100, s0
; UNALIGNED_GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; UNALIGNED_GFX12-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX12-NEXT: scratch_load_b32 v0, v1, off offset:124 scope:SCOPE_SYS
@@ -1495,7 +1495,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX9-NEXT: s_lshl_b32 s0, s0, 7
; GFX9-NEXT: v_sub_u32_e32 v0, 0, v0
; GFX9-NEXT: v_add_u32_e32 v1, 0x4004, v1
-; GFX9-NEXT: s_addk_i32 s0, 0x4004
+; GFX9-NEXT: s_add_u32 s0, 0x4004, s0
; GFX9-NEXT: v_mov_b32_e32 v2, 15
; GFX9-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; GFX9-NEXT: scratch_store_dword v1, v2, off offset:128
@@ -1523,7 +1523,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX10-NEXT: s_waitcnt_vscnt null, 0x0
; GFX10-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-NEXT: s_lshl_b32 s0, s0, 7
-; GFX10-NEXT: s_addk_i32 s0, 0x4004
+; GFX10-NEXT: s_add_u32 s0, 0x4004, s0
; GFX10-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX10-NEXT: scratch_load_dword v0, v1, off offset:124 glc dlc
; GFX10-NEXT: s_waitcnt vmcnt(0)
@@ -1539,7 +1539,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX942-NEXT: v_sub_u32_e32 v0, 0, v0
; GFX942-NEXT: s_waitcnt lgkmcnt(0)
; GFX942-NEXT: s_lshl_b32 s0, s0, 7
-; GFX942-NEXT: s_addk_i32 s0, 0x4004
+; GFX942-NEXT: s_add_u32 s0, 0x4004, s0
; GFX942-NEXT: v_mov_b32_e32 v2, 15
; GFX942-NEXT: s_movk_i32 s1, 0x4004
; GFX942-NEXT: v_lshlrev_b32_e32 v0, 2, v0
@@ -1563,7 +1563,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX11-NEXT: v_lshlrev_b32_e32 v1, 2, v1
; GFX11-NEXT: s_waitcnt lgkmcnt(0)
; GFX11-NEXT: s_lshl_b32 s0, s0, 7
-; GFX11-NEXT: s_addk_i32 s0, 0x4004
+; GFX11-NEXT: s_add_u32 s0, 0x4004, s0
; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instid1(SALU_CYCLE_1)
; GFX11-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX11-NEXT: s_movk_i32 s0, 0x4004
@@ -1588,7 +1588,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; GFX12-NEXT: scratch_store_b32 v0, v2, off offset:16512 scope:SCOPE_SYS
; GFX12-NEXT: s_wait_storecnt 0x0
; GFX12-NEXT: s_lshl_b32 s0, s0, 7
-; GFX12-NEXT: s_addk_co_i32 s0, 0x4000
+; GFX12-NEXT: s_add_co_u32 s0, 0x4000, s0
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-NEXT: v_add_nc_u32_e32 v1, s0, v1
; GFX12-NEXT: scratch_load_b32 v0, v1, off offset:124 scope:SCOPE_SYS
@@ -1608,7 +1608,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX9-NEXT: s_lshl_b32 s0, s0, 7
; UNALIGNED_GFX9-NEXT: v_sub_u32_e32 v0, 0, v0
; UNALIGNED_GFX9-NEXT: v_add_u32_e32 v1, 0x4004, v1
-; UNALIGNED_GFX9-NEXT: s_addk_i32 s0, 0x4004
+; UNALIGNED_GFX9-NEXT: s_add_u32 s0, 0x4004, s0
; UNALIGNED_GFX9-NEXT: v_mov_b32_e32 v2, 15
; UNALIGNED_GFX9-NEXT: v_lshlrev_b32_e32 v0, 2, v0
; UNALIGNED_GFX9-NEXT: scratch_store_dword v1, v2, off offset:128
@@ -1636,7 +1636,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX10-NEXT: s_waitcnt_vscnt null, 0x0
; UNALIGNED_GFX10-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX10-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX10-NEXT: s_addk_i32 s0, 0x4004
+; UNALIGNED_GFX10-NEXT: s_add_u32 s0, 0x4004, s0
; UNALIGNED_GFX10-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX10-NEXT: scratch_load_dword v0, v1, off offset:124 glc dlc
; UNALIGNED_GFX10-NEXT: s_waitcnt vmcnt(0)
@@ -1652,7 +1652,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX942-NEXT: v_sub_u32_e32 v0, 0, v0
; UNALIGNED_GFX942-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX942-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX942-NEXT: s_addk_i32 s0, 0x4004
+; UNALIGNED_GFX942-NEXT: s_add_u32 s0, 0x4004, s0
; UNALIGNED_GFX942-NEXT: v_mov_b32_e32 v2, 15
; UNALIGNED_GFX942-NEXT: s_movk_i32 s1, 0x4004
; UNALIGNED_GFX942-NEXT: v_lshlrev_b32_e32 v0, 2, v0
@@ -1676,7 +1676,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX11-NEXT: v_lshlrev_b32_e32 v1, 2, v1
; UNALIGNED_GFX11-NEXT: s_waitcnt lgkmcnt(0)
; UNALIGNED_GFX11-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX11-NEXT: s_addk_i32 s0, 0x4004
+; UNALIGNED_GFX11-NEXT: s_add_u32 s0, 0x4004, s0
; UNALIGNED_GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1) | instid1(SALU_CYCLE_1)
; UNALIGNED_GFX11-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX11-NEXT: s_movk_i32 s0, 0x4004
@@ -1701,7 +1701,7 @@ define amdgpu_kernel void @store_load_vindex_large_offset_kernel(i32 %n) {
; UNALIGNED_GFX12-NEXT: scratch_store_b32 v0, v2, off offset:16512 scope:SCOPE_SYS
; UNALIGNED_GFX12-NEXT: s_wait_storecnt 0x0
; UNALIGNED_GFX12-NEXT: s_lshl_b32 s0, s0, 7
-; UNALIGNED_GFX12-NEXT: s_addk_co_i32 s0, 0x4000
+; UNALIGNED_GFX12-NEXT: s_add_co_u32 s0, 0x4000, s0
; UNALIGNED_GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; UNALIGNED_GFX12-NEXT: v_add_nc_u32_e32 v1, s0, v1
; UNALIGNED_GFX12-NEXT: scratch_load_b32 v0, v1, off offset:124 scope:SCOPE_SYS
@@ -3803,8 +3803,8 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
; GFX9: ; %bb.0: ; %entry
; GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
-; GFX9-NEXT: s_add_i32 s2, s2, 0xffe8
-; GFX9-NEXT: scratch_load_dword v2, off, s2
+; GFX9-NEXT: s_add_u32 s0, s2, 0xffe8
+; GFX9-NEXT: scratch_load_dword v2, off, s0
; GFX9-NEXT: s_waitcnt vmcnt(0)
; GFX9-NEXT: global_store_dword v[0:1], v2, off
; GFX9-NEXT: s_endpgm
@@ -3815,15 +3815,15 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
; GFX10-NEXT: s_addc_u32 s1, s1, 0
; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s0
; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s1
-; GFX10-NEXT: s_add_i32 s2, s2, 0xffe8
-; GFX10-NEXT: scratch_load_dword v2, off, s2
+; GFX10-NEXT: s_add_u32 s0, s2, 0xffe8
+; GFX10-NEXT: scratch_load_dword v2, off, s0
; GFX10-NEXT: s_waitcnt vmcnt(0)
; GFX10-NEXT: global_store_dword v[0:1], v2, off
; GFX10-NEXT: s_endpgm
;
; GFX942-LABEL: sgpr_base_large_offset:
; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_add_i32 s0, s0, 0xffe8
+; GFX942-NEXT: s_add_u32 s0, s0, 0xffe8
; GFX942-NEXT: scratch_load_dword v2, off, s0
; GFX942-NEXT: s_waitcnt vmcnt(0)
; GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -3831,7 +3831,7 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
;
; GFX11-LABEL: sgpr_base_large_offset:
; GFX11: ; %bb.0: ; %entry
-; GFX11-NEXT: s_add_i32 s0, s0, 0xffe8
+; GFX11-NEXT: s_add_u32 s0, s0, 0xffe8
; GFX11-NEXT: scratch_load_b32 v2, off, s0
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: global_store_b32 v[0:1], v2, off
@@ -3848,8 +3848,8 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
; UNALIGNED_GFX9: ; %bb.0: ; %entry
; UNALIGNED_GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; UNALIGNED_GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
-; UNALIGNED_GFX9-NEXT: s_add_i32 s2, s2, 0xffe8
-; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s2
+; UNALIGNED_GFX9-NEXT: s_add_u32 s0, s2, 0xffe8
+; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s0
; UNALIGNED_GFX9-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX9-NEXT: global_store_dword v[0:1], v2, off
; UNALIGNED_GFX9-NEXT: s_endpgm
@@ -3860,15 +3860,15 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
; UNALIGNED_GFX10-NEXT: s_addc_u32 s1, s1, 0
; UNALIGNED_GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s0
; UNALIGNED_GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s1
-; UNALIGNED_GFX10-NEXT: s_add_i32 s2, s2, 0xffe8
-; UNALIGNED_GFX10-NEXT: scratch_load_dword v2, off, s2
+; UNALIGNED_GFX10-NEXT: s_add_u32 s0, s2, 0xffe8
+; UNALIGNED_GFX10-NEXT: scratch_load_dword v2, off, s0
; UNALIGNED_GFX10-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX10-NEXT: global_store_dword v[0:1], v2, off
; UNALIGNED_GFX10-NEXT: s_endpgm
;
; UNALIGNED_GFX942-LABEL: sgpr_base_large_offset:
; UNALIGNED_GFX942: ; %bb.0: ; %entry
-; UNALIGNED_GFX942-NEXT: s_add_i32 s0, s0, 0xffe8
+; UNALIGNED_GFX942-NEXT: s_add_u32 s0, s0, 0xffe8
; UNALIGNED_GFX942-NEXT: scratch_load_dword v2, off, s0
; UNALIGNED_GFX942-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -3876,7 +3876,7 @@ define amdgpu_gs void @sgpr_base_large_offset(ptr addrspace(1) %out, ptr addrspa
;
; UNALIGNED_GFX11-LABEL: sgpr_base_large_offset:
; UNALIGNED_GFX11: ; %bb.0: ; %entry
-; UNALIGNED_GFX11-NEXT: s_add_i32 s0, s0, 0xffe8
+; UNALIGNED_GFX11-NEXT: s_add_u32 s0, s0, 0xffe8
; UNALIGNED_GFX11-NEXT: scratch_load_b32 v2, off, s0
; UNALIGNED_GFX11-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX11-NEXT: global_store_b32 v[0:1], v2, off
@@ -3901,7 +3901,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
; GFX9-NEXT: s_and_b32 s0, s2, -4
-; GFX9-NEXT: s_add_i32 s0, s0, 0x100ffe8
+; GFX9-NEXT: s_add_u32 s0, s0, 0x100ffe8
; GFX9-NEXT: scratch_load_dword v2, off, s0 glc
; GFX9-NEXT: s_waitcnt vmcnt(0)
; GFX9-NEXT: global_store_dword v[0:1], v2, off
@@ -3914,7 +3914,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s0
; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s1
; GFX10-NEXT: s_and_b32 s0, s2, -4
-; GFX10-NEXT: s_add_i32 s0, s0, 0x100ffe8
+; GFX10-NEXT: s_add_u32 s0, s0, 0x100ffe8
; GFX10-NEXT: scratch_load_dword v2, off, s0 glc dlc
; GFX10-NEXT: s_waitcnt vmcnt(0)
; GFX10-NEXT: global_store_dword v[0:1], v2, off
@@ -3923,7 +3923,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX942-LABEL: sgpr_base_large_offset_split:
; GFX942: ; %bb.0: ; %entry
; GFX942-NEXT: s_and_b32 s0, s0, -4
-; GFX942-NEXT: s_add_i32 s0, s0, 0x100ffe8
+; GFX942-NEXT: s_add_u32 s0, s0, 0x100ffe8
; GFX942-NEXT: scratch_load_dword v2, off, s0 sc0 sc1
; GFX942-NEXT: s_waitcnt vmcnt(0)
; GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -3933,7 +3933,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX11: ; %bb.0: ; %entry
; GFX11-NEXT: s_and_b32 s0, s0, -4
; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_i32 s0, s0, 0x100ffe8
+; GFX11-NEXT: s_add_u32 s0, s0, 0x100ffe8
; GFX11-NEXT: scratch_load_b32 v2, off, s0 glc dlc
; GFX11-NEXT: s_waitcnt vmcnt(0)
; GFX11-NEXT: global_store_b32 v[0:1], v2, off
@@ -3943,7 +3943,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; GFX12: ; %bb.0: ; %entry
; GFX12-NEXT: s_and_b32 s0, s0, -4
; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_co_i32 s0, s0, 0x100ffe8
+; GFX12-NEXT: s_add_co_u32 s0, s0, 0x100ffe8
; GFX12-NEXT: scratch_load_b32 v2, off, s0 scope:SCOPE_SYS
; GFX12-NEXT: s_wait_loadcnt 0x0
; GFX12-NEXT: global_store_b32 v[0:1], v2, off
@@ -3954,7 +3954,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; UNALIGNED_GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
; UNALIGNED_GFX9-NEXT: s_and_b32 s0, s2, -4
-; UNALIGNED_GFX9-NEXT: s_add_i32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX9-NEXT: s_add_u32 s0, s0, 0x100ffe8
; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s0 glc
; UNALIGNED_GFX9-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX9-NEXT: global_store_dword v[0:1], v2, off
@@ -3967,7 +3967,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s0
; UNALIGNED_GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s1
; UNALIGNED_GFX10-NEXT: s_and_b32 s0, s2, -4
-; UNALIGNED_GFX10-NEXT: s_add_i32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX10-NEXT: s_add_u32 s0, s0, 0x100ffe8
; UNALIGNED_GFX10-NEXT: scratch_load_dword v2, off, s0 glc dlc
; UNALIGNED_GFX10-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX10-NEXT: global_store_dword v[0:1], v2, off
@@ -3976,7 +3976,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX942-LABEL: sgpr_base_large_offset_split:
; UNALIGNED_GFX942: ; %bb.0: ; %entry
; UNALIGNED_GFX942-NEXT: s_and_b32 s0, s0, -4
-; UNALIGNED_GFX942-NEXT: s_add_i32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX942-NEXT: s_add_u32 s0, s0, 0x100ffe8
; UNALIGNED_GFX942-NEXT: scratch_load_dword v2, off, s0 sc0 sc1
; UNALIGNED_GFX942-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -3986,7 +3986,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX11: ; %bb.0: ; %entry
; UNALIGNED_GFX11-NEXT: s_and_b32 s0, s0, -4
; UNALIGNED_GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; UNALIGNED_GFX11-NEXT: s_add_i32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX11-NEXT: s_add_u32 s0, s0, 0x100ffe8
; UNALIGNED_GFX11-NEXT: scratch_load_b32 v2, off, s0 glc dlc
; UNALIGNED_GFX11-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX11-NEXT: global_store_b32 v[0:1], v2, off
@@ -3996,7 +3996,7 @@ define amdgpu_gs void @sgpr_base_large_offset_split(ptr addrspace(1) %out, ptr a
; UNALIGNED_GFX12: ; %bb.0: ; %entry
; UNALIGNED_GFX12-NEXT: s_and_b32 s0, s0, -4
; UNALIGNED_GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; UNALIGNED_GFX12-NEXT: s_add_co_i32 s0, s0, 0x100ffe8
+; UNALIGNED_GFX12-NEXT: s_add_co_u32 s0, s0, 0x100ffe8
; UNALIGNED_GFX12-NEXT: scratch_load_b32 v2, off, s0 scope:SCOPE_SYS
; UNALIGNED_GFX12-NEXT: s_wait_loadcnt 0x0
; UNALIGNED_GFX12-NEXT: global_store_b32 v[0:1], v2, off
@@ -4241,8 +4241,8 @@ define amdgpu_gs void @sgpr_base_negative_offset(ptr addrspace(1) %out, ptr addr
; GFX9: ; %bb.0: ; %entry
; GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
-; GFX9-NEXT: s_addk_i32 s2, 0xffe8
-; GFX9-NEXT: scratch_load_dword v2, off, s2
+; GFX9-NEXT: s_add_u32 s0, s2, 0xffffffe8
+; GFX9-NEXT: scratch_load_dword v2, off, s0
; GFX9-NEXT: s_waitcnt vmcnt(0)
; GFX9-NEXT: global_store_dword v[0:1], v2, off
; GFX9-NEXT: s_endpgm
@@ -4260,7 +4260,7 @@ define amdgpu_gs void @sgpr_base_negative_offset(ptr addrspace(1) %out, ptr addr
;
; GFX942-LABEL: sgpr_base_negative_offset:
; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_addk_i32 s0, 0xffe8
+; GFX942-NEXT: s_add_u32 s0, s0, 0xffffffe8
; GFX942-NEXT: scratch_load_dword v2, off, s0
; GFX942-NEXT: s_waitcnt vmcnt(0)
; GFX942-NEXT: global_store_dword v[0:1], v2, off
@@ -4284,8 +4284,8 @@ define amdgpu_gs void @sgpr_base_negative_offset(ptr addrspace(1) %out, ptr addr
; UNALIGNED_GFX9: ; %bb.0: ; %entry
; UNALIGNED_GFX9-NEXT: s_add_u32 flat_scratch_lo, s0, s5
; UNALIGNED_GFX9-NEXT: s_addc_u32 flat_scratch_hi, s1, 0
-; UNALIGNED_GFX9-NEXT: s_addk_i32 s2, 0xffe8
-; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s2
+; UNALIGNED_GFX9-NEXT: s_add_u32 s0, s2, 0xffffffe8
+; UNALIGNED_GFX9-NEXT: scratch_load_dword v2, off, s0
; UNALIGNED_GFX9-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX9-NEXT: global_store_dword v[0:1], v2, off
; UNALIGNED_GFX9-NEXT: s_endpgm
@@ -4303,7 +4303,7 @@ define amdgpu_gs void @sgpr_base_negative_offset(ptr addrspace(1) %out, ptr addr
;
; UNALIGNED_GFX942-LABEL: sgpr_base_negative_offset:
; UNALIGNED_GFX942: ; %bb.0: ; %entry
-; UNALIGNED_GFX942-NEXT: s_addk_i32 s0, 0xffe8
+; UNALIGNED_GFX942-NEXT: s_add_u32 s0, s0, 0xffffffe8
; UNALIGNED_GFX942-NEXT: scratch_load_dword v2, off, s0
; UNALIGNED_GFX942-NEXT: s_waitcnt vmcnt(0)
; UNALIGNED_GFX942-NEXT: global_store_dword v[0:1], v2, off
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll
index 6cc7d8d334690..94853767ccfac 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement-stack-lower.ll
@@ -149,9 +149,9 @@ define amdgpu_kernel void @v_insert_v64i32_varidx(ptr addrspace(1) %out.ptr, ptr
; GCN-NEXT: v_mov_b32_e32 v0, s51
; GCN-NEXT: s_lshl_b32 s4, s4, 2
; GCN-NEXT: buffer_store_dword v0, off, s[0:3], 0 offset:252
-; GCN-NEXT: v_mov_b32_e32 v0, s4
-; GCN-NEXT: v_mov_b32_e32 v1, s24
-; GCN-NEXT: buffer_store_dword v1, v0, s[0:3], 0 offen
+; GCN-NEXT: v_mov_b32_e32 v0, s24
+; GCN-NEXT: v_mov_b32_e32 v1, s4
+; GCN-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GCN-NEXT: buffer_load_dword v0, off, s[0:3], 0
; GCN-NEXT: s_nop 0
; GCN-NEXT: buffer_load_dword v1, off, s[0:3], 0 offset:4
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-local.mir
index 41916dd2973c1..e90e7d95b4c1e 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-local.mir
@@ -25,7 +25,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX7-LABEL: name: atomic_cmpxchg_s32_local
; GFX7: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -35,7 +34,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX9-LABEL: name: atomic_cmpxchg_s32_local
; GFX9: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -44,7 +42,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[DS_CMPST_RTN_B32_gfx9_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32_gfx9 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX9-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_gfx9_]]
- ;
; GFX11-LABEL: name: atomic_cmpxchg_s32_local
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}
@@ -77,11 +74,10 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: %4:vgpr_32, dead %6:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
- ; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[V_ADD_CO_U32_e64_]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
+ ; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 %4, [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX7-LABEL: name: atomic_cmpxchg_s32_local_gep4
; GFX7: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -91,7 +87,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 4, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX9-LABEL: name: atomic_cmpxchg_s32_local_gep4
; GFX9: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -100,7 +95,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[DS_CMPST_RTN_B32_gfx9_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32_gfx9 [[COPY]], [[COPY1]], [[COPY2]], 4, 0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX9-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_gfx9_]]
- ;
; GFX11-LABEL: name: atomic_cmpxchg_s32_local_gep4
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}
@@ -137,7 +131,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX7-LABEL: name: atomic_cmpxchg_s64_local
; GFX7: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX7-NEXT: {{ $}}
@@ -147,7 +140,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX9-LABEL: name: atomic_cmpxchg_s64_local
; GFX9: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX9-NEXT: {{ $}}
@@ -156,7 +148,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4
; GFX9-NEXT: [[DS_CMPST_RTN_B64_gfx9_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64_gfx9 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_gfx9_]]
- ;
; GFX11-LABEL: name: atomic_cmpxchg_s64_local
; GFX11: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX11-NEXT: {{ $}}
@@ -191,7 +182,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX7-LABEL: name: atomic_cmpxchg_s64_local_gep4
; GFX7: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX7-NEXT: {{ $}}
@@ -201,7 +191,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX9-LABEL: name: atomic_cmpxchg_s64_local_gep4
; GFX9: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX9-NEXT: {{ $}}
@@ -210,7 +199,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4
; GFX9-NEXT: [[DS_CMPST_RTN_B64_gfx9_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64_gfx9 [[COPY]], [[COPY1]], [[COPY2]], 0, 0, implicit $exec :: (load store seq_cst (s64), addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_gfx9_]]
- ;
; GFX11-LABEL: name: atomic_cmpxchg_s64_local_gep4
; GFX11: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX11-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-region.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-region.mir
index 761fe5a79ebd7..faad869f67a62 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-region.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomic-cmpxchg-region.mir
@@ -25,7 +25,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX6-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX7-LABEL: name: atomic_cmpxchg_s32_region
; GFX7: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -35,7 +34,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX7-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX9-LABEL: name: atomic_cmpxchg_s32_region
; GFX9: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -44,7 +42,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX9-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX11-LABEL: name: atomic_cmpxchg_s32_region
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}
@@ -77,11 +74,10 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: %4:vgpr_32, dead %6:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
- ; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[V_ADD_CO_U32_e64_]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
+ ; GFX6-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 %4, [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX6-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX7-LABEL: name: atomic_cmpxchg_s32_region_gep4
; GFX7: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -91,7 +87,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 4, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX7-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX9-LABEL: name: atomic_cmpxchg_s32_region_gep4
; GFX9: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -100,7 +95,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: [[DS_CMPST_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_CMPST_RTN_B32 [[COPY]], [[COPY1]], [[COPY2]], 4, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX9-NEXT: $vgpr0 = COPY [[DS_CMPST_RTN_B32_]]
- ;
; GFX11-LABEL: name: atomic_cmpxchg_s32_region_gep4
; GFX11: liveins: $vgpr0, $vgpr1, $vgpr2
; GFX11-NEXT: {{ $}}
@@ -137,7 +131,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX7-LABEL: name: atomic_cmpxchg_s64_region
; GFX7: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX7-NEXT: {{ $}}
@@ -147,7 +140,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX9-LABEL: name: atomic_cmpxchg_s64_region
; GFX9: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX9-NEXT: {{ $}}
@@ -156,7 +148,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4
; GFX9-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX11-LABEL: name: atomic_cmpxchg_s64_region
; GFX11: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX11-NEXT: {{ $}}
@@ -191,7 +182,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX7-LABEL: name: atomic_cmpxchg_s64_region_gep4
; GFX7: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX7-NEXT: {{ $}}
@@ -201,7 +191,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX9-LABEL: name: atomic_cmpxchg_s64_region_gep4
; GFX9: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX9-NEXT: {{ $}}
@@ -210,7 +199,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vreg_64 = COPY $vgpr3_vgpr4
; GFX9-NEXT: [[DS_CMPST_RTN_B64_:%[0-9]+]]:vreg_64 = DS_CMPST_RTN_B64 [[COPY]], [[COPY1]], [[COPY2]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s64), addrspace 2)
; GFX9-NEXT: $vgpr0_vgpr1 = COPY [[DS_CMPST_RTN_B64_]]
- ;
; GFX11-LABEL: name: atomic_cmpxchg_s64_region_gep4
; GFX11: liveins: $vgpr0, $vgpr1_vgpr2, $vgpr3_vgpr4
; GFX11-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-local.mir
index ecd3160088852..f9fc72a65dbd9 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-local.mir
@@ -24,7 +24,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
- ;
; GFX7-LABEL: name: atomicrmw_xchg_s32_local
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -33,7 +32,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
- ;
; GFX9-LABEL: name: atomicrmw_xchg_s32_local
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
@@ -63,11 +61,10 @@ body: |
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: %3:vgpr_32, dead %5:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
- ; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[V_ADD_CO_U32_e64_]], [[COPY1]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
+ ; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 %3, [[COPY1]], 0, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
- ;
; GFX7-LABEL: name: atomicrmw_xchg_s32_local_gep4
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -76,7 +73,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 4, 0, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
- ;
; GFX9-LABEL: name: atomicrmw_xchg_s32_local_gep4
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-region.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-region.mir
index bb94760ee6857..551261ee8a76b 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-region.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-atomicrmw-xchg-region.mir
@@ -24,7 +24,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX6-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
- ;
; GFX7-LABEL: name: atomicrmw_xchg_s32_region
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -33,7 +32,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX7-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
- ;
; GFX9-LABEL: name: atomicrmw_xchg_s32_region
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
@@ -63,11 +61,10 @@ body: |
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: %3:vgpr_32, dead %5:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
- ; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[V_ADD_CO_U32_e64_]], [[COPY1]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
+ ; GFX6-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 %3, [[COPY1]], 0, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX6-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
- ;
; GFX7-LABEL: name: atomicrmw_xchg_s32_region_gep4
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -76,7 +73,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_WRXCHG_RTN_B32_:%[0-9]+]]:vgpr_32 = DS_WRXCHG_RTN_B32 [[COPY]], [[COPY1]], 4, 1, implicit $m0, implicit $exec :: (load store seq_cst (s32), addrspace 2)
; GFX7-NEXT: $vgpr0 = COPY [[DS_WRXCHG_RTN_B32_]]
- ;
; GFX9-LABEL: name: atomicrmw_xchg_s32_region_gep4
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-local.mir
index 99e7529da5dc1..5c2df3904b817 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-atomic-local.mir
@@ -24,7 +24,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_B32_]]
- ;
; GFX7-LABEL: name: load_atomic_local_s32_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -32,7 +31,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_READ_B32_]]
- ;
; GFX9-LABEL: name: load_atomic_local_s32_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -63,7 +61,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vgpr_32(<2 x s16>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<2 x s16>), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[LOAD]](<2 x s16>)
- ;
; GFX7-LABEL: name: load_atomic_local_v2s16_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -71,7 +68,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vgpr_32(<2 x s16>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<2 x s16>), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[LOAD]](<2 x s16>)
- ;
; GFX9-LABEL: name: load_atomic_local_v2s16_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -102,7 +98,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vgpr_32(p3) = G_LOAD [[COPY]](p3) :: (load seq_cst (p3), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[LOAD]](p3)
- ;
; GFX7-LABEL: name: load_atomic_local_p3_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -110,7 +105,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vgpr_32(p3) = G_LOAD [[COPY]](p3) :: (load seq_cst (p3), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[LOAD]](p3)
- ;
; GFX9-LABEL: name: load_atomic_local_p3_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -141,7 +135,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_B64_:%[0-9]+]]:vreg_64 = DS_READ_B64 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s64), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[DS_READ_B64_]]
- ;
; GFX7-LABEL: name: load_atomic_local_s64_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -149,7 +142,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_B64_:%[0-9]+]]:vreg_64 = DS_READ_B64 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s64), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_READ_B64_]]
- ;
; GFX9-LABEL: name: load_atomic_local_s64_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -180,7 +172,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vreg_64(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<2 x s32>), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](<2 x s32>)
- ;
; GFX7-LABEL: name: load_atomic_local_v2s32_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -188,7 +179,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_64(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<2 x s32>), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](<2 x s32>)
- ;
; GFX9-LABEL: name: load_atomic_local_v2s32_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -219,7 +209,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vreg_64(<4 x s16>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<4 x s16>), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](<4 x s16>)
- ;
; GFX7-LABEL: name: load_atomic_local_v4s16_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -227,7 +216,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_64(<4 x s16>) = G_LOAD [[COPY]](p3) :: (load seq_cst (<4 x s16>), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](<4 x s16>)
- ;
; GFX9-LABEL: name: load_atomic_local_v4s16_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -258,7 +246,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vreg_64(p1) = G_LOAD [[COPY]](p3) :: (load seq_cst (p1), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](p1)
- ;
; GFX7-LABEL: name: load_atomic_local_p1_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -266,7 +253,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_64(p1) = G_LOAD [[COPY]](p3) :: (load seq_cst (p1), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](p1)
- ;
; GFX9-LABEL: name: load_atomic_local_p1_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -297,7 +283,6 @@ body: |
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[LOAD:%[0-9]+]]:vreg_64(p0) = G_LOAD [[COPY]](p3) :: (load seq_cst (p0), addrspace 3)
; GFX6-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](p0)
- ;
; GFX7-LABEL: name: load_atomic_local_p0_seq_cst
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -305,7 +290,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_64(p0) = G_LOAD [[COPY]](p3) :: (load seq_cst (p0), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[LOAD]](p0)
- ;
; GFX9-LABEL: name: load_atomic_local_p0_seq_cst
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -334,11 +318,10 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load seq_cst (s32), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_B32_]]
- ;
; GFX7-LABEL: name: load_atomic_local_s32_seq_cst_gep_65535
; GFX7: liveins: $vgpr0
; GFX7-NEXT: {{ $}}
@@ -346,7 +329,6 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[COPY]], 65535, 0, implicit $m0, implicit $exec :: (load seq_cst (s32), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_READ_B32_]]
- ;
; GFX9-LABEL: name: load_atomic_local_s32_seq_cst_gep_65535
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
index 59550bf5fe9c9..94104885748a9 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local-128.mir
@@ -22,21 +22,18 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_B128_:%[0-9]+]]:vreg_128 = DS_READ_B128 [[COPY]], 0, 0, implicit $m0, implicit $exec :: (load (<4 x s32>), addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ_B128_]]
- ;
; GFX9-LABEL: name: load_local_v4s32_align16
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ_B128_gfx9_:%[0-9]+]]:vreg_128 = DS_READ_B128_gfx9 [[COPY]], 0, 0, implicit $exec :: (load (<4 x s32>), addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ_B128_gfx9_]]
- ;
; GFX10-LABEL: name: load_local_v4s32_align16
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ_B128_gfx9_:%[0-9]+]]:vreg_128 = DS_READ_B128_gfx9 [[COPY]], 0, 0, implicit $exec :: (load (<4 x s32>), addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ_B128_gfx9_]]
- ;
; GFX11-LABEL: name: load_local_v4s32_align16
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -67,21 +64,18 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[COPY]], 0, 1, 0, implicit $m0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
- ;
; GFX9-LABEL: name: load_local_v4s32_align_8
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX10-LABEL: name: load_local_v4s32_align_8
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX11-LABEL: name: load_local_v4s32_align_8
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -112,21 +106,18 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[COPY]], 50, 51, 0, implicit $m0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
- ;
; GFX9-LABEL: name: load_local_v4s32_align_8_offset_160
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 50, 51, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX10-LABEL: name: load_local_v4s32_align_8_offset_160
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 50, 51, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX11-LABEL: name: load_local_v4s32_align_8_offset_160
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -157,11 +148,10 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4000, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[V_ADD_CO_U32_e64_]], 0, 1, 0, implicit $m0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
- ;
; GFX9-LABEL: name: load_local_v4s32_align_8_offset_320
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
@@ -170,7 +160,6 @@ body: |
; GFX9-NEXT: [[V_ADD_U32_e64_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[V_ADD_U32_e64_]], 0, 1, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX10-LABEL: name: load_local_v4s32_align_8_offset_320
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
@@ -179,7 +168,6 @@ body: |
; GFX10-NEXT: [[V_ADD_U32_e64_:%[0-9]+]]:vgpr_32 = V_ADD_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[V_ADD_U32_e64_]], 0, 1, 0, implicit $exec :: (load (<4 x s32>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX11-LABEL: name: load_local_v4s32_align_8_offset_320
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -214,21 +202,18 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[COPY]], 0, 1, 0, implicit $m0, implicit $exec :: (load (<2 x s64>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
- ;
; GFX9-LABEL: name: load_local_v2s64
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<2 x s64>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX10-LABEL: name: load_local_v2s64
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<2 x s64>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX11-LABEL: name: load_local_v2s64
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -259,21 +244,18 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_128(<2 x p1>) = G_LOAD [[COPY]](p3) :: (load (<2 x p1>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<2 x p1>)
- ;
; GFX9-LABEL: name: load_local_v2p1
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr(p3) = COPY $vgpr0
; GFX9-NEXT: [[LOAD:%[0-9]+]]:vreg_128(<2 x p1>) = G_LOAD [[COPY]](p3) :: (load (<2 x p1>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<2 x p1>)
- ;
; GFX10-LABEL: name: load_local_v2p1
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr(p3) = COPY $vgpr0
; GFX10-NEXT: [[LOAD:%[0-9]+]]:vreg_128(<2 x p1>) = G_LOAD [[COPY]](p3) :: (load (<2 x p1>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<2 x p1>)
- ;
; GFX11-LABEL: name: load_local_v2p1
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -304,21 +286,18 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[LOAD:%[0-9]+]]:vreg_128(s128) = G_LOAD [[COPY]](p3) :: (load (s128), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](s128)
- ;
; GFX9-LABEL: name: load_local_s128
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr(p3) = COPY $vgpr0
; GFX9-NEXT: [[LOAD:%[0-9]+]]:vreg_128(s128) = G_LOAD [[COPY]](p3) :: (load (s128), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](s128)
- ;
; GFX10-LABEL: name: load_local_s128
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr(p3) = COPY $vgpr0
; GFX10-NEXT: [[LOAD:%[0-9]+]]:vreg_128(s128) = G_LOAD [[COPY]](p3) :: (load (s128), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](s128)
- ;
; GFX11-LABEL: name: load_local_s128
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
@@ -349,21 +328,18 @@ body: |
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B64_:%[0-9]+]]:vreg_128 = DS_READ2_B64 [[COPY]], 0, 1, 0, implicit $m0, implicit $exec :: (load (<8 x s16>), align 8, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_]]
- ;
; GFX9-LABEL: name: load_local_v8s16
; GFX9: liveins: $vgpr0
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<8 x s16>), align 8, addrspace 3)
; GFX9-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX10-LABEL: name: load_local_v8s16
; GFX10: liveins: $vgpr0
; GFX10-NEXT: {{ $}}
; GFX10-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX10-NEXT: [[DS_READ2_B64_gfx9_:%[0-9]+]]:vreg_128 = DS_READ2_B64_gfx9 [[COPY]], 0, 1, 0, implicit $exec :: (load (<8 x s16>), align 8, addrspace 3)
; GFX10-NEXT: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[DS_READ2_B64_gfx9_]]
- ;
; GFX11-LABEL: name: load_local_v8s16
; GFX11: liveins: $vgpr0
; GFX11-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir
index e03e0cb95e445..59c57a5fefbed 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-local.mir
@@ -740,7 +740,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65535, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -849,7 +849,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65536, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -859,7 +859,7 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 65536, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -905,7 +905,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -1, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -915,7 +915,7 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -1, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX7-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
@@ -1021,7 +1021,7 @@ body: |
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1020, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[DS_READ2_B32_:%[0-9]+]]:vreg_64 = DS_READ2_B32 [[V_ADD_CO_U32_e64_]], 0, 1, 0, implicit $m0, implicit $exec :: (load (s64), align 4, addrspace 3)
; GFX7-NEXT: $vgpr0_vgpr1 = COPY [[DS_READ2_B32_]]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir
index df72ad7bf0c0e..df148638e7005 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-private.mir
@@ -312,7 +312,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2047, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -416,7 +416,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 2048, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -469,7 +469,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -2047, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -522,7 +522,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -2048, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -575,7 +575,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -628,7 +628,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -683,7 +683,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -736,7 +736,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -4096, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -789,7 +789,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 8191, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -844,7 +844,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 8192, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -899,7 +899,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -8191, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -954,7 +954,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -8192, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1009,7 +1009,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 8388607, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1064,7 +1064,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 16777214, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1121,7 +1121,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -8388608, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1176,7 +1176,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 -16777215, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
@@ -1490,7 +1490,7 @@ body: |
; GFX6-LABEL: name: load_private_s32_from_1_fi_offset_4096
; GFX6: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 %stack.0, implicit $exec
; GFX6-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, implicit $exec
; GFX6-NEXT: [[BUFFER_LOAD_UBYTE_OFFEN:%[0-9]+]]:vgpr_32 = BUFFER_LOAD_UBYTE_OFFEN [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (load (s8), addrspace 5)
; GFX6-NEXT: $vgpr0 = COPY [[BUFFER_LOAD_UBYTE_OFFEN]]
;
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-add3.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-add3.mir
index 8b65e719d169a..681c366aa3411 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-add3.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-pattern-add3.mir
@@ -131,8 +131,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[V_ADD_CO_U32_e64_]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[V_ADD_CO_U32_e64_]], [[COPY2]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_2]]
;
; GFX9-LABEL: name: add_p3_vgpr_vgpr_vgpr
@@ -168,8 +168,8 @@ body: |
; GFX8-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[V_ADD_CO_U32_e64_]], [[COPY2]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[V_ADD_CO_U32_e64_]], [[COPY2]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_2]]
;
; GFX9-LABEL: name: add_p5_vgpr_vgpr_vgpr
@@ -206,7 +206,7 @@ body: |
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY2]], [[V_ADD_CO_U32_e64_]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[V_ADD_CO_U32_e64_]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_2]]
;
; GFX9-LABEL: name: add_p3_s32_vgpr_vgpr_vgpr
@@ -244,7 +244,7 @@ body: |
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX8-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY2]], [[V_ADD_CO_U32_e64_]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_2:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_3:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY2]], [[V_ADD_CO_U32_e64_]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_2]]
;
; GFX9-LABEL: name: add_p5_s32_vgpr_vgpr_vgpr
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
index d1cf3bfc29ff0..41e416e3f5d72 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-ptr-add.mir
@@ -272,40 +272,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX6-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX8-LABEL: name: gep_p3_sgpr_sgpr
; GFX8: liveins: $sgpr0, $sgpr1
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX8-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX9-LABEL: name: gep_p3_sgpr_sgpr
; GFX9: liveins: $sgpr0, $sgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX9-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX10-WAVE64-LABEL: name: gep_p3_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0, $sgpr1
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE64-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX10-WAVE32-LABEL: name: gep_p3_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0, $sgpr1
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE32-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
%0:sgpr(p3) = COPY $sgpr0
%1:sgpr(s32) = COPY $sgpr1
%2:sgpr(p3) = G_PTR_ADD %0, %1
@@ -326,7 +326,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_]]
;
; GFX8-LABEL: name: gep_p3_vgpr_vgpr
@@ -334,7 +334,7 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_]]
;
; GFX9-LABEL: name: gep_p3_vgpr_vgpr
@@ -380,7 +380,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; GFX6-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_]]
;
; GFX8-LABEL: name: gep_p3_sgpr_vgpr
@@ -388,7 +388,7 @@ body: |
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr0
- ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
+ ; GFX8-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[COPY1]], 0, implicit $exec
; GFX8-NEXT: S_ENDPGM 0, implicit [[V_ADD_CO_U32_e64_]]
;
; GFX9-LABEL: name: gep_p3_sgpr_vgpr
@@ -434,40 +434,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX6-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX8-LABEL: name: gep_p6_sgpr_sgpr
; GFX8: liveins: $sgpr0, $sgpr1
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX8-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX9-LABEL: name: gep_p6_sgpr_sgpr
; GFX9: liveins: $sgpr0, $sgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX9-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX10-WAVE64-LABEL: name: gep_p6_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0, $sgpr1
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE64-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX10-WAVE32-LABEL: name: gep_p6_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0, $sgpr1
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE32-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
%0:sgpr(p6) = COPY $sgpr0
%1:sgpr(s32) = COPY $sgpr1
%2:sgpr(p6) = G_PTR_ADD %0, %1
@@ -488,40 +488,40 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX6-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX6-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX6-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX6-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX8-LABEL: name: gep_p2_sgpr_sgpr
; GFX8: liveins: $sgpr0, $sgpr1
; GFX8-NEXT: {{ $}}
; GFX8-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX8-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX8-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX8-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX8-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX9-LABEL: name: gep_p2_sgpr_sgpr
; GFX9: liveins: $sgpr0, $sgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX9-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX9-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX9-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX10-WAVE64-LABEL: name: gep_p2_sgpr_sgpr
; GFX10-WAVE64: liveins: $sgpr0, $sgpr1
; GFX10-WAVE64-NEXT: {{ $}}
; GFX10-WAVE64-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE64-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE64-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX10-WAVE64-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE64-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
;
; GFX10-WAVE32-LABEL: name: gep_p2_sgpr_sgpr
; GFX10-WAVE32: liveins: $sgpr0, $sgpr1
; GFX10-WAVE32-NEXT: {{ $}}
; GFX10-WAVE32-NEXT: [[COPY:%[0-9]+]]:sreg_32 = COPY $sgpr0
; GFX10-WAVE32-NEXT: [[COPY1:%[0-9]+]]:sreg_32 = COPY $sgpr1
- ; GFX10-WAVE32-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = S_ADD_I32 [[COPY]], [[COPY1]], implicit-def dead $scc
- ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_I32_]]
+ ; GFX10-WAVE32-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[COPY]], [[COPY1]], implicit-def dead $scc
+ ; GFX10-WAVE32-NEXT: S_ENDPGM 0, implicit [[S_ADD_U32_]]
%0:sgpr(p2) = COPY $sgpr0
%1:sgpr(s32) = COPY $sgpr1
%2:sgpr(p2) = G_PTR_ADD %0, %1
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sextload-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sextload-local.mir
index 778b1d8fe6234..37958480d28a5 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sextload-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-sextload-local.mir
@@ -113,7 +113,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_I8_:%[0-9]+]]:vgpr_32 = DS_READ_I8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_I8_]]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir
index 7b609c44d78cd..1b7c0fcb76714 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-local.mir
@@ -27,7 +27,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s32), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s32_to_4
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -35,14 +34,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s32), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s32_to_4
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B32_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (s32), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s32_to_4
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -76,7 +73,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B16 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s16), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s32_to_2
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -84,14 +80,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B16 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s16), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s32_to_2
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B16_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (s16), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s32_to_2
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -125,7 +119,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B8 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s32_to_1
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -133,14 +126,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B8 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s32_to_1
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B8_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s32_to_1
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -174,7 +165,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<2 x s16>), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_v2s16
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -182,14 +172,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<2 x s16>), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_v2s16
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B32_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (<2 x s16>), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_v2s16
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -223,7 +211,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (p3), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_p3
; GFX7: liveins: $vgpr0, $vgpr1
; GFX7-NEXT: {{ $}}
@@ -231,14 +218,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B32 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (p3), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_p3
; GFX9: liveins: $vgpr0, $vgpr1
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr1
; GFX9-NEXT: DS_WRITE_B32_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (p3), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_p3
; GFX10: liveins: $vgpr0, $vgpr1
; GFX10-NEXT: {{ $}}
@@ -266,18 +251,15 @@ body: |
; GFX6-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B8 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s32_to_1_constant_4095
; GFX7: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
; GFX7-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B8 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s32_to_1_constant_4095
; GFX9: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
; GFX9-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX9-NEXT: DS_WRITE_B8_gfx9 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s32_to_1_constant_4095
; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
; GFX10-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
@@ -308,18 +290,15 @@ body: |
; GFX6-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B8 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s32_to_1_constant_4096
; GFX7: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
; GFX7-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B8 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $m0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s32_to_1_constant_4096
; GFX9: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
; GFX9-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
; GFX9-NEXT: DS_WRITE_B8_gfx9 [[V_MOV_B32_e32_]], [[V_MOV_B32_e32_1]], 0, 0, implicit $exec :: (store (s8), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s32_to_1_constant_4096
; GFX10: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4096, implicit $exec
; GFX10-NEXT: [[V_MOV_B32_e32_1:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
@@ -351,7 +330,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr(p3) = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](s64), [[COPY1]](p3) :: (store (s64), align 4, addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s64_align4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -361,7 +339,6 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (s64), align 4, addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s64_align4
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -370,7 +347,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (s64), align 4, addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s64_align4
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -406,7 +382,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr(p3) = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](p1), [[COPY1]](p3) :: (store (p1), align 4, addrspace 3)
- ;
; GFX7-LABEL: name: store_local_p1_align4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -416,7 +391,6 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (p1), align 4, addrspace 3)
- ;
; GFX9-LABEL: name: store_local_p1_align4
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -425,7 +399,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (p1), align 4, addrspace 3)
- ;
; GFX10-LABEL: name: store_local_p1_align4
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -461,7 +434,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr(p3) = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](<2 x s32>), [[COPY1]](p3) :: (store (<2 x s32>), align 4, addrspace 3)
- ;
; GFX7-LABEL: name: store_local_v2s32_align4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -471,7 +443,6 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (<2 x s32>), align 4, addrspace 3)
- ;
; GFX9-LABEL: name: store_local_v2s32_align4
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -480,7 +451,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (<2 x s32>), align 4, addrspace 3)
- ;
; GFX10-LABEL: name: store_local_v2s32_align4
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -516,7 +486,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr(p3) = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](<4 x s16>), [[COPY1]](p3) :: (store (<4 x s16>), align 4, addrspace 3)
- ;
; GFX7-LABEL: name: store_local_v4s16_align4
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -526,7 +495,6 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (<4 x s16>), align 4, addrspace 3)
- ;
; GFX9-LABEL: name: store_local_v4s16_align4
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -535,7 +503,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (<4 x s16>), align 4, addrspace 3)
- ;
; GFX10-LABEL: name: store_local_v4s16_align4
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -571,7 +538,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s64), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s64_align8
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -579,14 +545,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (s64), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s64_align8
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: DS_WRITE_B64_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (s64), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s64_align8
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -620,7 +584,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (p1), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_p1_align8
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -628,14 +591,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (p1), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_p1_align8
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: DS_WRITE_B64_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (p1), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_p1_align8
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -669,7 +630,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<2 x s32>), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_v2s32_align8
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -677,14 +637,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<2 x s32>), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_v2s32_align8
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: DS_WRITE_B64_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (<2 x s32>), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_v2s32_align8
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -718,7 +676,6 @@ body: |
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<4 x s16>), addrspace 3)
- ;
; GFX7-LABEL: name: store_local_v4s16_align8
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -726,14 +683,12 @@ body: |
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: DS_WRITE_B64 [[COPY1]], [[COPY]], 0, 0, implicit $m0, implicit $exec :: (store (<4 x s16>), addrspace 3)
- ;
; GFX9-LABEL: name: store_local_v4s16_align8
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
; GFX9-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX9-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX9-NEXT: DS_WRITE_B64_gfx9 [[COPY1]], [[COPY]], 0, 0, implicit $exec :: (store (<4 x s16>), addrspace 3)
- ;
; GFX10-LABEL: name: store_local_v4s16_align8
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -769,7 +724,6 @@ body: |
; GFX6-NEXT: [[PTR_ADD:%[0-9]+]]:vgpr(p3) = G_PTR_ADD [[COPY1]], [[C]](s32)
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](s64), [[PTR_ADD]](p3) :: (store (s64), align 4, addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s64_align4_from_1_gep_1016
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
@@ -779,7 +733,6 @@ body: |
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[COPY1]], [[COPY3]], [[COPY2]], 254, 255, 0, implicit $m0, implicit $exec :: (store (s64), align 4, addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s64_align4_from_1_gep_1016
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -788,7 +741,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[COPY1]], [[COPY3]], [[COPY2]], 254, 255, 0, implicit $exec :: (store (s64), align 4, addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s64_align4_from_1_gep_1016
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
@@ -828,19 +780,17 @@ body: |
; GFX6-NEXT: [[PTR_ADD:%[0-9]+]]:vgpr(p3) = G_PTR_ADD [[COPY1]], [[C]](s32)
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: G_STORE [[COPY]](s64), [[PTR_ADD]](p3) :: (store (s64), align 4, addrspace 3)
- ;
; GFX7-LABEL: name: store_local_s64_align4_from_1_gep_1020
; GFX7: liveins: $vgpr0_vgpr1, $vgpr2
; GFX7-NEXT: {{ $}}
; GFX7-NEXT: [[COPY:%[0-9]+]]:vreg_64 = COPY $vgpr0_vgpr1
; GFX7-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY $vgpr2
; GFX7-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 1020, implicit $exec
- ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY1]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX7-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY1]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX7-NEXT: $m0 = S_MOV_B32 -1
; GFX7-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX7-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX7-NEXT: DS_WRITE2_B32 [[V_ADD_CO_U32_e64_]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $m0, implicit $exec :: (store (s64), align 4, addrspace 3)
- ;
; GFX9-LABEL: name: store_local_s64_align4_from_1_gep_1020
; GFX9: liveins: $vgpr0_vgpr1, $vgpr2
; GFX9-NEXT: {{ $}}
@@ -851,7 +801,6 @@ body: |
; GFX9-NEXT: [[COPY2:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub1
; GFX9-NEXT: [[COPY3:%[0-9]+]]:vgpr_32 = COPY [[COPY]].sub0
; GFX9-NEXT: DS_WRITE2_B32_gfx9 [[V_ADD_U32_e64_]], [[COPY3]], [[COPY2]], 0, 1, 0, implicit $exec :: (store (s64), align 4, addrspace 3)
- ;
; GFX10-LABEL: name: store_local_s64_align4_from_1_gep_1020
; GFX10: liveins: $vgpr0_vgpr1, $vgpr2
; GFX10-NEXT: {{ $}}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir
index 4c966e474dedf..60357abbc7721 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-store-private.mir
@@ -1006,7 +1006,7 @@ body: |
; GFX6-NEXT: [[V_LSHRREV_B32_e64_:%[0-9]+]]:vgpr_32 = V_LSHRREV_B32_e64 6, $sgpr32, implicit $exec
; GFX6-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 4096
; GFX6-NEXT: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_]]
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[V_LSHRREV_B32_e64_]], [[COPY1]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[V_LSHRREV_B32_e64_]], [[COPY1]], 0, implicit $exec
; GFX6-NEXT: BUFFER_STORE_DWORD_OFFEN [[COPY]], [[V_ADD_CO_U32_e64_]], $sgpr0_sgpr1_sgpr2_sgpr3, 0, 0, 0, 0, implicit $exec :: (store (s32), addrspace 5)
;
; GFX9-LABEL: name: function_store_private_s32_to_4_wave_address_offset_copy_constant_4096
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zextload-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zextload-local.mir
index 0fad45e817f51..29671c13e173f 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zextload-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-zextload-local.mir
@@ -113,7 +113,7 @@ body: |
; GFX6-NEXT: {{ $}}
; GFX6-NEXT: [[COPY:%[0-9]+]]:vgpr_32 = COPY $vgpr0
; GFX6-NEXT: [[V_MOV_B32_e32_:%[0-9]+]]:vgpr_32 = V_MOV_B32_e32 4095, implicit $exec
- ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64 = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
+ ; GFX6-NEXT: [[V_ADD_CO_U32_e64_:%[0-9]+]]:vgpr_32, dead [[V_ADD_CO_U32_e64_1:%[0-9]+]]:sreg_64_xexec = V_ADD_CO_U32_e64 [[COPY]], [[V_MOV_B32_e32_]], 0, implicit $exec
; GFX6-NEXT: $m0 = S_MOV_B32 -1
; GFX6-NEXT: [[DS_READ_U8_:%[0-9]+]]:vgpr_32 = DS_READ_U8 [[V_ADD_CO_U32_e64_]], 0, 0, implicit $m0, implicit $exec :: (load (s8), addrspace 3)
; GFX6-NEXT: $vgpr0 = COPY [[DS_READ_U8_]]
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-relocs.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-relocs.ll
index e62cfda72eefb..0b9f31e3a765e 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-relocs.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-relocs.ll
@@ -5,7 +5,7 @@
@lds.defined = unnamed_addr addrspace(3) global [8 x i32] poison, align 8
; GCN-LABEL: {{^}}test_basic:
-; GCN: s_add_i32 s0, s0, lds.defined@abs32@lo ; encoding: [0x00,0xff,0x00,0x81,A,A,A,A]
+; GCN: s_add_u32 s0, lds.defined@abs32@lo, s0 ; encoding: [0xff,0x00,0x00,0x80,A,A,A,A]
; GCN: v_mov_b32_e32 v2, s0 ; encoding: [0x00,0x02,0x04,0x7e]
; GCN: .globl lds.external
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll
index 7a82ef94c78f3..39dde4bc86bec 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll
@@ -15,10 +15,10 @@ define amdgpu_kernel void @load_zeroinit_lds_global(ptr addrspace(1) %out, i1 %p
; GCN: [[COPY:%[0-9]+]]:sreg_64 = COPY $sgpr4_sgpr5
; GFX8: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 40
; GCN: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 target-flags(amdgpu-abs32-lo) @lds
- ; GFX8: [[S_ADD_I32_:%[0-9]+]]:sreg_32 = nuw S_ADD_I32 [[S_MOV_B32_1]], [[S_MOV_B32_]], implicit-def dead $scc
+ ; GFX8: [[S_ADD_U32_:%[0-9]+]]:sreg_32 = S_ADD_U32 [[S_MOV_B32_1]], [[S_MOV_B32_]], implicit-def dead $scc
; GFX8: [[S_LOAD_DWORDX2_IMM:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM [[COPY]], 9, 0
; GFX9: [[S_LOAD_DWORDX2_IMM:%[0-9]+]]:sreg_64_xexec = S_LOAD_DWORDX2_IMM [[COPY]], 36, 0
- ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[S_ADD_I32_]]
+ ; GFX8: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[S_ADD_U32_]]
; GCN: $m0 = S_MOV_B32 -1
; GFX9: [[COPY1:%[0-9]+]]:vgpr_32 = COPY [[S_MOV_B32_1]]
; GFX8: [[DS_READ_B32_:%[0-9]+]]:vgpr_32 = DS_READ_B32 [[COPY1]], 0, 0, implicit $m0, implicit $exec
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll b/llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll
index 6d5b6d6fb1df2..8bb060fc4ed54 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/non-entry-alloca.ll
@@ -34,13 +34,13 @@ define amdgpu_kernel void @kernel_non_entry_block_static_alloca_uniformly_reache
; GCN-NEXT: v_mov_b32_e32 v3, 1
; GCN-NEXT: s_waitcnt lgkmcnt(0)
; GCN-NEXT: s_lshl_b32 s4, s4, 2
-; GCN-NEXT: s_add_i32 s4, s6, s4
+; GCN-NEXT: s_add_u32 s4, s6, s4
; GCN-NEXT: buffer_store_dword v1, v2, s[0:3], 0 offen
; GCN-NEXT: buffer_store_dword v3, v2, s[0:3], 0 offen offset:4
; GCN-NEXT: v_mov_b32_e32 v2, s4
; GCN-NEXT: buffer_load_dword v2, v2, s[0:3], 0 offen
; GCN-NEXT: s_load_dwordx2 s[4:5], s[8:9], 0x0
-; GCN-NEXT: s_add_i32 s32, s6, 0x1000
+; GCN-NEXT: s_add_u32 s32, s6, 0x1000
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: v_add_u32_e32 v0, v2, v0
; GCN-NEXT: s_waitcnt lgkmcnt(0)
@@ -95,20 +95,20 @@ define amdgpu_kernel void @kernel_non_entry_block_static_alloca_uniformly_reache
; GCN-NEXT: s_cbranch_scc1 .LBB1_2
; GCN-NEXT: ; %bb.1: ; %bb.0
; GCN-NEXT: s_load_dword s4, s[8:9], 0xc
-; GCN-NEXT: s_add_i32 s5, s32, 0xfff
+; GCN-NEXT: s_add_u32 s5, s32, 0xfff
; GCN-NEXT: s_and_b32 s6, s5, 0xfffff000
; GCN-NEXT: v_mov_b32_e32 v1, 0
; GCN-NEXT: v_mov_b32_e32 v2, s6
; GCN-NEXT: s_waitcnt lgkmcnt(0)
; GCN-NEXT: s_lshl_b32 s4, s4, 2
; GCN-NEXT: v_mov_b32_e32 v3, 1
-; GCN-NEXT: s_add_i32 s4, s6, s4
+; GCN-NEXT: s_add_u32 s4, s6, s4
; GCN-NEXT: buffer_store_dword v1, v2, s[0:3], 0 offen
; GCN-NEXT: buffer_store_dword v3, v2, s[0:3], 0 offen offset:4
; GCN-NEXT: v_mov_b32_e32 v2, s4
; GCN-NEXT: buffer_load_dword v2, v2, s[0:3], 0 offen
; GCN-NEXT: s_load_dwordx2 s[4:5], s[8:9], 0x0
-; GCN-NEXT: s_add_i32 s32, s6, 0x1000
+; GCN-NEXT: s_add_u32 s32, s6, 0x1000
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: v_add_u32_e32 v0, v2, v0
; GCN-NEXT: s_waitcnt lgkmcnt(0)
@@ -171,7 +171,7 @@ define void @func_non_entry_block_static_alloca_align4(ptr addrspace(1) %out, i3
; GCN-NEXT: v_add_u32_e32 v2, s6, v2
; GCN-NEXT: buffer_load_dword v2, v2, s[0:3], 0 offen
; GCN-NEXT: v_and_b32_e32 v3, 0x3ff, v31
-; GCN-NEXT: s_add_i32 s32, s6, 0x1000
+; GCN-NEXT: s_add_u32 s32, s6, 0x1000
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: v_add_u32_e32 v2, v2, v3
; GCN-NEXT: global_store_dword v[0:1], v2, off
@@ -224,7 +224,7 @@ define void @func_non_entry_block_static_alloca_align64(ptr addrspace(1) %out, i
; GCN-NEXT: s_and_saveexec_b64 s[4:5], vcc
; GCN-NEXT: s_cbranch_execz .LBB3_2
; GCN-NEXT: ; %bb.1: ; %bb.0
-; GCN-NEXT: s_add_i32 s6, s32, 0xfff
+; GCN-NEXT: s_add_u32 s6, s32, 0xfff
; GCN-NEXT: s_and_b32 s6, s6, 0xfffff000
; GCN-NEXT: v_mov_b32_e32 v2, 0
; GCN-NEXT: v_mov_b32_e32 v4, s6
@@ -235,7 +235,7 @@ define void @func_non_entry_block_static_alloca_align64(ptr addrspace(1) %out, i
; GCN-NEXT: v_add_u32_e32 v2, s6, v2
; GCN-NEXT: buffer_load_dword v2, v2, s[0:3], 0 offen
; GCN-NEXT: v_and_b32_e32 v3, 0x3ff, v31
-; GCN-NEXT: s_add_i32 s32, s6, 0x1000
+; GCN-NEXT: s_add_u32 s32, s6, 0x1000
; GCN-NEXT: s_waitcnt vmcnt(0)
; GCN-NEXT: v_add_u32_e32 v2, v2, v3
; GCN-NEXT: global_store_dword v[0:1], v2, off
diff --git a/llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll b/llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll
index e88b036f6402a..6e8a5a1266a15 100644
--- a/llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll
+++ b/llvm/test/CodeGen/AMDGPU/amdgpu-cs-chain-cc.ll
@@ -103,38 +103,38 @@ define amdgpu_cs_chain void @amdgpu_cs_chain_spill(<24 x i32> inreg %sgprs, <24
; GISEL-GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
; GISEL-GFX11-NEXT: s_mov_b32 s32, 0
; GISEL-GFX11-NEXT: v_dual_mov_b32 v32, v8 :: v_dual_mov_b32 v33, v9
-; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 4
+; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 4
; GISEL-GFX11-NEXT: scratch_store_b32 off, v16, s32
; GISEL-GFX11-NEXT: scratch_store_b32 off, v17, s24
-; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 8
-; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 12
+; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 8
+; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 12
; GISEL-GFX11-NEXT: scratch_store_b32 off, v18, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v19, s25
-; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 16
-; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 20
+; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 16
+; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 20
; GISEL-GFX11-NEXT: scratch_store_b32 off, v20, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v21, s25
-; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 24
-; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 28
+; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 24
+; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 28
; GISEL-GFX11-NEXT: scratch_store_b32 off, v22, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v23, s25
-; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 32
-; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 36
+; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 32
+; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 36
; GISEL-GFX11-NEXT: scratch_store_b32 off, v24, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v25, s25
-; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 40
+; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 40
; GISEL-GFX11-NEXT: v_dual_mov_b32 v34, v10 :: v_dual_mov_b32 v35, v11
; GISEL-GFX11-NEXT: v_dual_mov_b32 v36, v12 :: v_dual_mov_b32 v37, v13
; GISEL-GFX11-NEXT: v_dual_mov_b32 v38, v14 :: v_dual_mov_b32 v39, v15
-; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 44
+; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 44
; GISEL-GFX11-NEXT: scratch_store_b32 off, v26, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v27, s25
-; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 48
-; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 52
+; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 48
+; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 52
; GISEL-GFX11-NEXT: scratch_store_b32 off, v28, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v29, s25
-; GISEL-GFX11-NEXT: s_add_i32 s24, s32, 56
-; GISEL-GFX11-NEXT: s_add_i32 s25, s32, 60
+; GISEL-GFX11-NEXT: s_add_u32 s24, s32, 56
+; GISEL-GFX11-NEXT: s_add_u32 s25, s32, 60
; GISEL-GFX11-NEXT: scratch_store_b32 off, v30, s24
; GISEL-GFX11-NEXT: scratch_store_b32 off, v31, s25
; GISEL-GFX11-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
diff --git a/llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll b/llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll
index 76ff285fdab54..e74fd21365c9d 100644
--- a/llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll
+++ b/llvm/test/CodeGen/AMDGPU/atomics_cond_sub.ll
@@ -179,7 +179,7 @@ define amdgpu_kernel void @ds_cond_sub_no_rtn_u32(ptr addrspace(3) %addr, i32 %i
; GFX12-GISEL: ; %bb.0: ; %entry
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_i32 s0, s0, -16
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, -16
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v1, s1 :: v_dual_mov_b32 v0, s0
; GFX12-GISEL-NEXT: ds_cond_sub_rtn_u32 v0, v0, v1
@@ -205,7 +205,7 @@ define amdgpu_kernel void @ds_cond_sub_no_rtn_u32_forced(ptr addrspace(3) %addr,
; GFX12-GISEL: ; %bb.0: ; %entry
; GFX12-GISEL-NEXT: s_load_b64 s[0:1], s[4:5], 0x24
; GFX12-GISEL-NEXT: s_wait_kmcnt 0x0
-; GFX12-GISEL-NEXT: s_add_co_i32 s0, s0, -16
+; GFX12-GISEL-NEXT: s_add_co_u32 s0, s0, -16
; GFX12-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
; GFX12-GISEL-NEXT: v_dual_mov_b32 v1, s1 :: v_dual_mov_b32 v0, s0
; GFX12-GISEL-NEXT: ds_cond_sub_u32 v0, v1
diff --git a/llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll b/llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
index 8b578add0fad5..9ae7c4aaa1e95 100644
--- a/llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
+++ b/llvm/test/CodeGen/AMDGPU/dynamic_stackalloc.ll
@@ -38,7 +38,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform(i32 %n) {
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s5, 6
; GFX9-GISEL-NEXT: s_mov_b32 s33, 0
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -77,7 +77,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform(i32 %n) {
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, -16
; GFX11-GISEL-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s0, s1
+; GFX11-GISEL-NEXT: s_add_u32 s32, s0, s1
; GFX11-GISEL-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, addrspace(5)
store volatile i32 123, ptr addrspace(5) %alloca
@@ -112,7 +112,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_over_aligned(i
; GFX9-GISEL-NEXT: s_movk_i32 s32, 0x2000
; GFX9-GISEL-NEXT: s_add_u32 s0, s0, s17
; GFX9-GISEL-NEXT: s_addc_u32 s1, s1, 0
-; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0x1fff
+; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0x1fff
; GFX9-GISEL-NEXT: s_waitcnt lgkmcnt(0)
; GFX9-GISEL-NEXT: s_lshl2_add_u32 s4, s4, 15
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xffffe000
@@ -121,7 +121,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_over_aligned(i
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s4, 6
; GFX9-GISEL-NEXT: s_mov_b32 s33, 0
-; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -151,7 +151,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_over_aligned(i
; GFX11-GISEL-NEXT: s_load_b32 s0, s[4:5], 0x0
; GFX11-GISEL-NEXT: s_movk_i32 s32, 0x80
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 10
-; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0xfff
+; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0xfff
; GFX11-GISEL-NEXT: s_mov_b32 s33, 0
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff000
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
@@ -162,7 +162,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_over_aligned(i
; GFX11-GISEL-NEXT: s_and_b32 s0, s0, -16
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 128, addrspace(5)
store volatile i32 10, ptr addrspace(5) %alloca
@@ -203,7 +203,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_under_aligned(
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s5, 6
; GFX9-GISEL-NEXT: s_mov_b32 s33, 0
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -242,7 +242,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_uniform_under_aligned(
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, -16
; GFX11-GISEL-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s0, s1
+; GFX11-GISEL-NEXT: s_add_u32 s32, s0, s1
; GFX11-GISEL-NEXT: s_endpgm
%alloca = alloca i32, i32 %n, align 2, addrspace(5)
store volatile i32 22, ptr addrspace(5) %alloca
@@ -299,7 +299,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent() {
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x7b
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -354,7 +354,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent() {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_endpgm
@@ -411,12 +411,12 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent_over_aligned
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[4:5], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB4_1
; GFX9-GISEL-NEXT: ; %bb.2:
-; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0x1fff
+; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0x1fff
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xffffe000
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s6, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x1bc
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
-; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -470,11 +470,11 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent_over_aligned
; GFX11-GISEL-NEXT: s_cbranch_scc1 .LBB4_1
; GFX11-GISEL-NEXT: ; %bb.2:
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 0x1bc
-; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0xfff
+; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0xfff
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff000
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_endpgm
@@ -533,7 +533,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent_under_aligne
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x29a
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -586,7 +586,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_divergent_under_aligne
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_endpgm
@@ -668,10 +668,10 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX9-GISEL-NEXT: s_lshl2_add_u32 s5, s5, 15
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, -16
; GFX9-GISEL-NEXT: s_lshl_b32 s6, s5, 6
-; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0xfff
+; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0xfff
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xfffff000
; GFX9-GISEL-NEXT: v_lshl_add_u32 v0, v0, 2, 15
-; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s6
+; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s6
; GFX9-GISEL-NEXT: v_and_b32_e32 v0, -16, v0
; GFX9-GISEL-NEXT: s_mov_b64 s[6:7], exec
; GFX9-GISEL-NEXT: .LBB6_2: ; =>This Inner Loop Header: Depth=1
@@ -690,7 +690,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 4
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s6, s7
+; GFX9-GISEL-NEXT: s_add_u32 s32, s6, s7
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: .LBB6_4: ; %bb.1
@@ -703,7 +703,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 2
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
-; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_endpgm
@@ -777,14 +777,14 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX11-GISEL-NEXT: ; %bb.1: ; %bb.0
; GFX11-GISEL-NEXT: v_and_b32_e32 v0, 0x3ff, v0
; GFX11-GISEL-NEXT: s_lshl2_add_u32 s1, s1, 15
-; GFX11-GISEL-NEXT: s_add_i32 s3, s32, 0x7ff
+; GFX11-GISEL-NEXT: s_add_u32 s3, s32, 0x7ff
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, -16
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(SKIP_4) | instid1(VALU_DEP_1)
; GFX11-GISEL-NEXT: s_lshl_b32 s4, s1, 5
; GFX11-GISEL-NEXT: v_lshl_add_u32 v0, v0, 2, 15
; GFX11-GISEL-NEXT: s_and_b32 s1, s3, 0xfffff800
; GFX11-GISEL-NEXT: s_mov_b32 s3, exec_lo
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s4
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s4
; GFX11-GISEL-NEXT: v_and_b32_e32 v0, -16, v0
; GFX11-GISEL-NEXT: .LBB6_2: ; =>This Inner Loop Header: Depth=1
; GFX11-GISEL-NEXT: s_ctz_i32_b32 s4, s3
@@ -802,7 +802,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v1, s3 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_i32 s32, s3, s2
+; GFX11-GISEL-NEXT: s_add_u32 s32, s3, s2
; GFX11-GISEL-NEXT: .LBB6_4: ; %bb.1
; GFX11-GISEL-NEXT: s_lshl2_add_u32 s0, s0, 15
; GFX11-GISEL-NEXT: v_dual_mov_b32 v0, 1 :: v_dual_mov_b32 v1, 2
@@ -813,7 +813,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_multiple_allocas(i32 %
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v1, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: s_endpgm
entry:
%cond = icmp eq i32 %n, 0
@@ -910,7 +910,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_control_flow(i32 %n, i
; GFX9-GISEL-NEXT: s_lshl_b32 s6, s8, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 1
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s6
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s6
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: s_mov_b32 s4, 0
@@ -920,13 +920,13 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_control_flow(i32 %n, i
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB7_6
; GFX9-GISEL-NEXT: ; %bb.5: ; %bb.0
; GFX9-GISEL-NEXT: s_lshl2_add_u32 s4, s5, 15
-; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0xfff
+; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0xfff
; GFX9-GISEL-NEXT: s_and_b32 s4, s4, -16
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xfffff000
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s4, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 2
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
-; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: .LBB7_6: ; %bb.2
@@ -1008,7 +1008,7 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_control_flow(i32 %n, i
; GFX11-GISEL-NEXT: s_mov_b32 s3, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s2, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s3, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s3, s0
; GFX11-GISEL-NEXT: s_mov_b32 s0, 0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s3 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
@@ -1020,13 +1020,13 @@ define amdgpu_kernel void @test_dynamic_stackalloc_kernel_control_flow(i32 %n, i
; GFX11-GISEL-NEXT: ; %bb.5: ; %bb.0
; GFX11-GISEL-NEXT: s_lshl2_add_u32 s0, s1, 15
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 2
-; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0x7ff
+; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0x7ff
; GFX11-GISEL-NEXT: s_and_b32 s0, s0, -16
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff800
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: .LBB7_6: ; %bb.2
; GFX11-GISEL-NEXT: s_endpgm
entry:
@@ -1095,7 +1095,7 @@ define void @test_dynamic_stackalloc_device_uniform(i32 %n) {
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x7b
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1156,7 +1156,7 @@ define void @test_dynamic_stackalloc_device_uniform(i32 %n) {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -1223,10 +1223,10 @@ define void @test_dynamic_stackalloc_device_uniform_over_aligned(i32 %n) {
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[4:5], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB9_1
; GFX9-GISEL-NEXT: ; %bb.2:
-; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0x1fff
+; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0x1fff
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s6, 6
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xffffe000
-; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 10
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1292,12 +1292,12 @@ define void @test_dynamic_stackalloc_device_uniform_over_aligned(i32 %n) {
; GFX11-GISEL-NEXT: s_cmp_lg_u32 s1, 0
; GFX11-GISEL-NEXT: s_cbranch_scc1 .LBB9_1
; GFX11-GISEL-NEXT: ; %bb.2:
-; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0xfff
+; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0xfff
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 10
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff000
; GFX11-GISEL-NEXT: s_mov_b32 s33, s4
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s34
@@ -1358,7 +1358,7 @@ define void @test_dynamic_stackalloc_device_uniform_under_aligned(i32 %n) {
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 22
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1419,7 +1419,7 @@ define void @test_dynamic_stackalloc_device_uniform_under_aligned(i32 %n) {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -1482,7 +1482,7 @@ define void @test_dynamic_stackalloc_device_divergent() {
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x7b
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1547,7 +1547,7 @@ define void @test_dynamic_stackalloc_device_divergent() {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -1617,10 +1617,10 @@ define void @test_dynamic_stackalloc_device_divergent_over_aligned() {
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[4:5], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB12_1
; GFX9-GISEL-NEXT: ; %bb.2:
-; GFX9-GISEL-NEXT: s_add_i32 s5, s32, 0x1fff
+; GFX9-GISEL-NEXT: s_add_u32 s5, s32, 0x1fff
; GFX9-GISEL-NEXT: s_lshl_b32 s4, s6, 6
; GFX9-GISEL-NEXT: s_and_b32 s5, s5, 0xffffe000
-; GFX9-GISEL-NEXT: s_add_i32 s32, s5, s4
+; GFX9-GISEL-NEXT: s_add_u32 s32, s5, s4
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x1bc
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1689,12 +1689,12 @@ define void @test_dynamic_stackalloc_device_divergent_over_aligned() {
; GFX11-GISEL-NEXT: s_cmp_lg_u32 s1, 0
; GFX11-GISEL-NEXT: s_cbranch_scc1 .LBB12_1
; GFX11-GISEL-NEXT: ; %bb.2:
-; GFX11-GISEL-NEXT: s_add_i32 s1, s32, 0xfff
+; GFX11-GISEL-NEXT: s_add_u32 s1, s32, 0xfff
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 0x1bc
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_and_b32 s1, s1, 0xfffff000
; GFX11-GISEL-NEXT: s_mov_b32 s33, s4
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s34
@@ -1758,7 +1758,7 @@ define void @test_dynamic_stackalloc_device_divergent_under_aligned() {
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x29a
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -1823,7 +1823,7 @@ define void @test_dynamic_stackalloc_device_divergent_under_aligned() {
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -1946,11 +1946,11 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[6:7], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB14_2
; GFX9-GISEL-NEXT: ; %bb.3:
-; GFX9-GISEL-NEXT: s_add_i32 s7, s32, 0xfff
+; GFX9-GISEL-NEXT: s_add_u32 s7, s32, 0xfff
; GFX9-GISEL-NEXT: s_lshl_b32 s6, s9, 6
; GFX9-GISEL-NEXT: s_and_b32 s9, s7, 0xfffff000
; GFX9-GISEL-NEXT: v_lshl_add_u32 v1, v2, 2, 15
-; GFX9-GISEL-NEXT: s_add_i32 s32, s9, s6
+; GFX9-GISEL-NEXT: s_add_u32 s32, s9, s6
; GFX9-GISEL-NEXT: v_and_b32_e32 v1, -16, v1
; GFX9-GISEL-NEXT: s_mov_b64 s[6:7], exec
; GFX9-GISEL-NEXT: s_mov_b32 s10, 0
@@ -1970,7 +1970,7 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, 4
; GFX9-GISEL-NEXT: v_mov_b32_e32 v2, s6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s6, s7
+; GFX9-GISEL-NEXT: s_add_u32 s32, s6, s7
; GFX9-GISEL-NEXT: buffer_store_dword v1, v2, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: .LBB14_6: ; %bb.1
@@ -1989,7 +1989,7 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s8, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 1
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: buffer_store_dword v0, off, s[0:3], s33
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 2
@@ -2117,12 +2117,12 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: ; %bb.3:
; GFX11-GISEL-NEXT: v_lshl_add_u32 v1, v1, 2, 15
; GFX11-GISEL-NEXT: s_lshl_b32 s5, s2, 5
-; GFX11-GISEL-NEXT: s_add_i32 s2, s32, 0x7ff
+; GFX11-GISEL-NEXT: s_add_u32 s2, s32, 0x7ff
; GFX11-GISEL-NEXT: s_mov_b32 s4, exec_lo
; GFX11-GISEL-NEXT: s_and_b32 s2, s2, 0xfffff800
; GFX11-GISEL-NEXT: v_and_b32_e32 v1, -16, v1
; GFX11-GISEL-NEXT: s_mov_b32 s3, 0
-; GFX11-GISEL-NEXT: s_add_i32 s32, s2, s5
+; GFX11-GISEL-NEXT: s_add_u32 s32, s2, s5
; GFX11-GISEL-NEXT: .LBB14_4: ; =>This Inner Loop Header: Depth=1
; GFX11-GISEL-NEXT: s_ctz_i32_b32 s5, s4
; GFX11-GISEL-NEXT: s_delay_alu instid0(VALU_DEP_1) | instid1(SALU_CYCLE_1)
@@ -2139,7 +2139,7 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v2, s4 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_i32 s32, s4, s3
+; GFX11-GISEL-NEXT: s_add_u32 s32, s4, s3
; GFX11-GISEL-NEXT: .LBB14_6: ; %bb.1
; GFX11-GISEL-NEXT: s_or_b32 exec_lo, exec_lo, s1
; GFX11-GISEL-NEXT: v_lshl_add_u32 v0, v0, 2, 15
@@ -2162,7 +2162,7 @@ define void @test_dynamic_stackalloc_device_multiple_allocas(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v1, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s34
; GFX11-GISEL-NEXT: s_mov_b32 s34, s8
; GFX11-GISEL-NEXT: s_mov_b32 s33, s7
@@ -2279,12 +2279,12 @@ define void @test_dynamic_stackalloc_device_control_flow(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_cmp_lg_u64 s[6:7], 0
; GFX9-GISEL-NEXT: s_cbranch_scc1 .LBB15_2
; GFX9-GISEL-NEXT: ; %bb.3:
-; GFX9-GISEL-NEXT: s_add_i32 s7, s32, 0xfff
+; GFX9-GISEL-NEXT: s_add_u32 s7, s32, 0xfff
; GFX9-GISEL-NEXT: s_and_b32 s7, s7, 0xfffff000
; GFX9-GISEL-NEXT: s_lshl_b32 s6, s8, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 2
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s7
-; GFX9-GISEL-NEXT: s_add_i32 s32, s7, s6
+; GFX9-GISEL-NEXT: s_add_u32 s32, s7, s6
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: ; implicit-def: $vgpr31
@@ -2309,7 +2309,7 @@ define void @test_dynamic_stackalloc_device_control_flow(i32 %n, i32 %m) {
; GFX9-GISEL-NEXT: s_lshl_b32 s7, s8, 6
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 1
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s6, s7
+; GFX9-GISEL-NEXT: s_add_u32 s32, s6, s7
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
; GFX9-GISEL-NEXT: s_waitcnt vmcnt(0)
; GFX9-GISEL-NEXT: .LBB15_8: ; %bb.2
@@ -2418,12 +2418,12 @@ define void @test_dynamic_stackalloc_device_control_flow(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: s_cbranch_scc1 .LBB15_2
; GFX11-GISEL-NEXT: ; %bb.3:
; GFX11-GISEL-NEXT: v_mov_b32_e32 v0, 2
-; GFX11-GISEL-NEXT: s_add_i32 s2, s32, 0x7ff
+; GFX11-GISEL-NEXT: s_add_u32 s2, s32, 0x7ff
; GFX11-GISEL-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-GISEL-NEXT: s_and_b32 s2, s2, 0xfffff800
; GFX11-GISEL-NEXT: ; implicit-def: $vgpr31
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s2, s1
+; GFX11-GISEL-NEXT: s_add_u32 s32, s2, s1
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s2 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: .LBB15_4: ; %Flow
@@ -2449,7 +2449,7 @@ define void @test_dynamic_stackalloc_device_control_flow(i32 %n, i32 %m) {
; GFX11-GISEL-NEXT: s_mov_b32 s2, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s1, s1, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s2, s1
+; GFX11-GISEL-NEXT: s_add_u32 s32, s2, s1
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s2 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: .LBB15_8: ; %bb.2
@@ -2526,7 +2526,7 @@ define void @test_dynamic_stackalloc_device_divergent_non_standard_size_i16(i16
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x29a
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -2591,7 +2591,7 @@ define void @test_dynamic_stackalloc_device_divergent_non_standard_size_i16(i16
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
@@ -2652,7 +2652,7 @@ define void @test_dynamic_stackalloc_device_divergent_non_standard_size_i64(i64
; GFX9-GISEL-NEXT: ; %bb.2:
; GFX9-GISEL-NEXT: s_mov_b32 s4, s32
; GFX9-GISEL-NEXT: s_lshl_b32 s5, s6, 6
-; GFX9-GISEL-NEXT: s_add_i32 s32, s4, s5
+; GFX9-GISEL-NEXT: s_add_u32 s32, s4, s5
; GFX9-GISEL-NEXT: v_mov_b32_e32 v0, 0x29a
; GFX9-GISEL-NEXT: v_mov_b32_e32 v1, s4
; GFX9-GISEL-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
@@ -2713,7 +2713,7 @@ define void @test_dynamic_stackalloc_device_divergent_non_standard_size_i64(i64
; GFX11-GISEL-NEXT: s_mov_b32 s1, s32
; GFX11-GISEL-NEXT: s_lshl_b32 s0, s0, 5
; GFX11-GISEL-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-GISEL-NEXT: s_add_i32 s32, s1, s0
+; GFX11-GISEL-NEXT: s_add_u32 s32, s1, s0
; GFX11-GISEL-NEXT: scratch_store_b32 off, v0, s1 dlc
; GFX11-GISEL-NEXT: s_waitcnt_vscnt null, 0x0
; GFX11-GISEL-NEXT: s_mov_b32 s32, s33
diff --git a/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll b/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll
index 39f9c866f8131..befe0d405307b 100644
--- a/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll
+++ b/llvm/test/CodeGen/AMDGPU/isel-amdgpu-cs-chain-cc.ll
@@ -1084,401 +1084,401 @@ define amdgpu_cs_chain void @amdgpu_cs_chain_many_regs(<36 x i32> inreg %a, <128
; GISEL-GFX11-NEXT: [[COPY165:%[0-9]+]]:vgpr_32 = COPY [[COPY32]]
; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY165]], [[COPY164]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_:%[0-9]+]]:sreg_32 = S_MOV_B32 4
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_]], implicit-def dead $scc
; GISEL-GFX11-NEXT: [[COPY166:%[0-9]+]]:vgpr_32 = COPY [[COPY33]]
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY166]], [[S_ADD_I32_]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 4, addrspace 5)
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY166]], [[S_ADD_U32_]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 4, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_1:%[0-9]+]]:sreg_32 = S_MOV_B32 8
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_1:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_1]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_1:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_1]], implicit-def dead $scc
; GISEL-GFX11-NEXT: [[COPY167:%[0-9]+]]:vgpr_32 = COPY [[COPY34]]
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY167]], [[S_ADD_I32_1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 8, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY167]], [[S_ADD_U32_1]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 8, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_2:%[0-9]+]]:sreg_32 = S_MOV_B32 12
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_2:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_2]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_2:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_2]], implicit-def dead $scc
; GISEL-GFX11-NEXT: [[COPY168:%[0-9]+]]:vgpr_32 = COPY [[COPY35]]
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY168]], [[S_ADD_I32_2]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 12, addrspace 5)
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY168]], [[S_ADD_U32_2]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 12, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_3:%[0-9]+]]:sreg_32 = S_MOV_B32 16
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_3:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_3]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY36]], [[S_ADD_I32_3]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 16, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_3:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_3]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY36]], [[S_ADD_U32_3]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 16, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_4:%[0-9]+]]:sreg_32 = S_MOV_B32 20
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_4:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_4]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY37]], [[S_ADD_I32_4]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 20, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_4:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_4]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY37]], [[S_ADD_U32_4]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 20, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_5:%[0-9]+]]:sreg_32 = S_MOV_B32 24
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_5:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_5]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY38]], [[S_ADD_I32_5]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 24, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_5:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_5]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY38]], [[S_ADD_U32_5]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 24, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_6:%[0-9]+]]:sreg_32 = S_MOV_B32 28
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_6:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_6]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY39]], [[S_ADD_I32_6]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 28, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_6:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_6]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY39]], [[S_ADD_U32_6]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 28, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_7:%[0-9]+]]:sreg_32 = S_MOV_B32 32
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_7:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_7]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY40]], [[S_ADD_I32_7]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 32, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_7:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_7]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY40]], [[S_ADD_U32_7]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 32, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_8:%[0-9]+]]:sreg_32 = S_MOV_B32 36
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_8:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_8]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY41]], [[S_ADD_I32_8]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 36, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_8:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_8]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY41]], [[S_ADD_U32_8]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 36, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_9:%[0-9]+]]:sreg_32 = S_MOV_B32 40
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_9:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_9]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY42]], [[S_ADD_I32_9]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 40, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_9:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_9]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY42]], [[S_ADD_U32_9]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 40, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_10:%[0-9]+]]:sreg_32 = S_MOV_B32 44
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_10:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_10]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY43]], [[S_ADD_I32_10]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 44, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_10:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_10]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY43]], [[S_ADD_U32_10]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 44, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_11:%[0-9]+]]:sreg_32 = S_MOV_B32 48
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_11:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_11]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY44]], [[S_ADD_I32_11]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 48, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_11:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_11]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY44]], [[S_ADD_U32_11]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 48, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_12:%[0-9]+]]:sreg_32 = S_MOV_B32 52
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_12:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_12]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY45]], [[S_ADD_I32_12]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 52, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_12:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_12]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY45]], [[S_ADD_U32_12]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 52, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_13:%[0-9]+]]:sreg_32 = S_MOV_B32 56
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_13:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_13]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY46]], [[S_ADD_I32_13]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 56, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_13:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_13]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY46]], [[S_ADD_U32_13]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 56, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_14:%[0-9]+]]:sreg_32 = S_MOV_B32 60
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_14:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_14]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY47]], [[S_ADD_I32_14]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 60, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_14:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_14]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY47]], [[S_ADD_U32_14]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 60, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_15:%[0-9]+]]:sreg_32 = S_MOV_B32 64
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_15:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_15]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY48]], [[S_ADD_I32_15]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 64, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_15:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_15]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY48]], [[S_ADD_U32_15]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 64, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_16:%[0-9]+]]:sreg_32 = S_MOV_B32 68
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_16:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_16]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY49]], [[S_ADD_I32_16]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 68, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_16:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_16]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY49]], [[S_ADD_U32_16]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 68, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_17:%[0-9]+]]:sreg_32 = S_MOV_B32 72
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_17:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_17]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY50]], [[S_ADD_I32_17]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 72, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_17:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_17]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY50]], [[S_ADD_U32_17]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 72, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_18:%[0-9]+]]:sreg_32 = S_MOV_B32 76
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_18:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_18]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY51]], [[S_ADD_I32_18]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 76, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_18:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_18]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY51]], [[S_ADD_U32_18]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 76, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_19:%[0-9]+]]:sreg_32 = S_MOV_B32 80
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_19:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_19]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY52]], [[S_ADD_I32_19]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 80, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_19:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_19]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY52]], [[S_ADD_U32_19]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 80, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_20:%[0-9]+]]:sreg_32 = S_MOV_B32 84
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_20:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_20]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY53]], [[S_ADD_I32_20]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 84, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_20:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_20]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY53]], [[S_ADD_U32_20]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 84, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_21:%[0-9]+]]:sreg_32 = S_MOV_B32 88
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_21:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_21]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY54]], [[S_ADD_I32_21]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 88, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_21:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_21]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY54]], [[S_ADD_U32_21]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 88, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_22:%[0-9]+]]:sreg_32 = S_MOV_B32 92
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_22:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_22]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY55]], [[S_ADD_I32_22]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 92, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_22:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_22]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY55]], [[S_ADD_U32_22]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 92, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_23:%[0-9]+]]:sreg_32 = S_MOV_B32 96
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_23:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_23]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY56]], [[S_ADD_I32_23]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 96, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_23:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_23]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY56]], [[S_ADD_U32_23]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 96, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_24:%[0-9]+]]:sreg_32 = S_MOV_B32 100
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_24:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_24]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY57]], [[S_ADD_I32_24]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 100, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_24:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_24]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY57]], [[S_ADD_U32_24]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 100, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_25:%[0-9]+]]:sreg_32 = S_MOV_B32 104
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_25:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_25]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY58]], [[S_ADD_I32_25]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 104, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_25:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_25]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY58]], [[S_ADD_U32_25]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 104, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_26:%[0-9]+]]:sreg_32 = S_MOV_B32 108
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_26:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_26]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY59]], [[S_ADD_I32_26]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 108, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_26:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_26]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY59]], [[S_ADD_U32_26]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 108, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_27:%[0-9]+]]:sreg_32 = S_MOV_B32 112
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_27:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_27]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY60]], [[S_ADD_I32_27]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 112, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_27:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_27]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY60]], [[S_ADD_U32_27]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 112, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_28:%[0-9]+]]:sreg_32 = S_MOV_B32 116
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_28:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_28]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY61]], [[S_ADD_I32_28]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 116, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_28:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_28]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY61]], [[S_ADD_U32_28]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 116, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_29:%[0-9]+]]:sreg_32 = S_MOV_B32 120
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_29:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_29]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY62]], [[S_ADD_I32_29]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 120, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_29:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_29]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY62]], [[S_ADD_U32_29]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 120, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_30:%[0-9]+]]:sreg_32 = S_MOV_B32 124
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_30:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_30]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY63]], [[S_ADD_I32_30]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 124, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_30:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_30]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY63]], [[S_ADD_U32_30]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 124, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_31:%[0-9]+]]:sreg_32 = S_MOV_B32 128
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_31:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_31]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY64]], [[S_ADD_I32_31]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 128, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_31:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_31]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY64]], [[S_ADD_U32_31]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 128, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_32:%[0-9]+]]:sreg_32 = S_MOV_B32 132
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_32:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_32]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY65]], [[S_ADD_I32_32]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 132, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_32:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_32]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY65]], [[S_ADD_U32_32]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 132, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_33:%[0-9]+]]:sreg_32 = S_MOV_B32 136
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_33:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_33]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY66]], [[S_ADD_I32_33]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 136, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_33:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_33]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY66]], [[S_ADD_U32_33]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 136, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_34:%[0-9]+]]:sreg_32 = S_MOV_B32 140
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_34:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_34]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY67]], [[S_ADD_I32_34]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 140, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_34:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_34]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY67]], [[S_ADD_U32_34]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 140, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_35:%[0-9]+]]:sreg_32 = S_MOV_B32 144
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_35:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_35]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY68]], [[S_ADD_I32_35]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 144, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_35:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_35]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY68]], [[S_ADD_U32_35]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 144, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_36:%[0-9]+]]:sreg_32 = S_MOV_B32 148
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_36:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_36]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY69]], [[S_ADD_I32_36]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 148, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_36:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_36]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY69]], [[S_ADD_U32_36]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 148, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_37:%[0-9]+]]:sreg_32 = S_MOV_B32 152
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_37:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_37]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY70]], [[S_ADD_I32_37]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 152, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_37:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_37]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY70]], [[S_ADD_U32_37]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 152, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_38:%[0-9]+]]:sreg_32 = S_MOV_B32 156
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_38:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_38]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY71]], [[S_ADD_I32_38]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 156, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_38:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_38]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY71]], [[S_ADD_U32_38]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 156, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_39:%[0-9]+]]:sreg_32 = S_MOV_B32 160
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_39:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_39]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY72]], [[S_ADD_I32_39]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 160, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_39:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_39]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY72]], [[S_ADD_U32_39]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 160, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_40:%[0-9]+]]:sreg_32 = S_MOV_B32 164
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_40:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_40]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY73]], [[S_ADD_I32_40]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 164, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_40:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_40]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY73]], [[S_ADD_U32_40]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 164, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_41:%[0-9]+]]:sreg_32 = S_MOV_B32 168
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_41:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_41]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY74]], [[S_ADD_I32_41]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 168, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_41:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_41]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY74]], [[S_ADD_U32_41]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 168, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_42:%[0-9]+]]:sreg_32 = S_MOV_B32 172
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_42:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_42]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY75]], [[S_ADD_I32_42]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 172, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_42:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_42]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY75]], [[S_ADD_U32_42]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 172, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_43:%[0-9]+]]:sreg_32 = S_MOV_B32 176
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_43:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_43]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY76]], [[S_ADD_I32_43]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 176, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_43:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_43]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY76]], [[S_ADD_U32_43]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 176, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_44:%[0-9]+]]:sreg_32 = S_MOV_B32 180
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_44:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_44]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY77]], [[S_ADD_I32_44]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 180, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_44:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_44]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY77]], [[S_ADD_U32_44]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 180, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_45:%[0-9]+]]:sreg_32 = S_MOV_B32 184
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_45:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_45]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY78]], [[S_ADD_I32_45]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 184, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_45:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_45]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY78]], [[S_ADD_U32_45]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 184, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_46:%[0-9]+]]:sreg_32 = S_MOV_B32 188
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_46:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_46]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY79]], [[S_ADD_I32_46]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 188, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_46:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_46]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY79]], [[S_ADD_U32_46]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 188, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_47:%[0-9]+]]:sreg_32 = S_MOV_B32 192
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_47:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_47]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY80]], [[S_ADD_I32_47]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 192, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_47:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_47]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY80]], [[S_ADD_U32_47]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 192, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_48:%[0-9]+]]:sreg_32 = S_MOV_B32 196
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_48:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_48]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY81]], [[S_ADD_I32_48]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 196, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_48:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_48]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY81]], [[S_ADD_U32_48]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 196, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_49:%[0-9]+]]:sreg_32 = S_MOV_B32 200
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_49:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_49]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY82]], [[S_ADD_I32_49]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 200, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_49:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_49]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY82]], [[S_ADD_U32_49]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 200, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_50:%[0-9]+]]:sreg_32 = S_MOV_B32 204
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_50:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_50]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY83]], [[S_ADD_I32_50]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 204, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_50:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_50]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY83]], [[S_ADD_U32_50]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 204, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_51:%[0-9]+]]:sreg_32 = S_MOV_B32 208
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_51:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_51]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY84]], [[S_ADD_I32_51]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 208, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_51:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_51]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY84]], [[S_ADD_U32_51]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 208, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_52:%[0-9]+]]:sreg_32 = S_MOV_B32 212
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_52:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_52]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY85]], [[S_ADD_I32_52]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 212, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_52:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_52]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY85]], [[S_ADD_U32_52]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 212, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_53:%[0-9]+]]:sreg_32 = S_MOV_B32 216
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_53:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_53]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY86]], [[S_ADD_I32_53]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 216, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_53:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_53]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY86]], [[S_ADD_U32_53]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 216, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_54:%[0-9]+]]:sreg_32 = S_MOV_B32 220
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_54:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_54]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY87]], [[S_ADD_I32_54]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 220, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_54:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_54]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY87]], [[S_ADD_U32_54]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 220, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_55:%[0-9]+]]:sreg_32 = S_MOV_B32 224
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_55:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_55]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY88]], [[S_ADD_I32_55]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 224, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_55:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_55]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY88]], [[S_ADD_U32_55]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 224, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_56:%[0-9]+]]:sreg_32 = S_MOV_B32 228
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_56:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_56]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY89]], [[S_ADD_I32_56]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 228, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_56:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_56]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY89]], [[S_ADD_U32_56]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 228, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_57:%[0-9]+]]:sreg_32 = S_MOV_B32 232
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_57:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_57]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY90]], [[S_ADD_I32_57]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 232, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_57:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_57]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY90]], [[S_ADD_U32_57]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 232, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_58:%[0-9]+]]:sreg_32 = S_MOV_B32 236
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_58:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_58]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY91]], [[S_ADD_I32_58]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 236, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_58:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_58]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY91]], [[S_ADD_U32_58]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 236, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_59:%[0-9]+]]:sreg_32 = S_MOV_B32 240
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_59:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_59]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY92]], [[S_ADD_I32_59]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 240, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_59:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_59]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY92]], [[S_ADD_U32_59]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 240, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_60:%[0-9]+]]:sreg_32 = S_MOV_B32 244
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_60:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_60]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY93]], [[S_ADD_I32_60]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 244, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_60:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_60]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY93]], [[S_ADD_U32_60]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 244, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_61:%[0-9]+]]:sreg_32 = S_MOV_B32 248
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_61:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_61]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY94]], [[S_ADD_I32_61]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 248, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_61:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_61]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY94]], [[S_ADD_U32_61]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 248, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_62:%[0-9]+]]:sreg_32 = S_MOV_B32 252
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_62:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_62]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY95]], [[S_ADD_I32_62]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 252, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_62:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_62]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY95]], [[S_ADD_U32_62]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 252, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_63:%[0-9]+]]:sreg_32 = S_MOV_B32 256
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_63:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_63]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY96]], [[S_ADD_I32_63]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 256, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_63:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_63]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY96]], [[S_ADD_U32_63]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 256, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_64:%[0-9]+]]:sreg_32 = S_MOV_B32 260
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_64:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_64]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY97]], [[S_ADD_I32_64]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 260, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_64:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_64]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY97]], [[S_ADD_U32_64]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 260, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_65:%[0-9]+]]:sreg_32 = S_MOV_B32 264
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_65:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_65]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY98]], [[S_ADD_I32_65]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 264, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_65:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_65]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY98]], [[S_ADD_U32_65]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 264, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_66:%[0-9]+]]:sreg_32 = S_MOV_B32 268
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_66:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_66]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY99]], [[S_ADD_I32_66]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 268, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_66:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_66]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY99]], [[S_ADD_U32_66]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 268, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_67:%[0-9]+]]:sreg_32 = S_MOV_B32 272
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_67:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_67]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY100]], [[S_ADD_I32_67]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 272, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_67:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_67]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY100]], [[S_ADD_U32_67]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 272, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_68:%[0-9]+]]:sreg_32 = S_MOV_B32 276
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_68:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_68]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY101]], [[S_ADD_I32_68]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 276, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_68:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_68]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY101]], [[S_ADD_U32_68]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 276, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_69:%[0-9]+]]:sreg_32 = S_MOV_B32 280
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_69:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_69]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY102]], [[S_ADD_I32_69]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 280, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_69:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_69]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY102]], [[S_ADD_U32_69]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 280, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_70:%[0-9]+]]:sreg_32 = S_MOV_B32 284
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_70:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_70]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY103]], [[S_ADD_I32_70]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 284, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_70:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_70]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY103]], [[S_ADD_U32_70]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 284, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_71:%[0-9]+]]:sreg_32 = S_MOV_B32 288
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_71:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_71]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY104]], [[S_ADD_I32_71]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 288, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_71:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_71]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY104]], [[S_ADD_U32_71]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 288, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_72:%[0-9]+]]:sreg_32 = S_MOV_B32 292
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_72:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_72]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY105]], [[S_ADD_I32_72]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 292, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_72:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_72]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY105]], [[S_ADD_U32_72]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 292, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_73:%[0-9]+]]:sreg_32 = S_MOV_B32 296
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_73:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_73]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY106]], [[S_ADD_I32_73]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 296, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_73:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_73]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY106]], [[S_ADD_U32_73]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 296, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_74:%[0-9]+]]:sreg_32 = S_MOV_B32 300
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_74:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_74]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY107]], [[S_ADD_I32_74]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 300, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_74:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_74]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY107]], [[S_ADD_U32_74]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 300, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_75:%[0-9]+]]:sreg_32 = S_MOV_B32 304
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_75:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_75]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY108]], [[S_ADD_I32_75]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 304, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_75:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_75]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY108]], [[S_ADD_U32_75]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 304, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_76:%[0-9]+]]:sreg_32 = S_MOV_B32 308
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_76:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_76]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY109]], [[S_ADD_I32_76]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 308, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_76:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_76]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY109]], [[S_ADD_U32_76]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 308, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_77:%[0-9]+]]:sreg_32 = S_MOV_B32 312
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_77:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_77]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY110]], [[S_ADD_I32_77]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 312, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_77:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_77]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY110]], [[S_ADD_U32_77]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 312, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_78:%[0-9]+]]:sreg_32 = S_MOV_B32 316
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_78:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_78]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY111]], [[S_ADD_I32_78]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 316, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_78:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_78]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY111]], [[S_ADD_U32_78]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 316, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_79:%[0-9]+]]:sreg_32 = S_MOV_B32 320
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_79:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_79]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY112]], [[S_ADD_I32_79]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 320, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_79:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_79]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY112]], [[S_ADD_U32_79]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 320, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_80:%[0-9]+]]:sreg_32 = S_MOV_B32 324
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_80:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_80]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY113]], [[S_ADD_I32_80]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 324, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_80:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_80]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY113]], [[S_ADD_U32_80]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 324, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_81:%[0-9]+]]:sreg_32 = S_MOV_B32 328
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_81:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_81]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY114]], [[S_ADD_I32_81]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 328, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_81:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_81]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY114]], [[S_ADD_U32_81]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 328, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_82:%[0-9]+]]:sreg_32 = S_MOV_B32 332
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_82:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_82]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY115]], [[S_ADD_I32_82]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 332, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_82:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_82]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY115]], [[S_ADD_U32_82]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 332, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_83:%[0-9]+]]:sreg_32 = S_MOV_B32 336
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_83:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_83]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY116]], [[S_ADD_I32_83]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 336, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_83:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_83]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY116]], [[S_ADD_U32_83]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 336, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_84:%[0-9]+]]:sreg_32 = S_MOV_B32 340
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_84:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_84]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY117]], [[S_ADD_I32_84]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 340, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_84:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_84]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY117]], [[S_ADD_U32_84]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 340, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_85:%[0-9]+]]:sreg_32 = S_MOV_B32 344
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_85:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_85]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY118]], [[S_ADD_I32_85]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 344, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_85:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_85]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY118]], [[S_ADD_U32_85]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 344, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_86:%[0-9]+]]:sreg_32 = S_MOV_B32 348
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_86:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_86]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY119]], [[S_ADD_I32_86]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 348, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_86:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_86]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY119]], [[S_ADD_U32_86]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 348, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_87:%[0-9]+]]:sreg_32 = S_MOV_B32 352
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_87:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_87]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY120]], [[S_ADD_I32_87]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 352, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_87:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_87]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY120]], [[S_ADD_U32_87]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 352, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_88:%[0-9]+]]:sreg_32 = S_MOV_B32 356
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_88:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_88]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY121]], [[S_ADD_I32_88]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 356, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_88:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_88]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY121]], [[S_ADD_U32_88]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 356, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_89:%[0-9]+]]:sreg_32 = S_MOV_B32 360
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_89:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_89]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY122]], [[S_ADD_I32_89]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 360, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_89:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_89]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY122]], [[S_ADD_U32_89]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 360, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_90:%[0-9]+]]:sreg_32 = S_MOV_B32 364
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_90:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_90]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY123]], [[S_ADD_I32_90]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 364, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_90:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_90]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY123]], [[S_ADD_U32_90]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 364, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_91:%[0-9]+]]:sreg_32 = S_MOV_B32 368
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_91:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_91]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY124]], [[S_ADD_I32_91]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 368, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_91:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_91]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY124]], [[S_ADD_U32_91]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 368, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_92:%[0-9]+]]:sreg_32 = S_MOV_B32 372
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_92:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_92]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY125]], [[S_ADD_I32_92]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 372, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_92:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_92]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY125]], [[S_ADD_U32_92]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 372, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_93:%[0-9]+]]:sreg_32 = S_MOV_B32 376
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_93:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_93]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY126]], [[S_ADD_I32_93]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 376, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_93:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_93]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY126]], [[S_ADD_U32_93]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 376, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_94:%[0-9]+]]:sreg_32 = S_MOV_B32 380
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_94:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_94]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY127]], [[S_ADD_I32_94]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 380, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_94:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_94]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY127]], [[S_ADD_U32_94]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 380, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_95:%[0-9]+]]:sreg_32 = S_MOV_B32 384
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_95:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_95]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY128]], [[S_ADD_I32_95]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 384, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_95:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_95]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY128]], [[S_ADD_U32_95]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 384, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_96:%[0-9]+]]:sreg_32 = S_MOV_B32 388
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_96:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_96]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY129]], [[S_ADD_I32_96]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 388, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_96:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_96]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY129]], [[S_ADD_U32_96]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 388, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_97:%[0-9]+]]:sreg_32 = S_MOV_B32 392
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_97:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_97]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY130]], [[S_ADD_I32_97]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 392, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_97:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_97]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY130]], [[S_ADD_U32_97]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 392, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_98:%[0-9]+]]:sreg_32 = S_MOV_B32 396
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_98:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_98]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY131]], [[S_ADD_I32_98]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 396, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_98:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_98]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY131]], [[S_ADD_U32_98]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 396, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_99:%[0-9]+]]:sreg_32 = S_MOV_B32 400
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_99:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_99]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY132]], [[S_ADD_I32_99]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 400, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_99:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_99]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY132]], [[S_ADD_U32_99]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 400, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_100:%[0-9]+]]:sreg_32 = S_MOV_B32 404
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_100:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_100]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY133]], [[S_ADD_I32_100]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 404, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_100:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_100]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY133]], [[S_ADD_U32_100]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 404, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_101:%[0-9]+]]:sreg_32 = S_MOV_B32 408
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_101:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_101]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY134]], [[S_ADD_I32_101]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 408, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_101:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_101]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY134]], [[S_ADD_U32_101]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 408, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_102:%[0-9]+]]:sreg_32 = S_MOV_B32 412
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_102:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_102]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY135]], [[S_ADD_I32_102]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 412, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_102:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_102]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY135]], [[S_ADD_U32_102]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 412, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_103:%[0-9]+]]:sreg_32 = S_MOV_B32 416
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_103:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_103]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY136]], [[S_ADD_I32_103]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 416, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_103:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_103]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY136]], [[S_ADD_U32_103]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 416, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_104:%[0-9]+]]:sreg_32 = S_MOV_B32 420
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_104:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_104]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY137]], [[S_ADD_I32_104]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 420, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_104:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_104]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY137]], [[S_ADD_U32_104]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 420, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_105:%[0-9]+]]:sreg_32 = S_MOV_B32 424
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_105:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_105]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY138]], [[S_ADD_I32_105]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 424, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_105:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_105]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY138]], [[S_ADD_U32_105]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 424, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_106:%[0-9]+]]:sreg_32 = S_MOV_B32 428
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_106:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_106]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY139]], [[S_ADD_I32_106]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 428, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_106:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_106]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY139]], [[S_ADD_U32_106]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 428, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_107:%[0-9]+]]:sreg_32 = S_MOV_B32 432
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_107:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_107]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY140]], [[S_ADD_I32_107]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 432, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_107:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_107]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY140]], [[S_ADD_U32_107]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 432, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_108:%[0-9]+]]:sreg_32 = S_MOV_B32 436
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_108:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_108]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY141]], [[S_ADD_I32_108]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 436, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_108:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_108]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY141]], [[S_ADD_U32_108]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 436, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_109:%[0-9]+]]:sreg_32 = S_MOV_B32 440
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_109:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_109]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY142]], [[S_ADD_I32_109]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 440, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_109:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_109]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY142]], [[S_ADD_U32_109]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 440, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_110:%[0-9]+]]:sreg_32 = S_MOV_B32 444
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_110:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_110]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY143]], [[S_ADD_I32_110]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 444, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_110:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_110]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY143]], [[S_ADD_U32_110]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 444, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_111:%[0-9]+]]:sreg_32 = S_MOV_B32 448
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_111:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_111]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY144]], [[S_ADD_I32_111]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 448, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_111:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_111]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY144]], [[S_ADD_U32_111]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 448, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_112:%[0-9]+]]:sreg_32 = S_MOV_B32 452
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_112:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_112]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY145]], [[S_ADD_I32_112]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 452, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_112:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_112]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY145]], [[S_ADD_U32_112]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 452, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_113:%[0-9]+]]:sreg_32 = S_MOV_B32 456
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_113:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_113]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY146]], [[S_ADD_I32_113]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 456, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_113:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_113]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY146]], [[S_ADD_U32_113]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 456, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_114:%[0-9]+]]:sreg_32 = S_MOV_B32 460
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_114:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_114]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY147]], [[S_ADD_I32_114]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 460, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_114:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_114]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY147]], [[S_ADD_U32_114]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 460, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_115:%[0-9]+]]:sreg_32 = S_MOV_B32 464
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_115:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_115]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY148]], [[S_ADD_I32_115]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 464, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_115:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_115]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY148]], [[S_ADD_U32_115]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 464, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_116:%[0-9]+]]:sreg_32 = S_MOV_B32 468
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_116:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_116]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY149]], [[S_ADD_I32_116]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 468, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_116:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_116]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY149]], [[S_ADD_U32_116]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 468, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_117:%[0-9]+]]:sreg_32 = S_MOV_B32 472
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_117:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_117]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY150]], [[S_ADD_I32_117]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 472, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_117:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_117]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY150]], [[S_ADD_U32_117]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 472, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_118:%[0-9]+]]:sreg_32 = S_MOV_B32 476
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_118:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_118]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY151]], [[S_ADD_I32_118]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 476, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_118:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_118]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY151]], [[S_ADD_U32_118]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 476, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_119:%[0-9]+]]:sreg_32 = S_MOV_B32 480
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_119:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_119]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY152]], [[S_ADD_I32_119]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 480, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_119:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_119]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY152]], [[S_ADD_U32_119]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 480, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_120:%[0-9]+]]:sreg_32 = S_MOV_B32 484
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_120:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_120]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY153]], [[S_ADD_I32_120]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 484, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_120:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_120]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY153]], [[S_ADD_U32_120]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 484, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_121:%[0-9]+]]:sreg_32 = S_MOV_B32 488
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_121:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_121]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY154]], [[S_ADD_I32_121]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 488, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_121:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_121]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY154]], [[S_ADD_U32_121]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 488, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_122:%[0-9]+]]:sreg_32 = S_MOV_B32 492
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_122:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_122]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY155]], [[S_ADD_I32_122]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 492, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_122:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_122]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY155]], [[S_ADD_U32_122]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 492, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_123:%[0-9]+]]:sreg_32 = S_MOV_B32 496
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_123:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_123]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY156]], [[S_ADD_I32_123]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 496, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_123:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_123]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY156]], [[S_ADD_U32_123]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 496, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_124:%[0-9]+]]:sreg_32 = S_MOV_B32 500
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_124:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_124]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY157]], [[S_ADD_I32_124]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 500, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_124:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_124]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY157]], [[S_ADD_U32_124]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 500, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_125:%[0-9]+]]:sreg_32 = S_MOV_B32 504
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_125:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_125]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY158]], [[S_ADD_I32_125]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 504, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_125:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_125]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY158]], [[S_ADD_U32_125]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 504, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_126:%[0-9]+]]:sreg_32 = S_MOV_B32 508
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_126:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_126]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY159]], [[S_ADD_I32_126]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 508, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_126:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_126]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY159]], [[S_ADD_U32_126]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 508, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_127:%[0-9]+]]:sreg_32 = S_MOV_B32 512
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_127:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_127]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY160]], [[S_ADD_I32_127]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 512, align 16, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_127:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_127]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY160]], [[S_ADD_U32_127]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 512, align 16, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_128:%[0-9]+]]:sreg_32 = S_MOV_B32 516
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_128:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_128]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY161]], [[S_ADD_I32_128]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 516, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_128:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_128]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY161]], [[S_ADD_U32_128]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 516, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_129:%[0-9]+]]:sreg_32 = S_MOV_B32 520
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_129:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_129]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY162]], [[S_ADD_I32_129]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 520, align 8, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_129:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_129]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY162]], [[S_ADD_U32_129]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 520, align 8, addrspace 5)
; GISEL-GFX11-NEXT: [[S_MOV_B32_130:%[0-9]+]]:sreg_32 = S_MOV_B32 524
- ; GISEL-GFX11-NEXT: [[S_ADD_I32_130:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_I32 [[COPY164]], [[S_MOV_B32_130]], implicit-def dead $scc
- ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY163]], [[S_ADD_I32_130]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 524, addrspace 5)
+ ; GISEL-GFX11-NEXT: [[S_ADD_U32_130:%[0-9]+]]:sreg_32_xexec_hi = S_ADD_U32 [[COPY164]], [[S_MOV_B32_130]], implicit-def dead $scc
+ ; GISEL-GFX11-NEXT: SCRATCH_STORE_DWORD_SADDR [[COPY163]], [[S_ADD_U32_130]], 0, 0, implicit $exec, implicit $flat_scr :: (store (s32) into stack + 524, addrspace 5)
; GISEL-GFX11-NEXT: $vgpr0 = COPY [[COPY]]
; GISEL-GFX11-NEXT: $vgpr1 = COPY [[COPY1]]
; GISEL-GFX11-NEXT: $vgpr2 = COPY [[COPY2]]
diff --git a/llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll b/llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll
index 618456a9978f5..653d4b85a9a5b 100644
--- a/llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll
+++ b/llvm/test/CodeGen/AMDGPU/ptradd-sdag.ll
@@ -10,9 +10,8 @@
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -amdgpu-use-sdag-ptradd=1 < %s | FileCheck %s -check-prefixes=GFX12,GFX12_PTRADD
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -amdgpu-use-sdag-ptradd=0 < %s | FileCheck %s -check-prefixes=GFX12,GFX12_LEGACY
-; Tests for the ISD::PTRADD SelectionDAG opcode.
-
-; fiji is included since it does not have FeatureAddNoCarryInsts.
+; Tests for the ISD::PTRADD SelectionDAG opcode. This only tests 64-bit address
+; spaces since PTRADD is currently only used for these.
; Check that basic pointer arithmetic can be lowered.
define ptr @gep_as0(ptr %p, i64 %offset) {
@@ -85,80 +84,6 @@ entry:
ret ptr %gep2
}
-define ptr addrspace(3) @gep_as3(ptr addrspace(3) %p, i32 %offset) {
-; GFX8-LABEL: gep_as3:
-; GFX8: ; %bb.0: ; %entry
-; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX8-NEXT: v_add_u32_e32 v0, vcc, v0, v1
-; GFX8-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX942_PTRADD-LABEL: gep_as3:
-; GFX942_PTRADD: ; %bb.0: ; %entry
-; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX942_PTRADD-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX942_PTRADD-NEXT: v_add_u32_e32 v0, v0, v1
-; GFX942_PTRADD-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX942_LEGACY-LABEL: gep_as3:
-; GFX942_LEGACY: ; %bb.0: ; %entry
-; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX942_LEGACY-NEXT: v_lshl_add_u32 v0, v1, 2, v0
-; GFX942_LEGACY-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX10_PTRADD-LABEL: gep_as3:
-; GFX10_PTRADD: ; %bb.0: ; %entry
-; GFX10_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX10_PTRADD-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX10_PTRADD-NEXT: v_add_nc_u32_e32 v0, v0, v1
-; GFX10_PTRADD-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX10_LEGACY-LABEL: gep_as3:
-; GFX10_LEGACY: ; %bb.0: ; %entry
-; GFX10_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX10_LEGACY-NEXT: v_lshl_add_u32 v0, v1, 2, v0
-; GFX10_LEGACY-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX11_PTRADD-LABEL: gep_as3:
-; GFX11_PTRADD: ; %bb.0: ; %entry
-; GFX11_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11_PTRADD-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX11_PTRADD-NEXT: s_delay_alu instid0(VALU_DEP_1)
-; GFX11_PTRADD-NEXT: v_add_nc_u32_e32 v0, v0, v1
-; GFX11_PTRADD-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX11_LEGACY-LABEL: gep_as3:
-; GFX11_LEGACY: ; %bb.0: ; %entry
-; GFX11_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11_LEGACY-NEXT: v_lshl_add_u32 v0, v1, 2, v0
-; GFX11_LEGACY-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX12_PTRADD-LABEL: gep_as3:
-; GFX12_PTRADD: ; %bb.0: ; %entry
-; GFX12_PTRADD-NEXT: s_wait_loadcnt_dscnt 0x0
-; GFX12_PTRADD-NEXT: s_wait_expcnt 0x0
-; GFX12_PTRADD-NEXT: s_wait_samplecnt 0x0
-; GFX12_PTRADD-NEXT: s_wait_bvhcnt 0x0
-; GFX12_PTRADD-NEXT: s_wait_kmcnt 0x0
-; GFX12_PTRADD-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX12_PTRADD-NEXT: s_delay_alu instid0(VALU_DEP_1)
-; GFX12_PTRADD-NEXT: v_add_nc_u32_e32 v0, v0, v1
-; GFX12_PTRADD-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX12_LEGACY-LABEL: gep_as3:
-; GFX12_LEGACY: ; %bb.0: ; %entry
-; GFX12_LEGACY-NEXT: s_wait_loadcnt_dscnt 0x0
-; GFX12_LEGACY-NEXT: s_wait_expcnt 0x0
-; GFX12_LEGACY-NEXT: s_wait_samplecnt 0x0
-; GFX12_LEGACY-NEXT: s_wait_bvhcnt 0x0
-; GFX12_LEGACY-NEXT: s_wait_kmcnt 0x0
-; GFX12_LEGACY-NEXT: v_lshl_add_u32 v0, v1, 2, v0
-; GFX12_LEGACY-NEXT: s_setpc_b64 s[30:31]
-entry:
- %gep = getelementptr inbounds i32, ptr addrspace(3) %p, i32 %offset
- ret ptr addrspace(3) %gep
-}
-
define amdgpu_kernel void @gep_as0_uniform(ptr %p, i64 %offset, ptr %ret) {
; GFX8-LABEL: gep_as0_uniform:
; GFX8: ; %bb.0: ; %entry
@@ -250,78 +175,6 @@ entry:
ret void
}
-define amdgpu_kernel void @gep_as3_uniform(ptr addrspace(3) %p, i32 %offset, ptr %ret) {
-; GFX8-LABEL: gep_as3_uniform:
-; GFX8: ; %bb.0: ; %entry
-; GFX8-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
-; GFX8-NEXT: s_add_i32 s12, s12, s17
-; GFX8-NEXT: s_mov_b32 flat_scratch_lo, s13
-; GFX8-NEXT: s_lshr_b32 flat_scratch_hi, s12, 8
-; GFX8-NEXT: s_waitcnt lgkmcnt(0)
-; GFX8-NEXT: s_lshl_b32 s1, s1, 2
-; GFX8-NEXT: s_add_i32 s0, s0, s1
-; GFX8-NEXT: v_mov_b32_e32 v0, s2
-; GFX8-NEXT: v_mov_b32_e32 v1, s3
-; GFX8-NEXT: v_mov_b32_e32 v2, s0
-; GFX8-NEXT: flat_store_dword v[0:1], v2
-; GFX8-NEXT: s_endpgm
-;
-; GFX942-LABEL: gep_as3_uniform:
-; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_load_dwordx4 s[0:3], s[4:5], 0x0
-; GFX942-NEXT: s_waitcnt lgkmcnt(0)
-; GFX942-NEXT: s_lshl_b32 s1, s1, 2
-; GFX942-NEXT: s_add_i32 s0, s0, s1
-; GFX942-NEXT: v_mov_b64_e32 v[0:1], s[2:3]
-; GFX942-NEXT: v_mov_b32_e32 v2, s0
-; GFX942-NEXT: flat_store_dword v[0:1], v2
-; GFX942-NEXT: s_endpgm
-;
-; GFX10-LABEL: gep_as3_uniform:
-; GFX10: ; %bb.0: ; %entry
-; GFX10-NEXT: s_add_u32 s12, s12, s17
-; GFX10-NEXT: s_addc_u32 s13, s13, 0
-; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s12
-; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s13
-; GFX10-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
-; GFX10-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-NEXT: s_lshl_b32 s1, s1, 2
-; GFX10-NEXT: v_mov_b32_e32 v0, s2
-; GFX10-NEXT: s_add_i32 s0, s0, s1
-; GFX10-NEXT: v_mov_b32_e32 v1, s3
-; GFX10-NEXT: v_mov_b32_e32 v2, s0
-; GFX10-NEXT: flat_store_dword v[0:1], v2
-; GFX10-NEXT: s_endpgm
-;
-; GFX11-LABEL: gep_as3_uniform:
-; GFX11: ; %bb.0: ; %entry
-; GFX11-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
-; GFX11-NEXT: s_waitcnt lgkmcnt(0)
-; GFX11-NEXT: s_lshl_b32 s1, s1, 2
-; GFX11-NEXT: v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3
-; GFX11-NEXT: s_add_i32 s0, s0, s1
-; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX11-NEXT: v_mov_b32_e32 v2, s0
-; GFX11-NEXT: flat_store_b32 v[0:1], v2
-; GFX11-NEXT: s_endpgm
-;
-; GFX12-LABEL: gep_as3_uniform:
-; GFX12: ; %bb.0: ; %entry
-; GFX12-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: s_lshl_b32 s1, s1, 2
-; GFX12-NEXT: v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3
-; GFX12-NEXT: s_add_co_i32 s0, s0, s1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1)
-; GFX12-NEXT: v_mov_b32_e32 v2, s0
-; GFX12-NEXT: flat_store_b32 v[0:1], v2
-; GFX12-NEXT: s_endpgm
-entry:
- %gep = getelementptr inbounds i32, ptr addrspace(3) %p, i32 %offset
- store ptr addrspace(3) %gep, ptr %ret
- ret void
-}
-
; Check that pointer arithmetic with multiple indexing steps can be lowered.
define ptr @multi_gep_as0(ptr %p, i64 %offset) {
; GFX8-LABEL: multi_gep_as0:
@@ -393,54 +246,6 @@ entry:
ret ptr %gep2
}
-define ptr addrspace(3) @multi_gep_as3(ptr addrspace(3) %p, i32 %offset) {
-; GFX8-LABEL: multi_gep_as3:
-; GFX8: ; %bb.0: ; %entry
-; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX8-NEXT: v_add_u32_e32 v0, vcc, v0, v1
-; GFX8-NEXT: v_add_u32_e32 v0, vcc, 5, v0
-; GFX8-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX942-LABEL: multi_gep_as3:
-; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX942-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX942-NEXT: v_add3_u32 v0, v0, v1, 5
-; GFX942-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX10-LABEL: multi_gep_as3:
-; GFX10: ; %bb.0: ; %entry
-; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX10-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX10-NEXT: v_add3_u32 v0, v0, v1, 5
-; GFX10-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX11-LABEL: multi_gep_as3:
-; GFX11: ; %bb.0: ; %entry
-; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX11-NEXT: s_delay_alu instid0(VALU_DEP_1)
-; GFX11-NEXT: v_add3_u32 v0, v0, v1, 5
-; GFX11-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX12-LABEL: multi_gep_as3:
-; GFX12: ; %bb.0: ; %entry
-; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
-; GFX12-NEXT: s_wait_expcnt 0x0
-; GFX12-NEXT: s_wait_samplecnt 0x0
-; GFX12-NEXT: s_wait_bvhcnt 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_lshlrev_b32_e32 v1, 2, v1
-; GFX12-NEXT: s_delay_alu instid0(VALU_DEP_1)
-; GFX12-NEXT: v_add3_u32 v0, v0, v1, 5
-; GFX12-NEXT: s_setpc_b64 s[30:31]
-entry:
- %gep1 = getelementptr inbounds i32, ptr addrspace(3) %p, i32 %offset
- %gep2 = getelementptr inbounds i8, ptr addrspace(3) %gep1, i32 5
- ret ptr addrspace(3) %gep2
-}
-
define amdgpu_kernel void @multi_gep_as0_uniform(ptr %p, i64 %offset, ptr %ret) {
; GFX8-LABEL: multi_gep_as0_uniform:
; GFX8: ; %bb.0: ; %entry
@@ -542,158 +347,6 @@ entry:
ret void
}
-define amdgpu_kernel void @multi_gep_as3_uniform(ptr addrspace(3) %p, i32 %offset, ptr %ret) {
-; GFX8-LABEL: multi_gep_as3_uniform:
-; GFX8: ; %bb.0: ; %entry
-; GFX8-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
-; GFX8-NEXT: s_add_i32 s12, s12, s17
-; GFX8-NEXT: s_mov_b32 flat_scratch_lo, s13
-; GFX8-NEXT: s_lshr_b32 flat_scratch_hi, s12, 8
-; GFX8-NEXT: s_waitcnt lgkmcnt(0)
-; GFX8-NEXT: s_lshl_b32 s1, s1, 2
-; GFX8-NEXT: s_add_i32 s0, s0, s1
-; GFX8-NEXT: s_add_i32 s0, s0, 5
-; GFX8-NEXT: v_mov_b32_e32 v0, s2
-; GFX8-NEXT: v_mov_b32_e32 v1, s3
-; GFX8-NEXT: v_mov_b32_e32 v2, s0
-; GFX8-NEXT: flat_store_dword v[0:1], v2
-; GFX8-NEXT: s_endpgm
-;
-; GFX942-LABEL: multi_gep_as3_uniform:
-; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_load_dwordx4 s[0:3], s[4:5], 0x0
-; GFX942-NEXT: s_waitcnt lgkmcnt(0)
-; GFX942-NEXT: s_lshl_b32 s1, s1, 2
-; GFX942-NEXT: s_add_i32 s0, s0, s1
-; GFX942-NEXT: s_add_i32 s0, s0, 5
-; GFX942-NEXT: v_mov_b64_e32 v[0:1], s[2:3]
-; GFX942-NEXT: v_mov_b32_e32 v2, s0
-; GFX942-NEXT: flat_store_dword v[0:1], v2
-; GFX942-NEXT: s_endpgm
-;
-; GFX10-LABEL: multi_gep_as3_uniform:
-; GFX10: ; %bb.0: ; %entry
-; GFX10-NEXT: s_add_u32 s12, s12, s17
-; GFX10-NEXT: s_addc_u32 s13, s13, 0
-; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s12
-; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s13
-; GFX10-NEXT: s_load_dwordx4 s[0:3], s[8:9], 0x0
-; GFX10-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-NEXT: s_lshl_b32 s1, s1, 2
-; GFX10-NEXT: v_mov_b32_e32 v0, s2
-; GFX10-NEXT: s_add_i32 s0, s0, s1
-; GFX10-NEXT: v_mov_b32_e32 v1, s3
-; GFX10-NEXT: s_add_i32 s0, s0, 5
-; GFX10-NEXT: v_mov_b32_e32 v2, s0
-; GFX10-NEXT: flat_store_dword v[0:1], v2
-; GFX10-NEXT: s_endpgm
-;
-; GFX11-LABEL: multi_gep_as3_uniform:
-; GFX11: ; %bb.0: ; %entry
-; GFX11-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
-; GFX11-NEXT: s_waitcnt lgkmcnt(0)
-; GFX11-NEXT: s_lshl_b32 s1, s1, 2
-; GFX11-NEXT: v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3
-; GFX11-NEXT: s_add_i32 s0, s0, s1
-; GFX11-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; GFX11-NEXT: s_add_i32 s0, s0, 5
-; GFX11-NEXT: v_mov_b32_e32 v2, s0
-; GFX11-NEXT: flat_store_b32 v[0:1], v2
-; GFX11-NEXT: s_endpgm
-;
-; GFX12-LABEL: multi_gep_as3_uniform:
-; GFX12: ; %bb.0: ; %entry
-; GFX12-NEXT: s_load_b128 s[0:3], s[4:5], 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: s_lshl_b32 s1, s1, 2
-; GFX12-NEXT: v_dual_mov_b32 v0, s2 :: v_dual_mov_b32 v1, s3
-; GFX12-NEXT: s_add_co_i32 s0, s0, s1
-; GFX12-NEXT: s_delay_alu instid0(SALU_CYCLE_1) | instskip(NEXT) | instid1(SALU_CYCLE_1)
-; GFX12-NEXT: s_add_co_i32 s0, s0, 5
-; GFX12-NEXT: v_mov_b32_e32 v2, s0
-; GFX12-NEXT: flat_store_b32 v[0:1], v2
-; GFX12-NEXT: s_endpgm
-entry:
- %gep1 = getelementptr inbounds i32, ptr addrspace(3) %p, i32 %offset
- %gep2 = getelementptr inbounds i8, ptr addrspace(3) %gep1, i32 5
- store ptr addrspace(3) %gep2, ptr %ret
- ret void
-}
-
-; Check that SIShrinkInstructions triggers and generates s_addk_i32.
-define amdgpu_kernel void @use_s_addk_i32(ptr addrspace(3) %p, i32 %offset, ptr %ret) {
-; GFX8-LABEL: use_s_addk_i32:
-; GFX8: ; %bb.0: ; %entry
-; GFX8-NEXT: s_load_dword s2, s[8:9], 0x0
-; GFX8-NEXT: s_load_dwordx2 s[0:1], s[8:9], 0x8
-; GFX8-NEXT: s_add_i32 s12, s12, s17
-; GFX8-NEXT: s_mov_b32 flat_scratch_lo, s13
-; GFX8-NEXT: s_lshr_b32 flat_scratch_hi, s12, 8
-; GFX8-NEXT: s_waitcnt lgkmcnt(0)
-; GFX8-NEXT: s_addk_i32 s2, 0x4b0
-; GFX8-NEXT: v_mov_b32_e32 v0, s0
-; GFX8-NEXT: v_mov_b32_e32 v1, s1
-; GFX8-NEXT: v_mov_b32_e32 v2, s2
-; GFX8-NEXT: flat_store_dword v[0:1], v2
-; GFX8-NEXT: s_endpgm
-;
-; GFX942-LABEL: use_s_addk_i32:
-; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_load_dword s2, s[4:5], 0x0
-; GFX942-NEXT: s_load_dwordx2 s[0:1], s[4:5], 0x8
-; GFX942-NEXT: s_waitcnt lgkmcnt(0)
-; GFX942-NEXT: s_addk_i32 s2, 0x4b0
-; GFX942-NEXT: v_mov_b64_e32 v[0:1], s[0:1]
-; GFX942-NEXT: v_mov_b32_e32 v2, s2
-; GFX942-NEXT: flat_store_dword v[0:1], v2
-; GFX942-NEXT: s_endpgm
-;
-; GFX10-LABEL: use_s_addk_i32:
-; GFX10: ; %bb.0: ; %entry
-; GFX10-NEXT: s_add_u32 s12, s12, s17
-; GFX10-NEXT: s_addc_u32 s13, s13, 0
-; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_LO), s12
-; GFX10-NEXT: s_setreg_b32 hwreg(HW_REG_FLAT_SCR_HI), s13
-; GFX10-NEXT: s_clause 0x1
-; GFX10-NEXT: s_load_dword s2, s[8:9], 0x0
-; GFX10-NEXT: s_load_dwordx2 s[0:1], s[8:9], 0x8
-; GFX10-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-NEXT: s_addk_i32 s2, 0x4b0
-; GFX10-NEXT: v_mov_b32_e32 v0, s0
-; GFX10-NEXT: v_mov_b32_e32 v1, s1
-; GFX10-NEXT: v_mov_b32_e32 v2, s2
-; GFX10-NEXT: flat_store_dword v[0:1], v2
-; GFX10-NEXT: s_endpgm
-;
-; GFX11-LABEL: use_s_addk_i32:
-; GFX11: ; %bb.0: ; %entry
-; GFX11-NEXT: s_clause 0x1
-; GFX11-NEXT: s_load_b32 s2, s[4:5], 0x0
-; GFX11-NEXT: s_load_b64 s[0:1], s[4:5], 0x8
-; GFX11-NEXT: s_waitcnt lgkmcnt(0)
-; GFX11-NEXT: s_addk_i32 s2, 0x4b0
-; GFX11-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
-; GFX11-NEXT: v_mov_b32_e32 v2, s2
-; GFX11-NEXT: flat_store_b32 v[0:1], v2
-; GFX11-NEXT: s_endpgm
-;
-; GFX12-LABEL: use_s_addk_i32:
-; GFX12: ; %bb.0: ; %entry
-; GFX12-NEXT: s_clause 0x1
-; GFX12-NEXT: s_load_b32 s2, s[4:5], 0x0
-; GFX12-NEXT: s_load_b64 s[0:1], s[4:5], 0x8
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: s_addk_co_i32 s2, 0x4b0
-; GFX12-NEXT: v_dual_mov_b32 v0, s0 :: v_dual_mov_b32 v1, s1
-; GFX12-NEXT: v_mov_b32_e32 v2, s2
-; GFX12-NEXT: flat_store_b32 v[0:1], v2
-; GFX12-NEXT: s_endpgm
-entry:
- %gep = getelementptr inbounds i32, ptr addrspace(3) %p, i32 300
- store ptr addrspace(3) %gep, ptr %ret
- ret void
-}
-
; Check that constant offsets are folded into memory instructions.
define void @fold_as0(ptr %from, ptr %to) {
@@ -815,64 +468,6 @@ entry:
ret void
}
-define void @fold_as3(ptr addrspace(3) %from, ptr addrspace(3) %to) {
-; GFX8-LABEL: fold_as3:
-; GFX8: ; %bb.0: ; %entry
-; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT: s_mov_b32 m0, -1
-; GFX8-NEXT: ds_read_b32 v0, v0 offset:8
-; GFX8-NEXT: s_waitcnt lgkmcnt(0)
-; GFX8-NEXT: ds_write_b32 v1, v0 offset:8
-; GFX8-NEXT: s_waitcnt lgkmcnt(0)
-; GFX8-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX942-LABEL: fold_as3:
-; GFX942: ; %bb.0: ; %entry
-; GFX942-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX942-NEXT: ds_read_b32 v0, v0 offset:8
-; GFX942-NEXT: s_waitcnt lgkmcnt(0)
-; GFX942-NEXT: ds_write_b32 v1, v0 offset:8
-; GFX942-NEXT: s_waitcnt lgkmcnt(0)
-; GFX942-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX10-LABEL: fold_as3:
-; GFX10: ; %bb.0: ; %entry
-; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX10-NEXT: ds_read_b32 v0, v0 offset:8
-; GFX10-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-NEXT: ds_write_b32 v1, v0 offset:8
-; GFX10-NEXT: s_waitcnt lgkmcnt(0)
-; GFX10-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX11-LABEL: fold_as3:
-; GFX11: ; %bb.0: ; %entry
-; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11-NEXT: ds_load_b32 v0, v0 offset:8
-; GFX11-NEXT: s_waitcnt lgkmcnt(0)
-; GFX11-NEXT: ds_store_b32 v1, v0 offset:8
-; GFX11-NEXT: s_waitcnt lgkmcnt(0)
-; GFX11-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX12-LABEL: fold_as3:
-; GFX12: ; %bb.0: ; %entry
-; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
-; GFX12-NEXT: s_wait_expcnt 0x0
-; GFX12-NEXT: s_wait_samplecnt 0x0
-; GFX12-NEXT: s_wait_bvhcnt 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: ds_load_b32 v0, v0 offset:8
-; GFX12-NEXT: s_wait_dscnt 0x0
-; GFX12-NEXT: ds_store_b32 v1, v0 offset:8
-; GFX12-NEXT: s_wait_dscnt 0x0
-; GFX12-NEXT: s_setpc_b64 s[30:31]
-entry:
- %gep.from = getelementptr inbounds i8, ptr addrspace(3) %from, i32 8
- %val = load i32, ptr addrspace(3) %gep.from, align 4
- %gep.to = getelementptr inbounds i8, ptr addrspace(3) %to, i32 8
- store i32 %val, ptr addrspace(3) %gep.to, align 4
- ret void
-}
-
define void @fold_as4(ptr addrspace(4) %from, ptr addrspace(1) %to) {
; GFX8-LABEL: fold_as4:
; GFX8: ; %bb.0: ; %entry
@@ -931,83 +526,12 @@ entry:
ret void
}
-define void @fold_as5(ptr addrspace(5) %from, ptr addrspace(5) %to) {
-; GFX8-LABEL: fold_as5:
-; GFX8: ; %bb.0: ; %entry
-; GFX8-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX8-NEXT: v_add_u32_e32 v0, vcc, 8, v0
-; GFX8-NEXT: buffer_load_dword v0, v0, s[0:3], 0 offen
-; GFX8-NEXT: v_add_u32_e32 v1, vcc, 8, v1
-; GFX8-NEXT: s_waitcnt vmcnt(0)
-; GFX8-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen
-; GFX8-NEXT: s_waitcnt vmcnt(0)
-; GFX8-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX942_PTRADD-LABEL: fold_as5:
-; GFX942_PTRADD: ; %bb.0: ; %entry
-; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX942_PTRADD-NEXT: v_add_u32_e32 v0, 8, v0
-; GFX942_PTRADD-NEXT: scratch_load_dword v0, v0, off
-; GFX942_PTRADD-NEXT: v_add_u32_e32 v1, 8, v1
-; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0)
-; GFX942_PTRADD-NEXT: scratch_store_dword v1, v0, off
-; GFX942_PTRADD-NEXT: s_waitcnt vmcnt(0)
-; GFX942_PTRADD-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX942_LEGACY-LABEL: fold_as5:
-; GFX942_LEGACY: ; %bb.0: ; %entry
-; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX942_LEGACY-NEXT: scratch_load_dword v0, v0, off offset:8
-; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0)
-; GFX942_LEGACY-NEXT: scratch_store_dword v1, v0, off offset:8
-; GFX942_LEGACY-NEXT: s_waitcnt vmcnt(0)
-; GFX942_LEGACY-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX10-LABEL: fold_as5:
-; GFX10: ; %bb.0: ; %entry
-; GFX10-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX10-NEXT: buffer_load_dword v0, v0, s[0:3], 0 offen offset:8
-; GFX10-NEXT: s_waitcnt vmcnt(0)
-; GFX10-NEXT: buffer_store_dword v0, v1, s[0:3], 0 offen offset:8
-; GFX10-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX11_PTRADD-LABEL: fold_as5:
-; GFX11_PTRADD: ; %bb.0: ; %entry
-; GFX11_PTRADD-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11_PTRADD-NEXT: v_add_nc_u32_e32 v0, 8, v0
-; GFX11_PTRADD-NEXT: v_add_nc_u32_e32 v1, 8, v1
-; GFX11_PTRADD-NEXT: scratch_load_b32 v0, v0, off
-; GFX11_PTRADD-NEXT: s_waitcnt vmcnt(0)
-; GFX11_PTRADD-NEXT: scratch_store_b32 v1, v0, off
-; GFX11_PTRADD-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX11_LEGACY-LABEL: fold_as5:
-; GFX11_LEGACY: ; %bb.0: ; %entry
-; GFX11_LEGACY-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11_LEGACY-NEXT: scratch_load_b32 v0, v0, off offset:8
-; GFX11_LEGACY-NEXT: s_waitcnt vmcnt(0)
-; GFX11_LEGACY-NEXT: scratch_store_b32 v1, v0, off offset:8
-; GFX11_LEGACY-NEXT: s_setpc_b64 s[30:31]
-;
-; GFX12-LABEL: fold_as5:
-; GFX12: ; %bb.0: ; %entry
-; GFX12-NEXT: s_wait_loadcnt_dscnt 0x0
-; GFX12-NEXT: s_wait_expcnt 0x0
-; GFX12-NEXT: s_wait_samplecnt 0x0
-; GFX12-NEXT: s_wait_bvhcnt 0x0
-; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: scratch_load_b32 v0, v0, off offset:8
-; GFX12-NEXT: s_wait_loadcnt 0x0
-; GFX12-NEXT: scratch_store_b32 v1, v0, off offset:8
-; GFX12-NEXT: s_setpc_b64 s[30:31]
-entry:
- %gep.from = getelementptr inbounds i8, ptr addrspace(5) %from, i32 8
- %val = load i32, ptr addrspace(5) %gep.from, align 4
- %gep.to = getelementptr inbounds i8, ptr addrspace(5) %to, i32 8
- store i32 %val, ptr addrspace(5) %gep.to, align 4
- ret void
-}
-
;; NOTE: These prefixes are unused and the list is autogenerated. Do not add tests below this line:
+; GFX10_LEGACY: {{.*}}
+; GFX10_PTRADD: {{.*}}
+; GFX11_LEGACY: {{.*}}
+; GFX11_PTRADD: {{.*}}
+; GFX12_LEGACY: {{.*}}
+; GFX12_PTRADD: {{.*}}
; GFX8_LEGACY: {{.*}}
; GFX8_PTRADD: {{.*}}
>From bcc798e7472cca699de05ba92f370bc60dc9ec5f Mon Sep 17 00:00:00 2001
From: Fabian Ritter <ritter.x2a at gmail.com>
Date: Fri, 13 Jun 2025 13:22:37 +0200
Subject: [PATCH 4/4] Update SIISelLowering.cpp
---
llvm/lib/Target/AMDGPU/SIISelLowering.cpp | 4631 +--------------------
1 file changed, 2 insertions(+), 4629 deletions(-)
diff --git a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
index 62573c2439bba..eebbacbbb92e2 100644
--- a/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/SIISelLowering.cpp
@@ -61,7 +61,7 @@ static cl::opt<bool> UseDivergentRegisterIndexing(
cl::desc("Use indirect register addressing for divergent indexes"),
cl::init(false));
-// TODO This option should be removed once we switch to always using PTRADD in
+// TODO: This option should be removed once we switch to always using PTRADD in
// the SelectionDAG.
static cl::opt<bool> UseSelectionDAGPTRADD(
"amdgpu-use-sdag-ptradd", cl::Hidden,
@@ -13105,4631 +13105,4 @@ SITargetLowering::performSignExtendInRegCombine(SDNode *N,
if (((Src.getOpcode() == AMDGPUISD::SBUFFER_LOAD_UBYTE &&
VTSign->getVT() == MVT::i8) ||
(Src.getOpcode() == AMDGPUISD::SBUFFER_LOAD_USHORT &&
- VTSign->getVT() == MVT::i16))) {
- assert(Subtarget->hasScalarSubwordLoads() &&
- "s_buffer_load_{u8, i8} are supported "
- "in GFX12 (or newer) architectures.");
- EVT VT = Src.getValueType();
- unsigned Opc = (Src.getOpcode() == AMDGPUISD::SBUFFER_LOAD_UBYTE)
- ? AMDGPUISD::SBUFFER_LOAD_BYTE
- : AMDGPUISD::SBUFFER_LOAD_SHORT;
- SDLoc DL(N);
- SDVTList ResList = DCI.DAG.getVTList(MVT::i32);
- SDValue Ops[] = {
- Src.getOperand(0), // source register
- Src.getOperand(1), // offset
- Src.getOperand(2) // cachePolicy
- };
- auto *M = cast<MemSDNode>(Src);
- SDValue BufferLoad = DCI.DAG.getMemIntrinsicNode(
- Opc, DL, ResList, Ops, M->getMemoryVT(), M->getMemOperand());
- SDValue LoadVal = DCI.DAG.getNode(ISD::TRUNCATE, DL, VT, BufferLoad);
- return LoadVal;
- }
- if (((Src.getOpcode() == AMDGPUISD::BUFFER_LOAD_UBYTE &&
- VTSign->getVT() == MVT::i8) ||
- (Src.getOpcode() == AMDGPUISD::BUFFER_LOAD_USHORT &&
- VTSign->getVT() == MVT::i16)) &&
- Src.hasOneUse()) {
- auto *M = cast<MemSDNode>(Src);
- SDValue Ops[] = {Src.getOperand(0), // Chain
- Src.getOperand(1), // rsrc
- Src.getOperand(2), // vindex
- Src.getOperand(3), // voffset
- Src.getOperand(4), // soffset
- Src.getOperand(5), // offset
- Src.getOperand(6), Src.getOperand(7)};
- // replace with BUFFER_LOAD_BYTE/SHORT
- SDVTList ResList =
- DCI.DAG.getVTList(MVT::i32, Src.getOperand(0).getValueType());
- unsigned Opc = (Src.getOpcode() == AMDGPUISD::BUFFER_LOAD_UBYTE)
- ? AMDGPUISD::BUFFER_LOAD_BYTE
- : AMDGPUISD::BUFFER_LOAD_SHORT;
- SDValue BufferLoadSignExt = DCI.DAG.getMemIntrinsicNode(
- Opc, SDLoc(N), ResList, Ops, M->getMemoryVT(), M->getMemOperand());
- return DCI.DAG.getMergeValues(
- {BufferLoadSignExt, BufferLoadSignExt.getValue(1)}, SDLoc(N));
- }
- return SDValue();
-}
-
-SDValue SITargetLowering::performClassCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- SDValue Mask = N->getOperand(1);
-
- // fp_class x, 0 -> false
- if (isNullConstant(Mask))
- return DAG.getConstant(0, SDLoc(N), MVT::i1);
-
- if (N->getOperand(0).isUndef())
- return DAG.getUNDEF(MVT::i1);
-
- return SDValue();
-}
-
-SDValue SITargetLowering::performRcpCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- EVT VT = N->getValueType(0);
- SDValue N0 = N->getOperand(0);
-
- if (N0.isUndef()) {
- return DCI.DAG.getConstantFP(APFloat::getQNaN(VT.getFltSemantics()),
- SDLoc(N), VT);
- }
-
- if (VT == MVT::f32 && (N0.getOpcode() == ISD::UINT_TO_FP ||
- N0.getOpcode() == ISD::SINT_TO_FP)) {
- return DCI.DAG.getNode(AMDGPUISD::RCP_IFLAG, SDLoc(N), VT, N0,
- N->getFlags());
- }
-
- // TODO: Could handle f32 + amdgcn.sqrt but probably never reaches here.
- if ((VT == MVT::f16 && N0.getOpcode() == ISD::FSQRT) &&
- N->getFlags().hasAllowContract() && N0->getFlags().hasAllowContract()) {
- return DCI.DAG.getNode(AMDGPUISD::RSQ, SDLoc(N), VT, N0.getOperand(0),
- N->getFlags());
- }
-
- return AMDGPUTargetLowering::performRcpCombine(N, DCI);
-}
-
-bool SITargetLowering::isCanonicalized(SelectionDAG &DAG, SDValue Op,
- unsigned MaxDepth) const {
- unsigned Opcode = Op.getOpcode();
- if (Opcode == ISD::FCANONICALIZE)
- return true;
-
- if (auto *CFP = dyn_cast<ConstantFPSDNode>(Op)) {
- const auto &F = CFP->getValueAPF();
- if (F.isNaN() && F.isSignaling())
- return false;
- if (!F.isDenormal())
- return true;
-
- DenormalMode Mode =
- DAG.getMachineFunction().getDenormalMode(F.getSemantics());
- return Mode == DenormalMode::getIEEE();
- }
-
- // If source is a result of another standard FP operation it is already in
- // canonical form.
- if (MaxDepth == 0)
- return false;
-
- switch (Opcode) {
- // These will flush denorms if required.
- case ISD::FADD:
- case ISD::FSUB:
- case ISD::FMUL:
- case ISD::FCEIL:
- case ISD::FFLOOR:
- case ISD::FMA:
- case ISD::FMAD:
- case ISD::FSQRT:
- case ISD::FDIV:
- case ISD::FREM:
- case ISD::FP_ROUND:
- case ISD::FP_EXTEND:
- case ISD::FP16_TO_FP:
- case ISD::FP_TO_FP16:
- case ISD::BF16_TO_FP:
- case ISD::FP_TO_BF16:
- case ISD::FLDEXP:
- case AMDGPUISD::FMUL_LEGACY:
- case AMDGPUISD::FMAD_FTZ:
- case AMDGPUISD::RCP:
- case AMDGPUISD::RSQ:
- case AMDGPUISD::RSQ_CLAMP:
- case AMDGPUISD::RCP_LEGACY:
- case AMDGPUISD::RCP_IFLAG:
- case AMDGPUISD::LOG:
- case AMDGPUISD::EXP:
- case AMDGPUISD::DIV_SCALE:
- case AMDGPUISD::DIV_FMAS:
- case AMDGPUISD::DIV_FIXUP:
- case AMDGPUISD::FRACT:
- case AMDGPUISD::CVT_PKRTZ_F16_F32:
- case AMDGPUISD::CVT_F32_UBYTE0:
- case AMDGPUISD::CVT_F32_UBYTE1:
- case AMDGPUISD::CVT_F32_UBYTE2:
- case AMDGPUISD::CVT_F32_UBYTE3:
- case AMDGPUISD::FP_TO_FP16:
- case AMDGPUISD::SIN_HW:
- case AMDGPUISD::COS_HW:
- return true;
-
- // It can/will be lowered or combined as a bit operation.
- // Need to check their inputs recursively to handle them.
- case ISD::FNEG:
- case ISD::FABS:
- case ISD::FCOPYSIGN:
- return isCanonicalized(DAG, Op.getOperand(0), MaxDepth - 1);
-
- case ISD::AND:
- if (Op.getValueType() == MVT::i32) {
- // Be careful as we only know it is a bitcast floating point type. It
- // could be f32 or v2f16; we have no way of knowing. Luckily the constant
- // value that we optimize for, which comes up in fp32 to bf16 conversions,
- // is valid to optimize for all types.
- if (auto *RHS = dyn_cast<ConstantSDNode>(Op.getOperand(1))) {
- if (RHS->getZExtValue() == 0xffff0000) {
- return isCanonicalized(DAG, Op.getOperand(0), MaxDepth - 1);
- }
- }
- }
- break;
-
- case ISD::FSIN:
- case ISD::FCOS:
- case ISD::FSINCOS:
- return Op.getValueType().getScalarType() != MVT::f16;
-
- case ISD::FMINNUM:
- case ISD::FMAXNUM:
- case ISD::FMINNUM_IEEE:
- case ISD::FMAXNUM_IEEE:
- case ISD::FMINIMUM:
- case ISD::FMAXIMUM:
- case ISD::FMINIMUMNUM:
- case ISD::FMAXIMUMNUM:
- case AMDGPUISD::CLAMP:
- case AMDGPUISD::FMED3:
- case AMDGPUISD::FMAX3:
- case AMDGPUISD::FMIN3:
- case AMDGPUISD::FMAXIMUM3:
- case AMDGPUISD::FMINIMUM3: {
- // FIXME: Shouldn't treat the generic operations differently based on these.
- // However, we aren't really required to flush the result from
- // minnum/maxnum.
-
- // snans will be quieted, so we only need to worry about denormals.
- if (Subtarget->supportsMinMaxDenormModes() ||
- // FIXME: denormalsEnabledForType is broken for dynamic
- denormalsEnabledForType(DAG, Op.getValueType()))
- return true;
-
- // Flushing may be required.
- // In pre-GFX9 targets V_MIN_F32 and others do not flush denorms. For such
- // targets we need to check their input recursively.
-
- // FIXME: Does this apply with clamp? It's implemented with max.
- for (unsigned I = 0, E = Op.getNumOperands(); I != E; ++I) {
- if (!isCanonicalized(DAG, Op.getOperand(I), MaxDepth - 1))
- return false;
- }
-
- return true;
- }
- case ISD::SELECT: {
- return isCanonicalized(DAG, Op.getOperand(1), MaxDepth - 1) &&
- isCanonicalized(DAG, Op.getOperand(2), MaxDepth - 1);
- }
- case ISD::BUILD_VECTOR: {
- for (unsigned i = 0, e = Op.getNumOperands(); i != e; ++i) {
- SDValue SrcOp = Op.getOperand(i);
- if (!isCanonicalized(DAG, SrcOp, MaxDepth - 1))
- return false;
- }
-
- return true;
- }
- case ISD::EXTRACT_VECTOR_ELT:
- case ISD::EXTRACT_SUBVECTOR: {
- return isCanonicalized(DAG, Op.getOperand(0), MaxDepth - 1);
- }
- case ISD::INSERT_VECTOR_ELT: {
- return isCanonicalized(DAG, Op.getOperand(0), MaxDepth - 1) &&
- isCanonicalized(DAG, Op.getOperand(1), MaxDepth - 1);
- }
- case ISD::UNDEF:
- // Could be anything.
- return false;
-
- case ISD::BITCAST:
- // TODO: This is incorrect as it loses track of the operand's type. We may
- // end up effectively bitcasting from f32 to v2f16 or vice versa, and the
- // same bits that are canonicalized in one type need not be in the other.
- return isCanonicalized(DAG, Op.getOperand(0), MaxDepth - 1);
- case ISD::TRUNCATE: {
- // Hack around the mess we make when legalizing extract_vector_elt
- if (Op.getValueType() == MVT::i16) {
- SDValue TruncSrc = Op.getOperand(0);
- if (TruncSrc.getValueType() == MVT::i32 &&
- TruncSrc.getOpcode() == ISD::BITCAST &&
- TruncSrc.getOperand(0).getValueType() == MVT::v2f16) {
- return isCanonicalized(DAG, TruncSrc.getOperand(0), MaxDepth - 1);
- }
- }
- return false;
- }
- case ISD::INTRINSIC_WO_CHAIN: {
- unsigned IntrinsicID = Op.getConstantOperandVal(0);
- // TODO: Handle more intrinsics
- switch (IntrinsicID) {
- case Intrinsic::amdgcn_cvt_pkrtz:
- case Intrinsic::amdgcn_cubeid:
- case Intrinsic::amdgcn_frexp_mant:
- case Intrinsic::amdgcn_fdot2:
- case Intrinsic::amdgcn_rcp:
- case Intrinsic::amdgcn_rsq:
- case Intrinsic::amdgcn_rsq_clamp:
- case Intrinsic::amdgcn_rcp_legacy:
- case Intrinsic::amdgcn_rsq_legacy:
- case Intrinsic::amdgcn_trig_preop:
- case Intrinsic::amdgcn_log:
- case Intrinsic::amdgcn_exp2:
- case Intrinsic::amdgcn_sqrt:
- return true;
- default:
- break;
- }
-
- break;
- }
- default:
- break;
- }
-
- // FIXME: denormalsEnabledForType is broken for dynamic
- return denormalsEnabledForType(DAG, Op.getValueType()) &&
- DAG.isKnownNeverSNaN(Op);
-}
-
-bool SITargetLowering::isCanonicalized(Register Reg, const MachineFunction &MF,
- unsigned MaxDepth) const {
- const MachineRegisterInfo &MRI = MF.getRegInfo();
- MachineInstr *MI = MRI.getVRegDef(Reg);
- unsigned Opcode = MI->getOpcode();
-
- if (Opcode == AMDGPU::G_FCANONICALIZE)
- return true;
-
- std::optional<FPValueAndVReg> FCR;
- // Constant splat (can be padded with undef) or scalar constant.
- if (mi_match(Reg, MRI, MIPatternMatch::m_GFCstOrSplat(FCR))) {
- if (FCR->Value.isSignaling())
- return false;
- if (!FCR->Value.isDenormal())
- return true;
-
- DenormalMode Mode = MF.getDenormalMode(FCR->Value.getSemantics());
- return Mode == DenormalMode::getIEEE();
- }
-
- if (MaxDepth == 0)
- return false;
-
- switch (Opcode) {
- case AMDGPU::G_FADD:
- case AMDGPU::G_FSUB:
- case AMDGPU::G_FMUL:
- case AMDGPU::G_FCEIL:
- case AMDGPU::G_FFLOOR:
- case AMDGPU::G_FRINT:
- case AMDGPU::G_FNEARBYINT:
- case AMDGPU::G_INTRINSIC_FPTRUNC_ROUND:
- case AMDGPU::G_INTRINSIC_TRUNC:
- case AMDGPU::G_INTRINSIC_ROUNDEVEN:
- case AMDGPU::G_FMA:
- case AMDGPU::G_FMAD:
- case AMDGPU::G_FSQRT:
- case AMDGPU::G_FDIV:
- case AMDGPU::G_FREM:
- case AMDGPU::G_FPOW:
- case AMDGPU::G_FPEXT:
- case AMDGPU::G_FLOG:
- case AMDGPU::G_FLOG2:
- case AMDGPU::G_FLOG10:
- case AMDGPU::G_FPTRUNC:
- case AMDGPU::G_AMDGPU_RCP_IFLAG:
- case AMDGPU::G_AMDGPU_CVT_F32_UBYTE0:
- case AMDGPU::G_AMDGPU_CVT_F32_UBYTE1:
- case AMDGPU::G_AMDGPU_CVT_F32_UBYTE2:
- case AMDGPU::G_AMDGPU_CVT_F32_UBYTE3:
- return true;
- case AMDGPU::G_FNEG:
- case AMDGPU::G_FABS:
- case AMDGPU::G_FCOPYSIGN:
- return isCanonicalized(MI->getOperand(1).getReg(), MF, MaxDepth - 1);
- case AMDGPU::G_FMINNUM:
- case AMDGPU::G_FMAXNUM:
- case AMDGPU::G_FMINNUM_IEEE:
- case AMDGPU::G_FMAXNUM_IEEE:
- case AMDGPU::G_FMINIMUM:
- case AMDGPU::G_FMAXIMUM:
- case AMDGPU::G_FMINIMUMNUM:
- case AMDGPU::G_FMAXIMUMNUM: {
- if (Subtarget->supportsMinMaxDenormModes() ||
- // FIXME: denormalsEnabledForType is broken for dynamic
- denormalsEnabledForType(MRI.getType(Reg), MF))
- return true;
-
- [[fallthrough]];
- }
- case AMDGPU::G_BUILD_VECTOR:
- for (const MachineOperand &MO : llvm::drop_begin(MI->operands()))
- if (!isCanonicalized(MO.getReg(), MF, MaxDepth - 1))
- return false;
- return true;
- case AMDGPU::G_INTRINSIC:
- case AMDGPU::G_INTRINSIC_CONVERGENT:
- switch (cast<GIntrinsic>(MI)->getIntrinsicID()) {
- case Intrinsic::amdgcn_fmul_legacy:
- case Intrinsic::amdgcn_fmad_ftz:
- case Intrinsic::amdgcn_sqrt:
- case Intrinsic::amdgcn_fmed3:
- case Intrinsic::amdgcn_sin:
- case Intrinsic::amdgcn_cos:
- case Intrinsic::amdgcn_log:
- case Intrinsic::amdgcn_exp2:
- case Intrinsic::amdgcn_log_clamp:
- case Intrinsic::amdgcn_rcp:
- case Intrinsic::amdgcn_rcp_legacy:
- case Intrinsic::amdgcn_rsq:
- case Intrinsic::amdgcn_rsq_clamp:
- case Intrinsic::amdgcn_rsq_legacy:
- case Intrinsic::amdgcn_div_scale:
- case Intrinsic::amdgcn_div_fmas:
- case Intrinsic::amdgcn_div_fixup:
- case Intrinsic::amdgcn_fract:
- case Intrinsic::amdgcn_cvt_pkrtz:
- case Intrinsic::amdgcn_cubeid:
- case Intrinsic::amdgcn_cubema:
- case Intrinsic::amdgcn_cubesc:
- case Intrinsic::amdgcn_cubetc:
- case Intrinsic::amdgcn_frexp_mant:
- case Intrinsic::amdgcn_fdot2:
- case Intrinsic::amdgcn_trig_preop:
- return true;
- default:
- break;
- }
-
- [[fallthrough]];
- default:
- return false;
- }
-
- llvm_unreachable("invalid operation");
-}
-
-// Constant fold canonicalize.
-SDValue SITargetLowering::getCanonicalConstantFP(SelectionDAG &DAG,
- const SDLoc &SL, EVT VT,
- const APFloat &C) const {
- // Flush denormals to 0 if not enabled.
- if (C.isDenormal()) {
- DenormalMode Mode =
- DAG.getMachineFunction().getDenormalMode(C.getSemantics());
- if (Mode == DenormalMode::getPreserveSign()) {
- return DAG.getConstantFP(
- APFloat::getZero(C.getSemantics(), C.isNegative()), SL, VT);
- }
-
- if (Mode != DenormalMode::getIEEE())
- return SDValue();
- }
-
- if (C.isNaN()) {
- APFloat CanonicalQNaN = APFloat::getQNaN(C.getSemantics());
- if (C.isSignaling()) {
- // Quiet a signaling NaN.
- // FIXME: Is this supposed to preserve payload bits?
- return DAG.getConstantFP(CanonicalQNaN, SL, VT);
- }
-
- // Make sure it is the canonical NaN bitpattern.
- //
- // TODO: Can we use -1 as the canonical NaN value since it's an inline
- // immediate?
- if (C.bitcastToAPInt() != CanonicalQNaN.bitcastToAPInt())
- return DAG.getConstantFP(CanonicalQNaN, SL, VT);
- }
-
- // Already canonical.
- return DAG.getConstantFP(C, SL, VT);
-}
-
-static bool vectorEltWillFoldAway(SDValue Op) {
- return Op.isUndef() || isa<ConstantFPSDNode>(Op);
-}
-
-SDValue
-SITargetLowering::performFCanonicalizeCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- SDValue N0 = N->getOperand(0);
- EVT VT = N->getValueType(0);
-
- // fcanonicalize undef -> qnan
- if (N0.isUndef()) {
- APFloat QNaN = APFloat::getQNaN(VT.getFltSemantics());
- return DAG.getConstantFP(QNaN, SDLoc(N), VT);
- }
-
- if (ConstantFPSDNode *CFP = isConstOrConstSplatFP(N0)) {
- EVT VT = N->getValueType(0);
- return getCanonicalConstantFP(DAG, SDLoc(N), VT, CFP->getValueAPF());
- }
-
- // fcanonicalize (build_vector x, k) -> build_vector (fcanonicalize x),
- // (fcanonicalize k)
- //
- // fcanonicalize (build_vector x, undef) -> build_vector (fcanonicalize x), 0
-
- // TODO: This could be better with wider vectors that will be split to v2f16,
- // and to consider uses since there aren't that many packed operations.
- if (N0.getOpcode() == ISD::BUILD_VECTOR && VT == MVT::v2f16 &&
- isTypeLegal(MVT::v2f16)) {
- SDLoc SL(N);
- SDValue NewElts[2];
- SDValue Lo = N0.getOperand(0);
- SDValue Hi = N0.getOperand(1);
- EVT EltVT = Lo.getValueType();
-
- if (vectorEltWillFoldAway(Lo) || vectorEltWillFoldAway(Hi)) {
- for (unsigned I = 0; I != 2; ++I) {
- SDValue Op = N0.getOperand(I);
- if (ConstantFPSDNode *CFP = dyn_cast<ConstantFPSDNode>(Op)) {
- NewElts[I] =
- getCanonicalConstantFP(DAG, SL, EltVT, CFP->getValueAPF());
- } else if (Op.isUndef()) {
- // Handled below based on what the other operand is.
- NewElts[I] = Op;
- } else {
- NewElts[I] = DAG.getNode(ISD::FCANONICALIZE, SL, EltVT, Op);
- }
- }
-
- // If one half is undef, and one is constant, prefer a splat vector rather
- // than the normal qNaN. If it's a register, prefer 0.0 since that's
- // cheaper to use and may be free with a packed operation.
- if (NewElts[0].isUndef()) {
- if (isa<ConstantFPSDNode>(NewElts[1]))
- NewElts[0] = isa<ConstantFPSDNode>(NewElts[1])
- ? NewElts[1]
- : DAG.getConstantFP(0.0f, SL, EltVT);
- }
-
- if (NewElts[1].isUndef()) {
- NewElts[1] = isa<ConstantFPSDNode>(NewElts[0])
- ? NewElts[0]
- : DAG.getConstantFP(0.0f, SL, EltVT);
- }
-
- return DAG.getBuildVector(VT, SL, NewElts);
- }
- }
-
- return SDValue();
-}
-
-static unsigned minMaxOpcToMin3Max3Opc(unsigned Opc) {
- switch (Opc) {
- case ISD::FMAXNUM:
- case ISD::FMAXNUM_IEEE:
- case ISD::FMAXIMUMNUM:
- return AMDGPUISD::FMAX3;
- case ISD::FMAXIMUM:
- return AMDGPUISD::FMAXIMUM3;
- case ISD::SMAX:
- return AMDGPUISD::SMAX3;
- case ISD::UMAX:
- return AMDGPUISD::UMAX3;
- case ISD::FMINNUM:
- case ISD::FMINNUM_IEEE:
- case ISD::FMINIMUMNUM:
- return AMDGPUISD::FMIN3;
- case ISD::FMINIMUM:
- return AMDGPUISD::FMINIMUM3;
- case ISD::SMIN:
- return AMDGPUISD::SMIN3;
- case ISD::UMIN:
- return AMDGPUISD::UMIN3;
- default:
- llvm_unreachable("Not a min/max opcode");
- }
-}
-
-SDValue SITargetLowering::performIntMed3ImmCombine(SelectionDAG &DAG,
- const SDLoc &SL, SDValue Src,
- SDValue MinVal,
- SDValue MaxVal,
- bool Signed) const {
-
- // med3 comes from
- // min(max(x, K0), K1), K0 < K1
- // max(min(x, K0), K1), K1 < K0
- //
- // "MinVal" and "MaxVal" respectively refer to the rhs of the
- // min/max op.
- ConstantSDNode *MinK = dyn_cast<ConstantSDNode>(MinVal);
- ConstantSDNode *MaxK = dyn_cast<ConstantSDNode>(MaxVal);
-
- if (!MinK || !MaxK)
- return SDValue();
-
- if (Signed) {
- if (MaxK->getAPIntValue().sge(MinK->getAPIntValue()))
- return SDValue();
- } else {
- if (MaxK->getAPIntValue().uge(MinK->getAPIntValue()))
- return SDValue();
- }
-
- EVT VT = MinK->getValueType(0);
- unsigned Med3Opc = Signed ? AMDGPUISD::SMED3 : AMDGPUISD::UMED3;
- if (VT == MVT::i32 || (VT == MVT::i16 && Subtarget->hasMed3_16()))
- return DAG.getNode(Med3Opc, SL, VT, Src, MaxVal, MinVal);
-
- // Note: we could also extend to i32 and use i32 med3 if i16 med3 is
- // not available, but this is unlikely to be profitable as constants
- // will often need to be materialized & extended, especially on
- // pre-GFX10 where VOP3 instructions couldn't take literal operands.
- return SDValue();
-}
-
-static ConstantFPSDNode *getSplatConstantFP(SDValue Op) {
- if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Op))
- return C;
-
- if (BuildVectorSDNode *BV = dyn_cast<BuildVectorSDNode>(Op)) {
- if (ConstantFPSDNode *C = BV->getConstantFPSplatNode())
- return C;
- }
-
- return nullptr;
-}
-
-SDValue SITargetLowering::performFPMed3ImmCombine(SelectionDAG &DAG,
- const SDLoc &SL, SDValue Op0,
- SDValue Op1) const {
- ConstantFPSDNode *K1 = getSplatConstantFP(Op1);
- if (!K1)
- return SDValue();
-
- ConstantFPSDNode *K0 = getSplatConstantFP(Op0.getOperand(1));
- if (!K0)
- return SDValue();
-
- // Ordered >= (although NaN inputs should have folded away by now).
- if (K0->getValueAPF() > K1->getValueAPF())
- return SDValue();
-
- // med3 with a nan input acts like
- // v_min_f32(v_min_f32(S0.f32, S1.f32), S2.f32)
- //
- // So the result depends on whether the IEEE mode bit is enabled or not with a
- // signaling nan input.
- // ieee=1
- // s0 snan: yields s2
- // s1 snan: yields s2
- // s2 snan: qnan
-
- // s0 qnan: min(s1, s2)
- // s1 qnan: min(s0, s2)
- // s2 qnan: min(s0, s1)
-
- // ieee=0
- // s0 snan: min(s1, s2)
- // s1 snan: min(s0, s2)
- // s2 snan: qnan
-
- // s0 qnan: min(s1, s2)
- // s1 qnan: min(s0, s2)
- // s2 qnan: min(s0, s1)
- const MachineFunction &MF = DAG.getMachineFunction();
- const SIMachineFunctionInfo *Info = MF.getInfo<SIMachineFunctionInfo>();
-
- // TODO: Check IEEE bit enabled. We can form fmed3 with IEEE=0 regardless of
- // whether the input is a signaling nan if op0 is fmaximum or fmaximumnum. We
- // can only form it if op0 is fmaxnum_ieee and IEEE=1.
- EVT VT = Op0.getValueType();
- if (Info->getMode().DX10Clamp) {
- // If dx10_clamp is enabled, NaNs clamp to 0.0. This is the same as the
- // hardware fmed3 behavior converting to a min.
- // FIXME: Should this be allowing -0.0?
- if (K1->isExactlyValue(1.0) && K0->isExactlyValue(0.0))
- return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, Op0.getOperand(0));
- }
-
- // med3 for f16 is only available on gfx9+, and not available for v2f16.
- if (VT == MVT::f32 || (VT == MVT::f16 && Subtarget->hasMed3_16())) {
- // This isn't safe with signaling NaNs because in IEEE mode, min/max on a
- // signaling NaN gives a quiet NaN. The quiet NaN input to the min would
- // then give the other result, which is different from med3 with a NaN
- // input.
- SDValue Var = Op0.getOperand(0);
- if (!DAG.isKnownNeverSNaN(Var))
- return SDValue();
-
- const SIInstrInfo *TII = getSubtarget()->getInstrInfo();
-
- if ((!K0->hasOneUse() || TII->isInlineConstant(K0->getValueAPF())) &&
- (!K1->hasOneUse() || TII->isInlineConstant(K1->getValueAPF()))) {
- return DAG.getNode(AMDGPUISD::FMED3, SL, K0->getValueType(0), Var,
- SDValue(K0, 0), SDValue(K1, 0));
- }
- }
-
- return SDValue();
-}
-
-/// \return true if the subtarget supports minimum3 and maximum3 with the given
-/// base min/max opcode \p Opc for type \p VT.
-static bool supportsMin3Max3(const GCNSubtarget &Subtarget, unsigned Opc,
- EVT VT) {
- switch (Opc) {
- case ISD::FMINNUM:
- case ISD::FMAXNUM:
- case ISD::FMINNUM_IEEE:
- case ISD::FMAXNUM_IEEE:
- case ISD::FMINIMUMNUM:
- case ISD::FMAXIMUMNUM:
- case AMDGPUISD::FMIN_LEGACY:
- case AMDGPUISD::FMAX_LEGACY:
- return (VT == MVT::f32) || (VT == MVT::f16 && Subtarget.hasMin3Max3_16());
- case ISD::FMINIMUM:
- case ISD::FMAXIMUM:
- return (VT == MVT::f32 && Subtarget.hasMinimum3Maximum3F32()) ||
- (VT == MVT::f16 && Subtarget.hasMinimum3Maximum3F16()) ||
- (VT == MVT::v2f16 && Subtarget.hasMinimum3Maximum3PKF16());
- case ISD::SMAX:
- case ISD::SMIN:
- case ISD::UMAX:
- case ISD::UMIN:
- return (VT == MVT::i32) || (VT == MVT::i16 && Subtarget.hasMin3Max3_16());
- default:
- return false;
- }
-
- llvm_unreachable("not a min/max opcode");
-}
-
-SDValue SITargetLowering::performMinMaxCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
-
- EVT VT = N->getValueType(0);
- unsigned Opc = N->getOpcode();
- SDValue Op0 = N->getOperand(0);
- SDValue Op1 = N->getOperand(1);
-
- // Only do this if the inner op has one use since this will just increase
- // register pressure for no benefit.
-
- if (supportsMin3Max3(*Subtarget, Opc, VT)) {
- // max(max(a, b), c) -> max3(a, b, c)
- // min(min(a, b), c) -> min3(a, b, c)
- if (Op0.getOpcode() == Opc && Op0.hasOneUse()) {
- SDLoc DL(N);
- return DAG.getNode(minMaxOpcToMin3Max3Opc(Opc), DL, N->getValueType(0),
- Op0.getOperand(0), Op0.getOperand(1), Op1);
- }
-
- // Try commuted.
- // max(a, max(b, c)) -> max3(a, b, c)
- // min(a, min(b, c)) -> min3(a, b, c)
- if (Op1.getOpcode() == Opc && Op1.hasOneUse()) {
- SDLoc DL(N);
- return DAG.getNode(minMaxOpcToMin3Max3Opc(Opc), DL, N->getValueType(0),
- Op0, Op1.getOperand(0), Op1.getOperand(1));
- }
- }
-
- // min(max(x, K0), K1), K0 < K1 -> med3(x, K0, K1)
- // max(min(x, K0), K1), K1 < K0 -> med3(x, K1, K0)
- if (Opc == ISD::SMIN && Op0.getOpcode() == ISD::SMAX && Op0.hasOneUse()) {
- if (SDValue Med3 = performIntMed3ImmCombine(
- DAG, SDLoc(N), Op0->getOperand(0), Op1, Op0->getOperand(1), true))
- return Med3;
- }
- if (Opc == ISD::SMAX && Op0.getOpcode() == ISD::SMIN && Op0.hasOneUse()) {
- if (SDValue Med3 = performIntMed3ImmCombine(
- DAG, SDLoc(N), Op0->getOperand(0), Op0->getOperand(1), Op1, true))
- return Med3;
- }
-
- if (Opc == ISD::UMIN && Op0.getOpcode() == ISD::UMAX && Op0.hasOneUse()) {
- if (SDValue Med3 = performIntMed3ImmCombine(
- DAG, SDLoc(N), Op0->getOperand(0), Op1, Op0->getOperand(1), false))
- return Med3;
- }
- if (Opc == ISD::UMAX && Op0.getOpcode() == ISD::UMIN && Op0.hasOneUse()) {
- if (SDValue Med3 = performIntMed3ImmCombine(
- DAG, SDLoc(N), Op0->getOperand(0), Op0->getOperand(1), Op1, false))
- return Med3;
- }
-
- // if !is_snan(x):
- // fminnum(fmaxnum(x, K0), K1), K0 < K1 -> fmed3(x, K0, K1)
- // fminnum_ieee(fmaxnum_ieee(x, K0), K1), K0 < K1 -> fmed3(x, K0, K1)
- // fminnumnum(fmaxnumnum(x, K0), K1), K0 < K1 -> fmed3(x, K0, K1)
- // fmin_legacy(fmax_legacy(x, K0), K1), K0 < K1 -> fmed3(x, K0, K1)
- if (((Opc == ISD::FMINNUM && Op0.getOpcode() == ISD::FMAXNUM) ||
- (Opc == ISD::FMINNUM_IEEE && Op0.getOpcode() == ISD::FMAXNUM_IEEE) ||
- (Opc == ISD::FMINIMUMNUM && Op0.getOpcode() == ISD::FMAXIMUMNUM) ||
- (Opc == AMDGPUISD::FMIN_LEGACY &&
- Op0.getOpcode() == AMDGPUISD::FMAX_LEGACY)) &&
- (VT == MVT::f32 || VT == MVT::f64 ||
- (VT == MVT::f16 && Subtarget->has16BitInsts()) ||
- (VT == MVT::v2f16 && Subtarget->hasVOP3PInsts())) &&
- Op0.hasOneUse()) {
- if (SDValue Res = performFPMed3ImmCombine(DAG, SDLoc(N), Op0, Op1))
- return Res;
- }
-
- return SDValue();
-}
-
-static bool isClampZeroToOne(SDValue A, SDValue B) {
- if (ConstantFPSDNode *CA = dyn_cast<ConstantFPSDNode>(A)) {
- if (ConstantFPSDNode *CB = dyn_cast<ConstantFPSDNode>(B)) {
- // FIXME: Should this be allowing -0.0?
- return (CA->isExactlyValue(0.0) && CB->isExactlyValue(1.0)) ||
- (CA->isExactlyValue(1.0) && CB->isExactlyValue(0.0));
- }
- }
-
- return false;
-}
-
-// FIXME: Should only worry about snans for version with chain.
-SDValue SITargetLowering::performFMed3Combine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- EVT VT = N->getValueType(0);
- // v_med3_f32 and v_max_f32 behave identically wrt denorms, exceptions and
- // NaNs. With a NaN input, the order of the operands may change the result.
-
- SelectionDAG &DAG = DCI.DAG;
- SDLoc SL(N);
-
- SDValue Src0 = N->getOperand(0);
- SDValue Src1 = N->getOperand(1);
- SDValue Src2 = N->getOperand(2);
-
- if (isClampZeroToOne(Src0, Src1)) {
- // const_a, const_b, x -> clamp is safe in all cases including signaling
- // nans.
- // FIXME: Should this be allowing -0.0?
- return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, Src2);
- }
-
- const MachineFunction &MF = DAG.getMachineFunction();
- const SIMachineFunctionInfo *Info = MF.getInfo<SIMachineFunctionInfo>();
-
- // FIXME: dx10_clamp behavior assumed in instcombine. Should we really bother
- // handling no dx10-clamp?
- if (Info->getMode().DX10Clamp) {
- // If NaNs are clamped to 0, we are free to reorder the inputs.
-
- if (isa<ConstantFPSDNode>(Src0) && !isa<ConstantFPSDNode>(Src1))
- std::swap(Src0, Src1);
-
- if (isa<ConstantFPSDNode>(Src1) && !isa<ConstantFPSDNode>(Src2))
- std::swap(Src1, Src2);
-
- if (isa<ConstantFPSDNode>(Src0) && !isa<ConstantFPSDNode>(Src1))
- std::swap(Src0, Src1);
-
- if (isClampZeroToOne(Src1, Src2))
- return DAG.getNode(AMDGPUISD::CLAMP, SL, VT, Src0);
- }
-
- return SDValue();
-}
-
-SDValue SITargetLowering::performCvtPkRTZCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SDValue Src0 = N->getOperand(0);
- SDValue Src1 = N->getOperand(1);
- if (Src0.isUndef() && Src1.isUndef())
- return DCI.DAG.getUNDEF(N->getValueType(0));
- return SDValue();
-}
-
-// Check if EXTRACT_VECTOR_ELT/INSERT_VECTOR_ELT (<n x e>, var-idx) should be
-// expanded into a set of cmp/select instructions.
-bool SITargetLowering::shouldExpandVectorDynExt(unsigned EltSize,
- unsigned NumElem,
- bool IsDivergentIdx,
- const GCNSubtarget *Subtarget) {
- if (UseDivergentRegisterIndexing)
- return false;
-
- unsigned VecSize = EltSize * NumElem;
-
- // Sub-dword vectors of size 2 dword or less have better implementation.
- if (VecSize <= 64 && EltSize < 32)
- return false;
-
- // Always expand the rest of the sub-dword instructions, otherwise they will
- // be lowered via memory.
- if (EltSize < 32)
- return true;
-
- // Always do this if var-idx is divergent, otherwise it will become a loop.
- if (IsDivergentIdx)
- return true;
-
- // Large vectors would yield too many compares and v_cndmask_b32 instructions.
- unsigned NumInsts = NumElem /* Number of compares */ +
- ((EltSize + 31) / 32) * NumElem /* Number of cndmasks */;
-
- // On some architectures (GFX9) movrel is not available and it's better
- // to expand.
- if (Subtarget->useVGPRIndexMode())
- return NumInsts <= 16;
-
- // If movrel is available, use it instead of expanding for vector of 8
- // elements.
- if (Subtarget->hasMovrel())
- return NumInsts <= 15;
-
- return true;
-}
-
-bool SITargetLowering::shouldExpandVectorDynExt(SDNode *N) const {
- SDValue Idx = N->getOperand(N->getNumOperands() - 1);
- if (isa<ConstantSDNode>(Idx))
- return false;
-
- SDValue Vec = N->getOperand(0);
- EVT VecVT = Vec.getValueType();
- EVT EltVT = VecVT.getVectorElementType();
- unsigned EltSize = EltVT.getSizeInBits();
- unsigned NumElem = VecVT.getVectorNumElements();
-
- return SITargetLowering::shouldExpandVectorDynExt(
- EltSize, NumElem, Idx->isDivergent(), getSubtarget());
-}
-
-SDValue
-SITargetLowering::performExtractVectorEltCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SDValue Vec = N->getOperand(0);
- SelectionDAG &DAG = DCI.DAG;
-
- EVT VecVT = Vec.getValueType();
- EVT VecEltVT = VecVT.getVectorElementType();
- EVT ResVT = N->getValueType(0);
-
- unsigned VecSize = VecVT.getSizeInBits();
- unsigned VecEltSize = VecEltVT.getSizeInBits();
-
- if ((Vec.getOpcode() == ISD::FNEG || Vec.getOpcode() == ISD::FABS) &&
- allUsesHaveSourceMods(N)) {
- SDLoc SL(N);
- SDValue Idx = N->getOperand(1);
- SDValue Elt =
- DAG.getNode(ISD::EXTRACT_VECTOR_ELT, SL, ResVT, Vec.getOperand(0), Idx);
- return DAG.getNode(Vec.getOpcode(), SL, ResVT, Elt);
- }
-
- // ScalarRes = EXTRACT_VECTOR_ELT ((vector-BINOP Vec1, Vec2), Idx)
- // =>
- // Vec1Elt = EXTRACT_VECTOR_ELT(Vec1, Idx)
- // Vec2Elt = EXTRACT_VECTOR_ELT(Vec2, Idx)
- // ScalarRes = scalar-BINOP Vec1Elt, Vec2Elt
- if (Vec.hasOneUse() && DCI.isBeforeLegalize() && VecEltVT == ResVT) {
- SDLoc SL(N);
- SDValue Idx = N->getOperand(1);
- unsigned Opc = Vec.getOpcode();
-
- switch (Opc) {
- default:
- break;
- // TODO: Support other binary operations.
- case ISD::FADD:
- case ISD::FSUB:
- case ISD::FMUL:
- case ISD::ADD:
- case ISD::UMIN:
- case ISD::UMAX:
- case ISD::SMIN:
- case ISD::SMAX:
- case ISD::FMAXNUM:
- case ISD::FMINNUM:
- case ISD::FMAXNUM_IEEE:
- case ISD::FMINNUM_IEEE:
- case ISD::FMAXIMUM:
- case ISD::FMINIMUM: {
- SDValue Elt0 = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, SL, ResVT,
- Vec.getOperand(0), Idx);
- SDValue Elt1 = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, SL, ResVT,
- Vec.getOperand(1), Idx);
-
- DCI.AddToWorklist(Elt0.getNode());
- DCI.AddToWorklist(Elt1.getNode());
- return DAG.getNode(Opc, SL, ResVT, Elt0, Elt1, Vec->getFlags());
- }
- }
- }
-
- // EXTRACT_VECTOR_ELT (<n x e>, var-idx) => n x select (e, const-idx)
- if (shouldExpandVectorDynExt(N)) {
- SDLoc SL(N);
- SDValue Idx = N->getOperand(1);
- SDValue V;
- for (unsigned I = 0, E = VecVT.getVectorNumElements(); I < E; ++I) {
- SDValue IC = DAG.getVectorIdxConstant(I, SL);
- SDValue Elt = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, SL, ResVT, Vec, IC);
- if (I == 0)
- V = Elt;
- else
- V = DAG.getSelectCC(SL, Idx, IC, Elt, V, ISD::SETEQ);
- }
- return V;
- }
-
- if (!DCI.isBeforeLegalize())
- return SDValue();
-
- // Try to turn sub-dword accesses of vectors into accesses of the same 32-bit
- // elements. This exposes more load reduction opportunities by replacing
- // multiple small extract_vector_elements with a single 32-bit extract.
- auto *Idx = dyn_cast<ConstantSDNode>(N->getOperand(1));
- if (isa<MemSDNode>(Vec) && VecEltSize <= 16 && VecEltVT.isByteSized() &&
- VecSize > 32 && VecSize % 32 == 0 && Idx) {
- EVT NewVT = getEquivalentMemType(*DAG.getContext(), VecVT);
-
- unsigned BitIndex = Idx->getZExtValue() * VecEltSize;
- unsigned EltIdx = BitIndex / 32;
- unsigned LeftoverBitIdx = BitIndex % 32;
- SDLoc SL(N);
-
- SDValue Cast = DAG.getNode(ISD::BITCAST, SL, NewVT, Vec);
- DCI.AddToWorklist(Cast.getNode());
-
- SDValue Elt = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, SL, MVT::i32, Cast,
- DAG.getConstant(EltIdx, SL, MVT::i32));
- DCI.AddToWorklist(Elt.getNode());
- SDValue Srl = DAG.getNode(ISD::SRL, SL, MVT::i32, Elt,
- DAG.getConstant(LeftoverBitIdx, SL, MVT::i32));
- DCI.AddToWorklist(Srl.getNode());
-
- EVT VecEltAsIntVT = VecEltVT.changeTypeToInteger();
- SDValue Trunc = DAG.getNode(ISD::TRUNCATE, SL, VecEltAsIntVT, Srl);
- DCI.AddToWorklist(Trunc.getNode());
-
- if (VecEltVT == ResVT) {
- return DAG.getNode(ISD::BITCAST, SL, VecEltVT, Trunc);
- }
-
- assert(ResVT.isScalarInteger());
- return DAG.getAnyExtOrTrunc(Trunc, SL, ResVT);
- }
-
- return SDValue();
-}
-
-SDValue
-SITargetLowering::performInsertVectorEltCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SDValue Vec = N->getOperand(0);
- SDValue Idx = N->getOperand(2);
- EVT VecVT = Vec.getValueType();
- EVT EltVT = VecVT.getVectorElementType();
-
- // INSERT_VECTOR_ELT (<n x e>, var-idx)
- // => BUILD_VECTOR n x select (e, const-idx)
- if (!shouldExpandVectorDynExt(N))
- return SDValue();
-
- SelectionDAG &DAG = DCI.DAG;
- SDLoc SL(N);
- SDValue Ins = N->getOperand(1);
- EVT IdxVT = Idx.getValueType();
-
- SmallVector<SDValue, 16> Ops;
- for (unsigned I = 0, E = VecVT.getVectorNumElements(); I < E; ++I) {
- SDValue IC = DAG.getConstant(I, SL, IdxVT);
- SDValue Elt = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, SL, EltVT, Vec, IC);
- SDValue V = DAG.getSelectCC(SL, Idx, IC, Ins, Elt, ISD::SETEQ);
- Ops.push_back(V);
- }
-
- return DAG.getBuildVector(VecVT, SL, Ops);
-}
-
-/// Return the source of an fp_extend from f16 to f32, or a converted FP
-/// constant.
-static SDValue strictFPExtFromF16(SelectionDAG &DAG, SDValue Src) {
- if (Src.getOpcode() == ISD::FP_EXTEND &&
- Src.getOperand(0).getValueType() == MVT::f16) {
- return Src.getOperand(0);
- }
-
- if (auto *CFP = dyn_cast<ConstantFPSDNode>(Src)) {
- APFloat Val = CFP->getValueAPF();
- bool LosesInfo = true;
- Val.convert(APFloat::IEEEhalf(), APFloat::rmNearestTiesToEven, &LosesInfo);
- if (!LosesInfo)
- return DAG.getConstantFP(Val, SDLoc(Src), MVT::f16);
- }
-
- return SDValue();
-}
-
-SDValue SITargetLowering::performFPRoundCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- assert(Subtarget->has16BitInsts() && !Subtarget->hasMed3_16() &&
- "combine only useful on gfx8");
-
- SDValue TruncSrc = N->getOperand(0);
- EVT VT = N->getValueType(0);
- if (VT != MVT::f16)
- return SDValue();
-
- if (TruncSrc.getOpcode() != AMDGPUISD::FMED3 ||
- TruncSrc.getValueType() != MVT::f32 || !TruncSrc.hasOneUse())
- return SDValue();
-
- SelectionDAG &DAG = DCI.DAG;
- SDLoc SL(N);
-
- // Optimize f16 fmed3 pattern performed on f32. On gfx8 there is no f16 fmed3,
- // and expanding it with min/max saves 1 instruction vs. casting to f32 and
- // casting back.
-
- // fptrunc (f32 (fmed3 (fpext f16:a, fpext f16:b, fpext f16:c))) =>
- // fmin(fmax(a, b), fmax(fmin(a, b), c))
- SDValue A = strictFPExtFromF16(DAG, TruncSrc.getOperand(0));
- if (!A)
- return SDValue();
-
- SDValue B = strictFPExtFromF16(DAG, TruncSrc.getOperand(1));
- if (!B)
- return SDValue();
-
- SDValue C = strictFPExtFromF16(DAG, TruncSrc.getOperand(2));
- if (!C)
- return SDValue();
-
- // This changes signaling nan behavior. If an input is a signaling nan, it
- // would have been quieted by the fpext originally. We don't care because
- // these are unconstrained ops. If we needed to insert quieting canonicalizes
- // we would be worse off than just doing the promotion.
- SDValue A1 = DAG.getNode(ISD::FMINNUM_IEEE, SL, VT, A, B);
- SDValue B1 = DAG.getNode(ISD::FMAXNUM_IEEE, SL, VT, A, B);
- SDValue C1 = DAG.getNode(ISD::FMAXNUM_IEEE, SL, VT, A1, C);
- return DAG.getNode(ISD::FMINNUM_IEEE, SL, VT, B1, C1);
-}
-
-unsigned SITargetLowering::getFusedOpcode(const SelectionDAG &DAG,
- const SDNode *N0,
- const SDNode *N1) const {
- EVT VT = N0->getValueType(0);
-
- // Only do this if we are not trying to support denormals. v_mad_f32 does not
- // support denormals ever.
- if (((VT == MVT::f32 &&
- denormalModeIsFlushAllF32(DAG.getMachineFunction())) ||
- (VT == MVT::f16 && Subtarget->hasMadF16() &&
- denormalModeIsFlushAllF64F16(DAG.getMachineFunction()))) &&
- isOperationLegal(ISD::FMAD, VT))
- return ISD::FMAD;
-
- const TargetOptions &Options = DAG.getTarget().Options;
- if ((Options.AllowFPOpFusion == FPOpFusion::Fast || Options.UnsafeFPMath ||
- (N0->getFlags().hasAllowContract() &&
- N1->getFlags().hasAllowContract())) &&
- isFMAFasterThanFMulAndFAdd(DAG.getMachineFunction(), VT)) {
- return ISD::FMA;
- }
-
- return 0;
-}
-
-// For a reassociatable opcode perform:
-// op x, (op y, z) -> op (op x, z), y, if x and z are uniform
-SDValue SITargetLowering::reassociateScalarOps(SDNode *N,
- SelectionDAG &DAG) const {
- EVT VT = N->getValueType(0);
- if (VT != MVT::i32 && VT != MVT::i64)
- return SDValue();
-
- if (DAG.isBaseWithConstantOffset(SDValue(N, 0)))
- return SDValue();
-
- unsigned Opc = N->getOpcode();
- SDValue Op0 = N->getOperand(0);
- SDValue Op1 = N->getOperand(1);
-
- if (!(Op0->isDivergent() ^ Op1->isDivergent()))
- return SDValue();
-
- if (Op0->isDivergent())
- std::swap(Op0, Op1);
-
- if (Op1.getOpcode() != Opc || !Op1.hasOneUse())
- return SDValue();
-
- SDValue Op2 = Op1.getOperand(1);
- Op1 = Op1.getOperand(0);
- if (!(Op1->isDivergent() ^ Op2->isDivergent()))
- return SDValue();
-
- if (Op1->isDivergent())
- std::swap(Op1, Op2);
-
- SDLoc SL(N);
- SDValue Add1 = DAG.getNode(Opc, SL, VT, Op0, Op1);
- return DAG.getNode(Opc, SL, VT, Add1, Op2);
-}
-
-static SDValue getMad64_32(SelectionDAG &DAG, const SDLoc &SL, EVT VT,
- SDValue N0, SDValue N1, SDValue N2, bool Signed) {
- unsigned MadOpc = Signed ? AMDGPUISD::MAD_I64_I32 : AMDGPUISD::MAD_U64_U32;
- SDVTList VTs = DAG.getVTList(MVT::i64, MVT::i1);
- SDValue Mad = DAG.getNode(MadOpc, SL, VTs, N0, N1, N2);
- return DAG.getNode(ISD::TRUNCATE, SL, VT, Mad);
-}
-
-// Fold
-// y = lshr i64 x, 32
-// res = add (mul i64 y, Const), x where "Const" is a 64-bit constant
-// with Const.hi == -1
-// To
- // res = mad_u64_u32 y.lo, Const.lo, x.lo
-static SDValue tryFoldMADwithSRL(SelectionDAG &DAG, const SDLoc &SL,
- SDValue MulLHS, SDValue MulRHS,
- SDValue AddRHS) {
- if (MulRHS.getOpcode() == ISD::SRL)
- std::swap(MulLHS, MulRHS);
-
- if (MulLHS.getValueType() != MVT::i64 || MulLHS.getOpcode() != ISD::SRL)
- return SDValue();
-
- ConstantSDNode *ShiftVal = dyn_cast<ConstantSDNode>(MulLHS.getOperand(1));
- if (!ShiftVal || ShiftVal->getAsZExtVal() != 32 ||
- MulLHS.getOperand(0) != AddRHS)
- return SDValue();
-
- ConstantSDNode *Const = dyn_cast<ConstantSDNode>(MulRHS.getNode());
- if (!Const || Hi_32(Const->getZExtValue()) != uint32_t(-1))
- return SDValue();
-
- SDValue ConstMul =
- DAG.getConstant(Lo_32(Const->getZExtValue()), SL, MVT::i32);
- return getMad64_32(DAG, SL, MVT::i64,
- DAG.getNode(ISD::TRUNCATE, SL, MVT::i32, MulLHS), ConstMul,
- DAG.getZeroExtendInReg(AddRHS, SL, MVT::i32), false);
-}
-
-// Fold (add (mul x, y), z) --> (mad_[iu]64_[iu]32 x, y, z) plus high
-// multiplies, if any.
-//
-// Full 64-bit multiplies that feed into an addition are lowered here instead
-// of using the generic expansion. The generic expansion ends up with
-// a tree of ADD nodes that prevents us from using the "add" part of the
-// MAD instruction. The expansion produced here results in a chain of ADDs
-// instead of a tree.
-SDValue SITargetLowering::tryFoldToMad64_32(SDNode *N,
- DAGCombinerInfo &DCI) const {
- assert(N->getOpcode() == ISD::ADD);
-
- SelectionDAG &DAG = DCI.DAG;
- EVT VT = N->getValueType(0);
- SDLoc SL(N);
- SDValue LHS = N->getOperand(0);
- SDValue RHS = N->getOperand(1);
-
- if (VT.isVector())
- return SDValue();
-
- // S_MUL_HI_[IU]32 was added in gfx9, which allows us to keep the overall
- // result in scalar registers for uniform values.
- if (!N->isDivergent() && Subtarget->hasSMulHi())
- return SDValue();
-
- unsigned NumBits = VT.getScalarSizeInBits();
- if (NumBits <= 32 || NumBits > 64)
- return SDValue();
-
- if (LHS.getOpcode() != ISD::MUL) {
- assert(RHS.getOpcode() == ISD::MUL);
- std::swap(LHS, RHS);
- }
-
- // Avoid the fold if it would unduly increase the number of multiplies due to
- // multiple uses, except on hardware with full-rate multiply-add (which is
- // part of full-rate 64-bit ops).
- if (!Subtarget->hasFullRate64Ops()) {
- unsigned NumUsers = 0;
- for (SDNode *User : LHS->users()) {
- // There is a use that does not feed into addition, so the multiply can't
- // be removed. We prefer MUL + ADD + ADDC over MAD + MUL.
- if (User->getOpcode() != ISD::ADD)
- return SDValue();
-
- // We prefer 2xMAD over MUL + 2xADD + 2xADDC (code density), and prefer
- // MUL + 3xADD + 3xADDC over 3xMAD.
- ++NumUsers;
- if (NumUsers >= 3)
- return SDValue();
- }
- }
-
- SDValue MulLHS = LHS.getOperand(0);
- SDValue MulRHS = LHS.getOperand(1);
- SDValue AddRHS = RHS;
-
- if (SDValue FoldedMAD = tryFoldMADwithSRL(DAG, SL, MulLHS, MulRHS, AddRHS))
- return FoldedMAD;
-
- // Always check whether operands are small unsigned values, since that
- // knowledge is useful in more cases. Check for small signed values only if
- // doing so can unlock a shorter code sequence.
- bool MulLHSUnsigned32 = numBitsUnsigned(MulLHS, DAG) <= 32;
- bool MulRHSUnsigned32 = numBitsUnsigned(MulRHS, DAG) <= 32;
-
- bool MulSignedLo = false;
- if (!MulLHSUnsigned32 || !MulRHSUnsigned32) {
- MulSignedLo =
- numBitsSigned(MulLHS, DAG) <= 32 && numBitsSigned(MulRHS, DAG) <= 32;
- }
-
- // The operands and final result all have the same number of bits. If
- // operands need to be extended, they can be extended with garbage. The
- // resulting garbage in the high bits of the mad_[iu]64_[iu]32 result is
- // truncated away in the end.
- if (VT != MVT::i64) {
- MulLHS = DAG.getNode(ISD::ANY_EXTEND, SL, MVT::i64, MulLHS);
- MulRHS = DAG.getNode(ISD::ANY_EXTEND, SL, MVT::i64, MulRHS);
- AddRHS = DAG.getNode(ISD::ANY_EXTEND, SL, MVT::i64, AddRHS);
- }
-
- // The basic code generated is conceptually straightforward. Pseudo code:
- //
- // accum = mad_64_32 lhs.lo, rhs.lo, accum
- // accum.hi = add (mul lhs.hi, rhs.lo), accum.hi
- // accum.hi = add (mul lhs.lo, rhs.hi), accum.hi
- //
- // The second and third lines are optional, depending on whether the factors
- // are {sign,zero}-extended or not.
- //
- // The actual DAG is noisier than the pseudo code, but only due to
- // instructions that disassemble values into low and high parts, and
- // assemble the final result.
- SDValue One = DAG.getConstant(1, SL, MVT::i32);
-
- auto MulLHSLo = DAG.getNode(ISD::TRUNCATE, SL, MVT::i32, MulLHS);
- auto MulRHSLo = DAG.getNode(ISD::TRUNCATE, SL, MVT::i32, MulRHS);
- SDValue Accum =
- getMad64_32(DAG, SL, MVT::i64, MulLHSLo, MulRHSLo, AddRHS, MulSignedLo);
-
- if (!MulSignedLo && (!MulLHSUnsigned32 || !MulRHSUnsigned32)) {
- auto [AccumLo, AccumHi] = DAG.SplitScalar(Accum, SL, MVT::i32, MVT::i32);
-
- if (!MulLHSUnsigned32) {
- auto MulLHSHi =
- DAG.getNode(ISD::EXTRACT_ELEMENT, SL, MVT::i32, MulLHS, One);
- SDValue MulHi = DAG.getNode(ISD::MUL, SL, MVT::i32, MulLHSHi, MulRHSLo);
- AccumHi = DAG.getNode(ISD::ADD, SL, MVT::i32, MulHi, AccumHi);
- }
-
- if (!MulRHSUnsigned32) {
- auto MulRHSHi =
- DAG.getNode(ISD::EXTRACT_ELEMENT, SL, MVT::i32, MulRHS, One);
- SDValue MulHi = DAG.getNode(ISD::MUL, SL, MVT::i32, MulLHSLo, MulRHSHi);
- AccumHi = DAG.getNode(ISD::ADD, SL, MVT::i32, MulHi, AccumHi);
- }
-
- Accum = DAG.getBuildVector(MVT::v2i32, SL, {AccumLo, AccumHi});
- Accum = DAG.getBitcast(MVT::i64, Accum);
- }
-
- if (VT != MVT::i64)
- Accum = DAG.getNode(ISD::TRUNCATE, SL, VT, Accum);
- return Accum;
-}
-
-SDValue
-SITargetLowering::foldAddSub64WithZeroLowBitsTo32(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SDValue RHS = N->getOperand(1);
- auto *CRHS = dyn_cast<ConstantSDNode>(RHS);
- if (!CRHS)
- return SDValue();
-
- // TODO: Worth using computeKnownBits? Maybe expensive since it's so
- // common.
- uint64_t Val = CRHS->getZExtValue();
- if (countr_zero(Val) >= 32) {
- SelectionDAG &DAG = DCI.DAG;
- SDLoc SL(N);
- SDValue LHS = N->getOperand(0);
-
- // Avoid carry machinery if we know the low half of the add does not
- // contribute to the final result.
- //
- // add i64:x, K if computeTrailingZeros(K) >= 32
- // => build_pair (add x.hi, K.hi), x.lo
-
- // Breaking the 64-bit add here with this strange constant is unlikely
- // to interfere with addressing mode patterns.
-
- SDValue Hi = getHiHalf64(LHS, DAG);
- SDValue ConstHi32 = DAG.getConstant(Hi_32(Val), SL, MVT::i32);
- SDValue AddHi =
- DAG.getNode(N->getOpcode(), SL, MVT::i32, Hi, ConstHi32, N->getFlags());
-
- SDValue Lo = DAG.getNode(ISD::TRUNCATE, SL, MVT::i32, LHS);
- return DAG.getNode(ISD::BUILD_PAIR, SL, MVT::i64, Lo, AddHi);
- }
-
- return SDValue();
-}
-
- // Collect the ultimate src of each of the mul node's operands, and confirm
- // each operand is 8 bits.
-static std::optional<ByteProvider<SDValue>>
-handleMulOperand(const SDValue &MulOperand) {
- auto Byte0 = calculateByteProvider(MulOperand, 0, 0);
- if (!Byte0 || Byte0->isConstantZero()) {
- return std::nullopt;
- }
- auto Byte1 = calculateByteProvider(MulOperand, 1, 0);
- if (Byte1 && !Byte1->isConstantZero()) {
- return std::nullopt;
- }
- return Byte0;
-}
-
-static unsigned addPermMasks(unsigned First, unsigned Second) {
- unsigned FirstCs = First & 0x0c0c0c0c;
- unsigned SecondCs = Second & 0x0c0c0c0c;
- unsigned FirstNoCs = First & ~0x0c0c0c0c;
- unsigned SecondNoCs = Second & ~0x0c0c0c0c;
-
- assert((FirstCs & 0xFF) | (SecondCs & 0xFF));
- assert((FirstCs & 0xFF00) | (SecondCs & 0xFF00));
- assert((FirstCs & 0xFF0000) | (SecondCs & 0xFF0000));
- assert((FirstCs & 0xFF000000) | (SecondCs & 0xFF000000));
-
- return (FirstNoCs | SecondNoCs) | (FirstCs & SecondCs);
-}
-
-struct DotSrc {
- SDValue SrcOp;
- int64_t PermMask;
- int64_t DWordOffset;
-};
-
-static void placeSources(ByteProvider<SDValue> &Src0,
- ByteProvider<SDValue> &Src1,
- SmallVectorImpl<DotSrc> &Src0s,
- SmallVectorImpl<DotSrc> &Src1s, int Step) {
-
- assert(Src0.Src.has_value() && Src1.Src.has_value());
- // Src0s and Src1s are empty, just place arbitrarily.
- if (Step == 0) {
- Src0s.push_back({*Src0.Src, ((Src0.SrcOffset % 4) << 24) + 0x0c0c0c,
- Src0.SrcOffset / 4});
- Src1s.push_back({*Src1.Src, ((Src1.SrcOffset % 4) << 24) + 0x0c0c0c,
- Src1.SrcOffset / 4});
- return;
- }
-
- for (int BPI = 0; BPI < 2; BPI++) {
- std::pair<ByteProvider<SDValue>, ByteProvider<SDValue>> BPP = {Src0, Src1};
- if (BPI == 1) {
- BPP = {Src1, Src0};
- }
- unsigned ZeroMask = 0x0c0c0c0c;
- unsigned FMask = 0xFF << (8 * (3 - Step));
-
- unsigned FirstMask =
- (BPP.first.SrcOffset % 4) << (8 * (3 - Step)) | (ZeroMask & ~FMask);
- unsigned SecondMask =
- (BPP.second.SrcOffset % 4) << (8 * (3 - Step)) | (ZeroMask & ~FMask);
- // Attempt to find Src vector which contains our SDValue, if so, add our
- // perm mask to the existing one. If we are unable to find a match for the
- // first SDValue, attempt to find match for the second.
- int FirstGroup = -1;
- for (int I = 0; I < 2; I++) {
- SmallVectorImpl<DotSrc> &Srcs = I == 0 ? Src0s : Src1s;
- auto MatchesFirst = [&BPP](DotSrc &IterElt) {
- return IterElt.SrcOp == *BPP.first.Src &&
- (IterElt.DWordOffset == (BPP.first.SrcOffset / 4));
- };
-
- auto *Match = llvm::find_if(Srcs, MatchesFirst);
- if (Match != Srcs.end()) {
- Match->PermMask = addPermMasks(FirstMask, Match->PermMask);
- FirstGroup = I;
- break;
- }
- }
- if (FirstGroup != -1) {
- SmallVectorImpl<DotSrc> &Srcs = FirstGroup == 1 ? Src0s : Src1s;
- auto MatchesSecond = [&BPP](DotSrc &IterElt) {
- return IterElt.SrcOp == *BPP.second.Src &&
- (IterElt.DWordOffset == (BPP.second.SrcOffset / 4));
- };
- auto *Match = llvm::find_if(Srcs, MatchesSecond);
- if (Match != Srcs.end()) {
- Match->PermMask = addPermMasks(SecondMask, Match->PermMask);
- } else
- Srcs.push_back({*BPP.second.Src, SecondMask, BPP.second.SrcOffset / 4});
- return;
- }
- }
-
- // If we have made it here, then we could not find a match in Src0s or Src1s
- // for either Src0 or Src1, so just place them arbitrarily.
-
- unsigned ZeroMask = 0x0c0c0c0c;
- unsigned FMask = 0xFF << (8 * (3 - Step));
-
- Src0s.push_back(
- {*Src0.Src,
- ((Src0.SrcOffset % 4) << (8 * (3 - Step)) | (ZeroMask & ~FMask)),
- Src0.SrcOffset / 4});
- Src1s.push_back(
- {*Src1.Src,
- ((Src1.SrcOffset % 4) << (8 * (3 - Step)) | (ZeroMask & ~FMask)),
- Src1.SrcOffset / 4});
-}
-
-static SDValue resolveSources(SelectionDAG &DAG, SDLoc SL,
- SmallVectorImpl<DotSrc> &Srcs, bool IsSigned,
- bool IsAny) {
-
- // If we have just one source, permute it accordingly.
- if (Srcs.size() == 1) {
- auto *Elt = Srcs.begin();
- auto EltOp = getDWordFromOffset(DAG, SL, Elt->SrcOp, Elt->DWordOffset);
-
- // v_perm will produce the original value
- if (Elt->PermMask == 0x3020100)
- return EltOp;
-
- return DAG.getNode(AMDGPUISD::PERM, SL, MVT::i32, EltOp, EltOp,
- DAG.getConstant(Elt->PermMask, SL, MVT::i32));
- }
-
- auto *FirstElt = Srcs.begin();
- auto *SecondElt = std::next(FirstElt);
-
- SmallVector<SDValue, 2> Perms;
-
- // If we have multiple sources in the chain, combine them via perms (using
- // the calculated perm masks) and ORs.
- while (true) {
- auto FirstMask = FirstElt->PermMask;
- auto SecondMask = SecondElt->PermMask;
-
- unsigned FirstCs = FirstMask & 0x0c0c0c0c;
- unsigned FirstPlusFour = FirstMask | 0x04040404;
- // ORing 0x04 moves selectors in the 0-3 range into the 4-7 range so that
- // they pick bytes of the first PERM operand; masking with 0x0F and
- // re-ORing FirstCs keeps the 0x0C zero selectors intact.
- FirstMask = (FirstPlusFour & 0x0F0F0F0F) | FirstCs;
-
- auto PermMask = addPermMasks(FirstMask, SecondMask);
- auto FirstVal =
- getDWordFromOffset(DAG, SL, FirstElt->SrcOp, FirstElt->DWordOffset);
- auto SecondVal =
- getDWordFromOffset(DAG, SL, SecondElt->SrcOp, SecondElt->DWordOffset);
-
- Perms.push_back(DAG.getNode(AMDGPUISD::PERM, SL, MVT::i32, FirstVal,
- SecondVal,
- DAG.getConstant(PermMask, SL, MVT::i32)));
-
- FirstElt = std::next(SecondElt);
- if (FirstElt == Srcs.end())
- break;
-
- SecondElt = std::next(FirstElt);
- // If we only have a FirstElt, then just combine that into the cumulative
- // source node.
- if (SecondElt == Srcs.end()) {
- auto EltOp =
- getDWordFromOffset(DAG, SL, FirstElt->SrcOp, FirstElt->DWordOffset);
-
- Perms.push_back(
- DAG.getNode(AMDGPUISD::PERM, SL, MVT::i32, EltOp, EltOp,
- DAG.getConstant(FirstElt->PermMask, SL, MVT::i32)));
- break;
- }
- }
-
- assert(Perms.size() == 1 || Perms.size() == 2);
- return Perms.size() == 2
- ? DAG.getNode(ISD::OR, SL, MVT::i32, Perms[0], Perms[1])
- : Perms[0];
-}
-
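-// Rescale masks that were built for a chain of length 4 to the actual chain
-// length by shifting the selectors down and zeroing out the now-unused high
-// bytes. Illustrative example: for ChainLength == 2, the mask 0x01020c0c
-// becomes (0x01020c0c >> 16) + 0x0c0c0000 == 0x0c0c0102.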
-static void fixMasks(SmallVectorImpl<DotSrc> &Srcs, unsigned ChainLength) {
- for (auto &[EntryVal, EntryMask, EntryOffset] : Srcs) {
- EntryMask = EntryMask >> ((4 - ChainLength) * 8);
- auto ZeroMask = ChainLength == 2 ? 0x0c0c0000 : 0x0c000000;
- EntryMask += ZeroMask;
- }
-}
-
-static bool isMul(const SDValue Op) {
- auto Opcode = Op.getOpcode();
-
- return (Opcode == ISD::MUL || Opcode == AMDGPUISD::MUL_U24 ||
- Opcode == AMDGPUISD::MUL_I24);
-}
-
-static std::optional<bool>
-checkDot4MulSignedness(const SDValue &N, ByteProvider<SDValue> &Src0,
- ByteProvider<SDValue> &Src1, const SDValue &S0Op,
- const SDValue &S1Op, const SelectionDAG &DAG) {
- // If both ops are i8s (pre legalize-dag), then the signedness semantics
- // of the dot4 are irrelevant.
- if (S0Op.getValueSizeInBits() == 8 && S1Op.getValueSizeInBits() == 8)
- return false;
-
- auto Known0 = DAG.computeKnownBits(S0Op, 0);
- bool S0IsUnsigned = Known0.countMinLeadingZeros() > 0;
- bool S0IsSigned = Known0.countMinLeadingOnes() > 0;
- auto Known1 = DAG.computeKnownBits(S1Op, 0);
- bool S1IsUnsigned = Known1.countMinLeadingZeros() > 0;
- bool S1IsSigned = Known1.countMinLeadingOnes() > 0;
-
- assert(!(S0IsUnsigned && S0IsSigned));
- assert(!(S1IsUnsigned && S1IsSigned));
-
- // There are 9 possible permutations of
- // {S0IsUnsigned, S0IsSigned, S1IsUnsigned, S1IsSigned}
-
- // In two permutations, the sign bits are known to be the same for both Ops,
- // so simply return Signed / Unsigned corresponding to the MSB
-
- if ((S0IsUnsigned && S1IsUnsigned) || (S0IsSigned && S1IsSigned))
- return S0IsSigned;
-
- // In another two permutations, the sign bits are known to be opposite. In
- // this case return std::nullopt to indicate a bad match.
-
- if ((S0IsUnsigned && S1IsSigned) || (S0IsSigned && S1IsUnsigned))
- return std::nullopt;
-
- // In the remaining five permutations, we don't know the value of the sign
- // bit for at least one Op. Since we have a valid ByteProvider, we know that
- // the upper bits must be extension bits. Thus, the only ways for the sign
- // bit to be unknown are if it was sign extended from an unknown value, or
- // if it was any extended. In either case, it is correct to use the signed
- // version of the dot4 signedness semantics.
-
- // In two such permutations, we know the sign bit is set for one op and
- // unknown for the other. It is okay to use the signed version of dot4.
- if ((S0IsSigned && !(S1IsSigned || S1IsUnsigned)) ||
- ((S1IsSigned && !(S0IsSigned || S0IsUnsigned))))
- return true;
-
- // In one such permutation, we don't know either of the sign bits. It is
- // okay to use the signed version of dot4.
- if ((!(S1IsSigned || S1IsUnsigned) && !(S0IsSigned || S0IsUnsigned)))
- return true;
-
- // In two such permutations, we know the sign bit is unset for one op and
- // unknown for the other. Return std::nullopt to indicate a bad match.
- if ((S0IsUnsigned && !(S1IsSigned || S1IsUnsigned)) ||
- ((S1IsUnsigned && !(S0IsSigned || S0IsUnsigned))))
- return std::nullopt;
-
- llvm_unreachable("Fully covered condition");
-}
-
-SDValue SITargetLowering::performAddCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- EVT VT = N->getValueType(0);
- SDLoc SL(N);
- SDValue LHS = N->getOperand(0);
- SDValue RHS = N->getOperand(1);
-
- if (LHS.getOpcode() == ISD::MUL || RHS.getOpcode() == ISD::MUL) {
- if (Subtarget->hasMad64_32()) {
- if (SDValue Folded = tryFoldToMad64_32(N, DCI))
- return Folded;
- }
- }
-
- if (SDValue V = reassociateScalarOps(N, DAG)) {
- return V;
- }
-
- if (VT == MVT::i64) {
- if (SDValue Folded = foldAddSub64WithZeroLowBitsTo32(N, DCI))
- return Folded;
- }
-
- if ((isMul(LHS) || isMul(RHS)) && Subtarget->hasDot7Insts() &&
- (Subtarget->hasDot1Insts() || Subtarget->hasDot8Insts())) {
- SDValue TempNode(N, 0);
- std::optional<bool> IsSigned;
- SmallVector<DotSrc, 4> Src0s;
- SmallVector<DotSrc, 4> Src1s;
- SmallVector<SDValue, 4> Src2s;
-
- // Match the v_dot4 tree, while collecting src nodes.
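- // The expected shape is a chain of adds in which one operand of each add
- // is a mul, e.g. (illustrative):
- // add (mul a0, b0), (add (mul a1, b1), (add (mul a2, b2), acc))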
- int ChainLength = 0;
- for (int I = 0; I < 4; I++) {
- auto MulIdx = isMul(LHS) ? 0 : isMul(RHS) ? 1 : -1;
- if (MulIdx == -1)
- break;
- auto Src0 = handleMulOperand(TempNode->getOperand(MulIdx)->getOperand(0));
- if (!Src0)
- break;
- auto Src1 = handleMulOperand(TempNode->getOperand(MulIdx)->getOperand(1));
- if (!Src1)
- break;
-
- auto IterIsSigned = checkDot4MulSignedness(
- TempNode->getOperand(MulIdx), *Src0, *Src1,
- TempNode->getOperand(MulIdx)->getOperand(0),
- TempNode->getOperand(MulIdx)->getOperand(1), DAG);
- if (!IterIsSigned)
- break;
- if (!IsSigned)
- IsSigned = *IterIsSigned;
- if (*IterIsSigned != *IsSigned)
- break;
- placeSources(*Src0, *Src1, Src0s, Src1s, I);
- auto AddIdx = 1 - MulIdx;
- // Allow the special case where add (add (mul24, 0), mul24) was folded to
- // add (mul24, mul24).
- if (I == 2 && isMul(TempNode->getOperand(AddIdx))) {
- Src2s.push_back(TempNode->getOperand(AddIdx));
- auto Src0 =
- handleMulOperand(TempNode->getOperand(AddIdx)->getOperand(0));
- if (!Src0)
- break;
- auto Src1 =
- handleMulOperand(TempNode->getOperand(AddIdx)->getOperand(1));
- if (!Src1)
- break;
- auto IterIsSigned = checkDot4MulSignedness(
- TempNode->getOperand(AddIdx), *Src0, *Src1,
- TempNode->getOperand(AddIdx)->getOperand(0),
- TempNode->getOperand(AddIdx)->getOperand(1), DAG);
- if (!IterIsSigned)
- break;
- assert(IsSigned);
- if (*IterIsSigned != *IsSigned)
- break;
- placeSources(*Src0, *Src1, Src0s, Src1s, I + 1);
- Src2s.push_back(DAG.getConstant(0, SL, MVT::i32));
- ChainLength = I + 2;
- break;
- }
-
- TempNode = TempNode->getOperand(AddIdx);
- Src2s.push_back(TempNode);
- ChainLength = I + 1;
- if (TempNode->getNumOperands() < 2)
- break;
- LHS = TempNode->getOperand(0);
- RHS = TempNode->getOperand(1);
- }
-
- if (ChainLength < 2)
- return SDValue();
-
- // Masks were constructed with the assumption that we would find a chain
- // of length 4. If not, zero out the most significant bytes (via a perm
- // selector of 0x0c) so they do not affect the dot calculation.
- if (ChainLength < 4) {
- fixMasks(Src0s, ChainLength);
- fixMasks(Src1s, ChainLength);
- }
-
- SDValue Src0, Src1;
-
- // If we are just using a single source for both, and have permuted the
- // bytes consistently, we can just use the sources without permuting
- // (commutation).
- bool UseOriginalSrc = false;
- if (ChainLength == 4 && Src0s.size() == 1 && Src1s.size() == 1 &&
- Src0s.begin()->PermMask == Src1s.begin()->PermMask &&
- Src0s.begin()->SrcOp.getValueSizeInBits() >= 32 &&
- Src1s.begin()->SrcOp.getValueSizeInBits() >= 32) {
- SmallVector<unsigned, 4> SrcBytes;
- auto Src0Mask = Src0s.begin()->PermMask;
- SrcBytes.push_back(Src0Mask & 0xFF000000);
- bool UniqueEntries = true;
- for (auto I = 1; I < 4; I++) {
- auto NextByte = Src0Mask & (0xFF << ((3 - I) * 8));
-
- if (is_contained(SrcBytes, NextByte)) {
- UniqueEntries = false;
- break;
- }
- SrcBytes.push_back(NextByte);
- }
-
- if (UniqueEntries) {
- UseOriginalSrc = true;
-
- auto *FirstElt = Src0s.begin();
- auto FirstEltOp =
- getDWordFromOffset(DAG, SL, FirstElt->SrcOp, FirstElt->DWordOffset);
-
- auto *SecondElt = Src1s.begin();
- auto SecondEltOp = getDWordFromOffset(DAG, SL, SecondElt->SrcOp,
- SecondElt->DWordOffset);
-
- Src0 = DAG.getBitcastedAnyExtOrTrunc(FirstEltOp, SL,
- MVT::getIntegerVT(32));
- Src1 = DAG.getBitcastedAnyExtOrTrunc(SecondEltOp, SL,
- MVT::getIntegerVT(32));
- }
- }
-
- if (!UseOriginalSrc) {
- Src0 = resolveSources(DAG, SL, Src0s, false, true);
- Src1 = resolveSources(DAG, SL, Src1s, false, true);
- }
-
- assert(IsSigned);
- SDValue Src2 =
- DAG.getExtOrTrunc(*IsSigned, Src2s[ChainLength - 1], SL, MVT::i32);
-
- SDValue IID = DAG.getTargetConstant(*IsSigned ? Intrinsic::amdgcn_sdot4
- : Intrinsic::amdgcn_udot4,
- SL, MVT::i64);
-
- assert(!VT.isVector());
- auto Dot = DAG.getNode(ISD::INTRINSIC_WO_CHAIN, SL, MVT::i32, IID, Src0,
- Src1, Src2, DAG.getTargetConstant(0, SL, MVT::i1));
-
- return DAG.getExtOrTrunc(*IsSigned, Dot, SL, VT);
- }
-
- if (VT != MVT::i32 || !DCI.isAfterLegalizeDAG())
- return SDValue();
-
- // add x, zext (setcc) => uaddo_carry x, 0, setcc
- // add x, sext (setcc) => usubo_carry x, 0, setcc
- unsigned Opc = LHS.getOpcode();
- if (Opc == ISD::ZERO_EXTEND || Opc == ISD::SIGN_EXTEND ||
- Opc == ISD::ANY_EXTEND || Opc == ISD::UADDO_CARRY)
- std::swap(RHS, LHS);
-
- Opc = RHS.getOpcode();
- switch (Opc) {
- default:
- break;
- case ISD::ZERO_EXTEND:
- case ISD::SIGN_EXTEND:
- case ISD::ANY_EXTEND: {
- auto Cond = RHS.getOperand(0);
- // If this won't be a real VOPC output, we would still need to insert an
- // extra instruction anyway.
- if (!isBoolSGPR(Cond))
- break;
- SDVTList VTList = DAG.getVTList(MVT::i32, MVT::i1);
- SDValue Args[] = {LHS, DAG.getConstant(0, SL, MVT::i32), Cond};
- Opc = (Opc == ISD::SIGN_EXTEND) ? ISD::USUBO_CARRY : ISD::UADDO_CARRY;
- return DAG.getNode(Opc, SL, VTList, Args);
- }
- case ISD::UADDO_CARRY: {
- // add x, (uaddo_carry y, 0, cc) => uaddo_carry x, y, cc
- if (!isNullConstant(RHS.getOperand(1)))
- break;
- SDValue Args[] = {LHS, RHS.getOperand(0), RHS.getOperand(2)};
- return DAG.getNode(ISD::UADDO_CARRY, SDLoc(N), RHS->getVTList(), Args);
- }
- }
- return SDValue();
-}
-
-SDValue SITargetLowering::performSubCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- EVT VT = N->getValueType(0);
-
- if (VT == MVT::i64) {
- if (SDValue Folded = foldAddSub64WithZeroLowBitsTo32(N, DCI))
- return Folded;
- }
-
- if (VT != MVT::i32)
- return SDValue();
-
- SDLoc SL(N);
- SDValue LHS = N->getOperand(0);
- SDValue RHS = N->getOperand(1);
-
- // sub x, zext (setcc) => usubo_carry x, 0, setcc
- // sub x, sext (setcc) => uaddo_carry x, 0, setcc
- unsigned Opc = RHS.getOpcode();
- switch (Opc) {
- default:
- break;
- case ISD::ZERO_EXTEND:
- case ISD::SIGN_EXTEND:
- case ISD::ANY_EXTEND: {
- auto Cond = RHS.getOperand(0);
- // If this won't be a real VOPC output, we would still need to insert an
- // extra instruction anyway.
- if (!isBoolSGPR(Cond))
- break;
- SDVTList VTList = DAG.getVTList(MVT::i32, MVT::i1);
- SDValue Args[] = {LHS, DAG.getConstant(0, SL, MVT::i32), Cond};
- Opc = (Opc == ISD::SIGN_EXTEND) ? ISD::UADDO_CARRY : ISD::USUBO_CARRY;
- return DAG.getNode(Opc, SL, VTList, Args);
- }
- }
-
- if (LHS.getOpcode() == ISD::USUBO_CARRY) {
- // sub (usubo_carry x, 0, cc), y => usubo_carry x, y, cc
- if (!isNullConstant(LHS.getOperand(1)))
- return SDValue();
- SDValue Args[] = {LHS.getOperand(0), RHS, LHS.getOperand(2)};
- return DAG.getNode(ISD::USUBO_CARRY, SDLoc(N), LHS->getVTList(), Args);
- }
- return SDValue();
-}
-
-SDValue
-SITargetLowering::performAddCarrySubCarryCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
-
- if (N->getValueType(0) != MVT::i32)
- return SDValue();
-
- if (!isNullConstant(N->getOperand(1)))
- return SDValue();
-
- SelectionDAG &DAG = DCI.DAG;
- SDValue LHS = N->getOperand(0);
-
- // uaddo_carry (add x, y), 0, cc => uaddo_carry x, y, cc
- // usubo_carry (sub x, y), 0, cc => usubo_carry x, y, cc
- unsigned LHSOpc = LHS.getOpcode();
- unsigned Opc = N->getOpcode();
- if ((LHSOpc == ISD::ADD && Opc == ISD::UADDO_CARRY) ||
- (LHSOpc == ISD::SUB && Opc == ISD::USUBO_CARRY)) {
- SDValue Args[] = {LHS.getOperand(0), LHS.getOperand(1), N->getOperand(2)};
- return DAG.getNode(Opc, SDLoc(N), N->getVTList(), Args);
- }
- return SDValue();
-}
-
-SDValue SITargetLowering::performFAddCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- if (DCI.getDAGCombineLevel() < AfterLegalizeDAG)
- return SDValue();
-
- SelectionDAG &DAG = DCI.DAG;
- EVT VT = N->getValueType(0);
-
- SDLoc SL(N);
- SDValue LHS = N->getOperand(0);
- SDValue RHS = N->getOperand(1);
-
- // These should really be instruction patterns, but writing patterns with
- // source modifiers is a pain.
-
- // fadd (fadd (a, a), b) -> mad 2.0, a, b
- if (LHS.getOpcode() == ISD::FADD) {
- SDValue A = LHS.getOperand(0);
- if (A == LHS.getOperand(1)) {
- unsigned FusedOp = getFusedOpcode(DAG, N, LHS.getNode());
- if (FusedOp != 0) {
- const SDValue Two = DAG.getConstantFP(2.0, SL, VT);
- return DAG.getNode(FusedOp, SL, VT, A, Two, RHS);
- }
- }
- }
-
- // fadd (b, fadd (a, a)) -> mad 2.0, a, b
- if (RHS.getOpcode() == ISD::FADD) {
- SDValue A = RHS.getOperand(0);
- if (A == RHS.getOperand(1)) {
- unsigned FusedOp = getFusedOpcode(DAG, N, RHS.getNode());
- if (FusedOp != 0) {
- const SDValue Two = DAG.getConstantFP(2.0, SL, VT);
- return DAG.getNode(FusedOp, SL, VT, A, Two, LHS);
- }
- }
- }
-
- return SDValue();
-}
-
-SDValue SITargetLowering::performFSubCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- if (DCI.getDAGCombineLevel() < AfterLegalizeDAG)
- return SDValue();
-
- SelectionDAG &DAG = DCI.DAG;
- SDLoc SL(N);
- EVT VT = N->getValueType(0);
- assert(!VT.isVector());
-
- // Try to get the fneg to fold into the source modifier. This undoes generic
- // DAG combines and folds them into the mad.
- //
- // Only do this if we are not trying to support denormals. v_mad_f32 does
- // not support denormals ever.
- SDValue LHS = N->getOperand(0);
- SDValue RHS = N->getOperand(1);
- if (LHS.getOpcode() == ISD::FADD) {
- // (fsub (fadd a, a), c) -> mad 2.0, a, (fneg c)
- SDValue A = LHS.getOperand(0);
- if (A == LHS.getOperand(1)) {
- unsigned FusedOp = getFusedOpcode(DAG, N, LHS.getNode());
- if (FusedOp != 0) {
- const SDValue Two = DAG.getConstantFP(2.0, SL, VT);
- SDValue NegRHS = DAG.getNode(ISD::FNEG, SL, VT, RHS);
-
- return DAG.getNode(FusedOp, SL, VT, A, Two, NegRHS);
- }
- }
- }
-
- if (RHS.getOpcode() == ISD::FADD) {
- // (fsub c, (fadd a, a)) -> mad -2.0, a, c
-
- SDValue A = RHS.getOperand(0);
- if (A == RHS.getOperand(1)) {
- unsigned FusedOp = getFusedOpcode(DAG, N, RHS.getNode());
- if (FusedOp != 0) {
- const SDValue NegTwo = DAG.getConstantFP(-2.0, SL, VT);
- return DAG.getNode(FusedOp, SL, VT, A, NegTwo, LHS);
- }
- }
- }
-
- return SDValue();
-}
-
-SDValue SITargetLowering::performFDivCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- SDLoc SL(N);
- EVT VT = N->getValueType(0);
- if (VT != MVT::f16 || !Subtarget->has16BitInsts())
- return SDValue();
-
- SDValue LHS = N->getOperand(0);
- SDValue RHS = N->getOperand(1);
-
- SDNodeFlags Flags = N->getFlags();
- SDNodeFlags RHSFlags = RHS->getFlags();
- if (!Flags.hasAllowContract() || !RHSFlags.hasAllowContract() ||
- !RHS->hasOneUse())
- return SDValue();
-
- if (const ConstantFPSDNode *CLHS = dyn_cast<ConstantFPSDNode>(LHS)) {
- bool IsNegative = false;
- if (CLHS->isExactlyValue(1.0) ||
- (IsNegative = CLHS->isExactlyValue(-1.0))) {
- // fdiv contract 1.0, (sqrt contract x) -> rsq for f16
- // fdiv contract -1.0, (sqrt contract x) -> fneg(rsq) for f16
- if (RHS.getOpcode() == ISD::FSQRT) {
- // TODO: Or in RHS flags, somehow missing from SDNodeFlags
- SDValue Rsq =
- DAG.getNode(AMDGPUISD::RSQ, SL, VT, RHS.getOperand(0), Flags);
- return IsNegative ? DAG.getNode(ISD::FNEG, SL, VT, Rsq, Flags) : Rsq;
- }
- }
- }
-
- return SDValue();
-}
-
-SDValue SITargetLowering::performFMulCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- EVT VT = N->getValueType(0);
- EVT ScalarVT = VT.getScalarType();
- EVT IntVT = VT.changeElementType(MVT::i32);
-
- SDValue LHS = N->getOperand(0);
- SDValue RHS = N->getOperand(1);
-
- // It is cheaper to realize i32 inline constants than to materialize f16
- // or f64 (or even non-inline f32) values; this is possible via ldexp, as
- // shown below:
- //
- // Given : A = 2^a & B = 2^b ; where a and b are integers.
- // fmul x, (select y, A, B) -> ldexp( x, (select i32 y, a, b) )
- // fmul x, (select y, -A, -B) -> ldexp( (fneg x), (select i32 y, a, b) )
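- //
- // Concretely (illustrative): fmul x, (select y, 2.0, 0.5)
- // -> ldexp( x, (select i32 y, 1, -1) ), since 2.0 == 2^1 and 0.5 == 2^-1.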
- if ((ScalarVT == MVT::f64 || ScalarVT == MVT::f32 || ScalarVT == MVT::f16) &&
- (RHS.hasOneUse() && RHS.getOpcode() == ISD::SELECT)) {
- const ConstantFPSDNode *TrueNode = isConstOrConstSplatFP(RHS.getOperand(1));
- if (!TrueNode)
- return SDValue();
- const ConstantFPSDNode *FalseNode =
- isConstOrConstSplatFP(RHS.getOperand(2));
- if (!FalseNode)
- return SDValue();
-
- if (TrueNode->isNegative() != FalseNode->isNegative())
- return SDValue();
-
- // For f32, only non-inline constants should be transformed.
- const SIInstrInfo *TII = getSubtarget()->getInstrInfo();
- if (ScalarVT == MVT::f32 &&
- TII->isInlineConstant(TrueNode->getValueAPF()) &&
- TII->isInlineConstant(FalseNode->getValueAPF()))
- return SDValue();
-
- int TrueNodeExpVal = TrueNode->getValueAPF().getExactLog2Abs();
- if (TrueNodeExpVal == INT_MIN)
- return SDValue();
- int FalseNodeExpVal = FalseNode->getValueAPF().getExactLog2Abs();
- if (FalseNodeExpVal == INT_MIN)
- return SDValue();
-
- SDLoc SL(N);
- SDValue SelectNode =
- DAG.getNode(ISD::SELECT, SL, IntVT, RHS.getOperand(0),
- DAG.getSignedConstant(TrueNodeExpVal, SL, IntVT),
- DAG.getSignedConstant(FalseNodeExpVal, SL, IntVT));
-
- LHS = TrueNode->isNegative()
- ? DAG.getNode(ISD::FNEG, SL, VT, LHS, LHS->getFlags())
- : LHS;
-
- return DAG.getNode(ISD::FLDEXP, SL, VT, LHS, SelectNode, N->getFlags());
- }
-
- return SDValue();
-}
-
-SDValue SITargetLowering::performFMACombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- EVT VT = N->getValueType(0);
- SDLoc SL(N);
-
- if (!Subtarget->hasDot10Insts() || VT != MVT::f32)
- return SDValue();
-
- // FMA((F32)S0.x, (F32)S1.x, FMA((F32)S0.y, (F32)S1.y, (F32)z)) ->
- // FDOT2((V2F16)S0, (V2F16)S1, (F32)z)
- SDValue Op1 = N->getOperand(0);
- SDValue Op2 = N->getOperand(1);
- SDValue FMA = N->getOperand(2);
-
- if (FMA.getOpcode() != ISD::FMA || Op1.getOpcode() != ISD::FP_EXTEND ||
- Op2.getOpcode() != ISD::FP_EXTEND)
- return SDValue();
-
- // fdot2_f32_f16 always flushes fp32 denormal operands and outputs to
- // zero, regardless of the denorm mode setting. Therefore,
- // unsafe-fp-math/fp-contract is sufficient to allow generating fdot2.
- const TargetOptions &Options = DAG.getTarget().Options;
- if (Options.AllowFPOpFusion == FPOpFusion::Fast || Options.UnsafeFPMath ||
- (N->getFlags().hasAllowContract() &&
- FMA->getFlags().hasAllowContract())) {
- Op1 = Op1.getOperand(0);
- Op2 = Op2.getOperand(0);
- if (Op1.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
- Op2.getOpcode() != ISD::EXTRACT_VECTOR_ELT)
- return SDValue();
-
- SDValue Vec1 = Op1.getOperand(0);
- SDValue Idx1 = Op1.getOperand(1);
- SDValue Vec2 = Op2.getOperand(0);
-
- SDValue FMAOp1 = FMA.getOperand(0);
- SDValue FMAOp2 = FMA.getOperand(1);
- SDValue FMAAcc = FMA.getOperand(2);
-
- if (FMAOp1.getOpcode() != ISD::FP_EXTEND ||
- FMAOp2.getOpcode() != ISD::FP_EXTEND)
- return SDValue();
-
- FMAOp1 = FMAOp1.getOperand(0);
- FMAOp2 = FMAOp2.getOperand(0);
- if (FMAOp1.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
- FMAOp2.getOpcode() != ISD::EXTRACT_VECTOR_ELT)
- return SDValue();
-
- SDValue Vec3 = FMAOp1.getOperand(0);
- SDValue Vec4 = FMAOp2.getOperand(0);
- SDValue Idx2 = FMAOp1.getOperand(1);
-
- if (Idx1 != Op2.getOperand(1) || Idx2 != FMAOp2.getOperand(1) ||
- // Idx1 and Idx2 cannot be the same.
- Idx1 == Idx2)
- return SDValue();
-
- if (Vec1 == Vec2 || Vec3 == Vec4)
- return SDValue();
-
- if (Vec1.getValueType() != MVT::v2f16 || Vec2.getValueType() != MVT::v2f16)
- return SDValue();
-
- if ((Vec1 == Vec3 && Vec2 == Vec4) || (Vec1 == Vec4 && Vec2 == Vec3)) {
- return DAG.getNode(AMDGPUISD::FDOT2, SL, MVT::f32, Vec1, Vec2, FMAAcc,
- DAG.getTargetConstant(0, SL, MVT::i1));
- }
- }
- return SDValue();
-}
-
-SDValue SITargetLowering::performSetCCCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- SDLoc SL(N);
-
- SDValue LHS = N->getOperand(0);
- SDValue RHS = N->getOperand(1);
- EVT VT = LHS.getValueType();
- ISD::CondCode CC = cast<CondCodeSDNode>(N->getOperand(2))->get();
-
- auto *CRHS = dyn_cast<ConstantSDNode>(RHS);
- if (!CRHS) {
- CRHS = dyn_cast<ConstantSDNode>(LHS);
- if (CRHS) {
- std::swap(LHS, RHS);
- CC = getSetCCSwappedOperands(CC);
- }
- }
-
- if (CRHS) {
- if (VT == MVT::i32 && LHS.getOpcode() == ISD::SIGN_EXTEND &&
- isBoolSGPR(LHS.getOperand(0))) {
- // setcc (sext from i1 cc), -1, ne|sgt|ult) => not cc => xor cc, -1
- // setcc (sext from i1 cc), -1, eq|sle|uge) => cc
- // setcc (sext from i1 cc), 0, eq|sge|ule) => not cc => xor cc, -1
- // setcc (sext from i1 cc), 0, ne|ugt|slt) => cc
- if ((CRHS->isAllOnes() &&
- (CC == ISD::SETNE || CC == ISD::SETGT || CC == ISD::SETULT)) ||
- (CRHS->isZero() &&
- (CC == ISD::SETEQ || CC == ISD::SETGE || CC == ISD::SETULE)))
- return DAG.getNode(ISD::XOR, SL, MVT::i1, LHS.getOperand(0),
- DAG.getAllOnesConstant(SL, MVT::i1));
- if ((CRHS->isAllOnes() &&
- (CC == ISD::SETEQ || CC == ISD::SETLE || CC == ISD::SETUGE)) ||
- (CRHS->isZero() &&
- (CC == ISD::SETNE || CC == ISD::SETUGT || CC == ISD::SETLT)))
- return LHS.getOperand(0);
- }
-
- const APInt &CRHSVal = CRHS->getAPIntValue();
- if ((CC == ISD::SETEQ || CC == ISD::SETNE) &&
- LHS.getOpcode() == ISD::SELECT &&
- isa<ConstantSDNode>(LHS.getOperand(1)) &&
- isa<ConstantSDNode>(LHS.getOperand(2)) &&
- LHS.getConstantOperandVal(1) != LHS.getConstantOperandVal(2) &&
- isBoolSGPR(LHS.getOperand(0))) {
- // Given CT != FT:
- // setcc (select cc, CT, CF), CF, eq => xor cc, -1
- // setcc (select cc, CT, CF), CF, ne => cc
- // setcc (select cc, CT, CF), CT, ne => xor cc, -1
- // setcc (select cc, CT, CF), CT, eq => cc
- const APInt &CT = LHS.getConstantOperandAPInt(1);
- const APInt &CF = LHS.getConstantOperandAPInt(2);
-
- if ((CF == CRHSVal && CC == ISD::SETEQ) ||
- (CT == CRHSVal && CC == ISD::SETNE))
- return DAG.getNode(ISD::XOR, SL, MVT::i1, LHS.getOperand(0),
- DAG.getAllOnesConstant(SL, MVT::i1));
- if ((CF == CRHSVal && CC == ISD::SETNE) ||
- (CT == CRHSVal && CC == ISD::SETEQ))
- return LHS.getOperand(0);
- }
- }
-
- if (VT != MVT::f32 && VT != MVT::f64 &&
- (!Subtarget->has16BitInsts() || VT != MVT::f16))
- return SDValue();
-
- // Match isinf/isfinite pattern
- // (fcmp oeq (fabs x), inf) -> (fp_class x, (p_infinity | n_infinity))
- // (fcmp one (fabs x), inf) -> (fp_class x,
- // (p_normal | n_normal | p_subnormal | n_subnormal | p_zero | n_zero)
- if ((CC == ISD::SETOEQ || CC == ISD::SETONE) &&
- LHS.getOpcode() == ISD::FABS) {
- const ConstantFPSDNode *CRHS = dyn_cast<ConstantFPSDNode>(RHS);
- if (!CRHS)
- return SDValue();
-
- const APFloat &APF = CRHS->getValueAPF();
- if (APF.isInfinity() && !APF.isNegative()) {
- const unsigned IsInfMask =
- SIInstrFlags::P_INFINITY | SIInstrFlags::N_INFINITY;
- const unsigned IsFiniteMask =
- SIInstrFlags::N_ZERO | SIInstrFlags::P_ZERO | SIInstrFlags::N_NORMAL |
- SIInstrFlags::P_NORMAL | SIInstrFlags::N_SUBNORMAL |
- SIInstrFlags::P_SUBNORMAL;
- unsigned Mask = CC == ISD::SETOEQ ? IsInfMask : IsFiniteMask;
- return DAG.getNode(AMDGPUISD::FP_CLASS, SL, MVT::i1, LHS.getOperand(0),
- DAG.getConstant(Mask, SL, MVT::i32));
- }
- }
-
- return SDValue();
-}
-
-SDValue
-SITargetLowering::performCvtF32UByteNCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- SelectionDAG &DAG = DCI.DAG;
- SDLoc SL(N);
- unsigned Offset = N->getOpcode() - AMDGPUISD::CVT_F32_UBYTE0;
-
- SDValue Src = N->getOperand(0);
- SDValue Shift = N->getOperand(0);
-
- // TODO: Extend type shouldn't matter (assuming legal types).
- if (Shift.getOpcode() == ISD::ZERO_EXTEND)
- Shift = Shift.getOperand(0);
-
- if (Shift.getOpcode() == ISD::SRL || Shift.getOpcode() == ISD::SHL) {
- // cvt_f32_ubyte1 (shl x, 8) -> cvt_f32_ubyte0 x
- // cvt_f32_ubyte3 (shl x, 16) -> cvt_f32_ubyte1 x
- // cvt_f32_ubyte0 (srl x, 16) -> cvt_f32_ubyte2 x
- // cvt_f32_ubyte1 (srl x, 16) -> cvt_f32_ubyte3 x
- // cvt_f32_ubyte0 (srl x, 8) -> cvt_f32_ubyte1 x
- if (auto *C = dyn_cast<ConstantSDNode>(Shift.getOperand(1))) {
- SDValue Shifted = DAG.getZExtOrTrunc(
- Shift.getOperand(0), SDLoc(Shift.getOperand(0)), MVT::i32);
-
- unsigned ShiftOffset = 8 * Offset;
- if (Shift.getOpcode() == ISD::SHL)
- ShiftOffset -= C->getZExtValue();
- else
- ShiftOffset += C->getZExtValue();
-
- if (ShiftOffset < 32 && (ShiftOffset % 8) == 0) {
- return DAG.getNode(AMDGPUISD::CVT_F32_UBYTE0 + ShiftOffset / 8, SL,
- MVT::f32, Shifted);
- }
- }
- }
-
- const TargetLowering &TLI = DAG.getTargetLoweringInfo();
- APInt DemandedBits = APInt::getBitsSet(32, 8 * Offset, 8 * Offset + 8);
- if (TLI.SimplifyDemandedBits(Src, DemandedBits, DCI)) {
- // We simplified Src. If this node is not dead, visit it again so it is
- // folded properly.
- if (N->getOpcode() != ISD::DELETED_NODE)
- DCI.AddToWorklist(N);
- return SDValue(N, 0);
- }
-
- // Handle (or x, (srl y, 8)) pattern when known bits are zero.
- if (SDValue DemandedSrc =
- TLI.SimplifyMultipleUseDemandedBits(Src, DemandedBits, DAG))
- return DAG.getNode(N->getOpcode(), SL, MVT::f32, DemandedSrc);
-
- return SDValue();
-}
-
-SDValue SITargetLowering::performClampCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- ConstantFPSDNode *CSrc = dyn_cast<ConstantFPSDNode>(N->getOperand(0));
- if (!CSrc)
- return SDValue();
-
- const MachineFunction &MF = DCI.DAG.getMachineFunction();
- const APFloat &F = CSrc->getValueAPF();
- APFloat Zero = APFloat::getZero(F.getSemantics());
- if (F < Zero ||
- (F.isNaN() && MF.getInfo<SIMachineFunctionInfo>()->getMode().DX10Clamp)) {
- return DCI.DAG.getConstantFP(Zero, SDLoc(N), N->getValueType(0));
- }
-
- APFloat One(F.getSemantics(), "1.0");
- if (F > One)
- return DCI.DAG.getConstantFP(One, SDLoc(N), N->getValueType(0));
-
- return SDValue(CSrc, 0);
-}
-
-SDValue SITargetLowering::PerformDAGCombine(SDNode *N,
- DAGCombinerInfo &DCI) const {
- switch (N->getOpcode()) {
- case ISD::ADD:
- case ISD::SUB:
- case ISD::SHL:
- case ISD::SRL:
- case ISD::SRA:
- case ISD::AND:
- case ISD::OR:
- case ISD::XOR:
- case ISD::MUL:
- case ISD::SETCC:
- case ISD::SELECT:
- case ISD::SMIN:
- case ISD::SMAX:
- case ISD::UMIN:
- case ISD::UMAX:
- if (auto Res = promoteUniformOpToI32(SDValue(N, 0), DCI))
- return Res;
- break;
- default:
- break;
- }
-
- if (getTargetMachine().getOptLevel() == CodeGenOptLevel::None)
- return SDValue();
-
- switch (N->getOpcode()) {
- case ISD::ADD:
- return performAddCombine(N, DCI);
- case ISD::SUB:
- return performSubCombine(N, DCI);
- case ISD::UADDO_CARRY:
- case ISD::USUBO_CARRY:
- return performAddCarrySubCarryCombine(N, DCI);
- case ISD::FADD:
- return performFAddCombine(N, DCI);
- case ISD::FSUB:
- return performFSubCombine(N, DCI);
- case ISD::FDIV:
- return performFDivCombine(N, DCI);
- case ISD::FMUL:
- return performFMulCombine(N, DCI);
- case ISD::SETCC:
- return performSetCCCombine(N, DCI);
- case ISD::FMAXNUM:
- case ISD::FMINNUM:
- case ISD::FMAXNUM_IEEE:
- case ISD::FMINNUM_IEEE:
- case ISD::FMAXIMUM:
- case ISD::FMINIMUM:
- case ISD::FMAXIMUMNUM:
- case ISD::FMINIMUMNUM:
- case ISD::SMAX:
- case ISD::SMIN:
- case ISD::UMAX:
- case ISD::UMIN:
- case AMDGPUISD::FMIN_LEGACY:
- case AMDGPUISD::FMAX_LEGACY:
- return performMinMaxCombine(N, DCI);
- case ISD::FMA:
- return performFMACombine(N, DCI);
- case ISD::AND:
- return performAndCombine(N, DCI);
- case ISD::OR:
- return performOrCombine(N, DCI);
- case ISD::FSHR: {
- const SIInstrInfo *TII = getSubtarget()->getInstrInfo();
- if (N->getValueType(0) == MVT::i32 && N->isDivergent() &&
- TII->pseudoToMCOpcode(AMDGPU::V_PERM_B32_e64) != -1) {
- return matchPERM(N, DCI);
- }
- break;
- }
- case ISD::XOR:
- return performXorCombine(N, DCI);
- case ISD::ZERO_EXTEND:
- return performZeroExtendCombine(N, DCI);
- case ISD::SIGN_EXTEND_INREG:
- return performSignExtendInRegCombine(N, DCI);
- case AMDGPUISD::FP_CLASS:
- return performClassCombine(N, DCI);
- case ISD::FCANONICALIZE:
- return performFCanonicalizeCombine(N, DCI);
- case AMDGPUISD::RCP:
- return performRcpCombine(N, DCI);
- case ISD::FLDEXP:
- case AMDGPUISD::FRACT:
- case AMDGPUISD::RSQ:
- case AMDGPUISD::RCP_LEGACY:
- case AMDGPUISD::RCP_IFLAG:
- case AMDGPUISD::RSQ_CLAMP: {
- // FIXME: This is probably wrong. If src is an sNaN, it won't be quieted
- SDValue Src = N->getOperand(0);
- if (Src.isUndef())
- return Src;
- break;
- }
- case ISD::SINT_TO_FP:
- case ISD::UINT_TO_FP:
- return performUCharToFloatCombine(N, DCI);
- case ISD::FCOPYSIGN:
- return performFCopySignCombine(N, DCI);
- case AMDGPUISD::CVT_F32_UBYTE0:
- case AMDGPUISD::CVT_F32_UBYTE1:
- case AMDGPUISD::CVT_F32_UBYTE2:
- case AMDGPUISD::CVT_F32_UBYTE3:
- return performCvtF32UByteNCombine(N, DCI);
- case AMDGPUISD::FMED3:
- return performFMed3Combine(N, DCI);
- case AMDGPUISD::CVT_PKRTZ_F16_F32:
- return performCvtPkRTZCombine(N, DCI);
- case AMDGPUISD::CLAMP:
- return performClampCombine(N, DCI);
- case ISD::SCALAR_TO_VECTOR: {
- SelectionDAG &DAG = DCI.DAG;
- EVT VT = N->getValueType(0);
-
- // v2i16 (scalar_to_vector i16:x) -> v2i16 (bitcast (any_extend i16:x))
- if (VT == MVT::v2i16 || VT == MVT::v2f16 || VT == MVT::v2bf16) {
- SDLoc SL(N);
- SDValue Src = N->getOperand(0);
- EVT EltVT = Src.getValueType();
- if (EltVT != MVT::i16)
- Src = DAG.getNode(ISD::BITCAST, SL, MVT::i16, Src);
-
- SDValue Ext = DAG.getNode(ISD::ANY_EXTEND, SL, MVT::i32, Src);
- return DAG.getNode(ISD::BITCAST, SL, VT, Ext);
- }
-
- break;
- }
- case ISD::EXTRACT_VECTOR_ELT:
- return performExtractVectorEltCombine(N, DCI);
- case ISD::INSERT_VECTOR_ELT:
- return performInsertVectorEltCombine(N, DCI);
- case ISD::FP_ROUND:
- return performFPRoundCombine(N, DCI);
- case ISD::LOAD: {
- if (SDValue Widened = widenLoad(cast<LoadSDNode>(N), DCI))
- return Widened;
- [[fallthrough]];
- }
- default: {
- if (!DCI.isBeforeLegalize()) {
- if (MemSDNode *MemNode = dyn_cast<MemSDNode>(N))
- return performMemSDNodeCombine(MemNode, DCI);
- }
-
- break;
- }
- }
-
- return AMDGPUTargetLowering::PerformDAGCombine(N, DCI);
-}
-
-/// Helper function for adjustWritemask
-static unsigned SubIdx2Lane(unsigned Idx) {
- switch (Idx) {
- default:
- return ~0u;
- case AMDGPU::sub0:
- return 0;
- case AMDGPU::sub1:
- return 1;
- case AMDGPU::sub2:
- return 2;
- case AMDGPU::sub3:
- return 3;
- case AMDGPU::sub4:
- return 4; // Possible with TFE/LWE
- }
-}
-
-/// Adjust the writemask of MIMG, VIMAGE or VSAMPLE instructions
-SDNode *SITargetLowering::adjustWritemask(MachineSDNode *&Node,
- SelectionDAG &DAG) const {
- unsigned Opcode = Node->getMachineOpcode();
-
- // Subtract 1 because the vdata output is not a MachineSDNode operand.
- int D16Idx = AMDGPU::getNamedOperandIdx(Opcode, AMDGPU::OpName::d16) - 1;
- if (D16Idx >= 0 && Node->getConstantOperandVal(D16Idx))
- return Node; // not implemented for D16
-
- SDNode *Users[5] = {nullptr};
- unsigned Lane = 0;
- unsigned DmaskIdx =
- AMDGPU::getNamedOperandIdx(Opcode, AMDGPU::OpName::dmask) - 1;
- unsigned OldDmask = Node->getConstantOperandVal(DmaskIdx);
- unsigned NewDmask = 0;
- unsigned TFEIdx = AMDGPU::getNamedOperandIdx(Opcode, AMDGPU::OpName::tfe) - 1;
- unsigned LWEIdx = AMDGPU::getNamedOperandIdx(Opcode, AMDGPU::OpName::lwe) - 1;
- bool UsesTFC = (int(TFEIdx) >= 0 && Node->getConstantOperandVal(TFEIdx)) ||
- (int(LWEIdx) >= 0 && Node->getConstantOperandVal(LWEIdx));
- unsigned TFCLane = 0;
- bool HasChain = Node->getNumValues() > 1;
-
- if (OldDmask == 0) {
- // These are folded out, but on the off chance it happens, don't assert.
- return Node;
- }
-
- unsigned OldBitsSet = llvm::popcount(OldDmask);
- // Work out which is the TFE/LWE lane if that is enabled.
- if (UsesTFC) {
- TFCLane = OldBitsSet;
- }
-
- // Try to figure out the used register components
- for (SDUse &Use : Node->uses()) {
-
- // Don't look at users of the chain.
- if (Use.getResNo() != 0)
- continue;
-
- SDNode *User = Use.getUser();
-
- // Abort if we can't understand the usage
- if (!User->isMachineOpcode() ||
- User->getMachineOpcode() != TargetOpcode::EXTRACT_SUBREG)
- return Node;
-
- // Lane means which subreg of %vgpra_vgprb_vgprc_vgprd is used.
- // Note that subregs are packed, i.e. Lane==0 is the first bit set
- // in OldDmask, so it can be any of X,Y,Z,W; Lane==1 is the second bit
- // set, etc.
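- // For example (illustrative): OldDmask == 0b1010 enables the Y and W
- // components, so Lane 0 corresponds to Y and Lane 1 to W.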
- Lane = SubIdx2Lane(User->getConstantOperandVal(1));
- if (Lane == ~0u)
- return Node;
-
- // Check if the use is for the TFE/LWE generated result at VGPRn+1.
- if (UsesTFC && Lane == TFCLane) {
- Users[Lane] = User;
- } else {
- // Set which texture component corresponds to the lane.
- unsigned Comp;
- for (unsigned i = 0, Dmask = OldDmask; (i <= Lane) && (Dmask != 0); i++) {
- Comp = llvm::countr_zero(Dmask);
- Dmask &= ~(1 << Comp);
- }
-
- // Abort if we have more than one user per component.
- if (Users[Lane])
- return Node;
-
- Users[Lane] = User;
- NewDmask |= 1 << Comp;
- }
- }
-
- // Don't allow 0 dmask, as hardware assumes one channel enabled.
- bool NoChannels = !NewDmask;
- if (NoChannels) {
- if (!UsesTFC) {
- // No uses of the result and not using TFC, so do nothing.
- return Node;
- }
- // If the original dmask has one channel, then there is nothing to do.
- if (OldBitsSet == 1)
- return Node;
- // Use an arbitrary dmask; one enabled channel is required for the
- // instruction to work.
- NewDmask = 1;
- }
- // Abort if there's no change
- if (NewDmask == OldDmask)
- return Node;
-
- unsigned BitsSet = llvm::popcount(NewDmask);
-
- // Check for TFE or LWE: increase the number of channels by one to account
- // for the extra return value.
- // This will need adjustment for D16 if it is also handled in
- // adjustWritemask (this function), but at present D16 is excluded.
- unsigned NewChannels = BitsSet + UsesTFC;
-
- int NewOpcode =
- AMDGPU::getMaskedMIMGOp(Node->getMachineOpcode(), NewChannels);
- assert(NewOpcode != -1 &&
- NewOpcode != static_cast<int>(Node->getMachineOpcode()) &&
- "failed to find equivalent MIMG op");
-
- // Adjust the writemask in the node
- SmallVector<SDValue, 12> Ops;
- llvm::append_range(Ops, Node->ops().take_front(DmaskIdx));
- Ops.push_back(DAG.getTargetConstant(NewDmask, SDLoc(Node), MVT::i32));
- llvm::append_range(Ops, Node->ops().drop_front(DmaskIdx + 1));
-
- MVT SVT = Node->getValueType(0).getVectorElementType().getSimpleVT();
-
- MVT ResultVT = NewChannels == 1
- ? SVT
- : MVT::getVectorVT(SVT, NewChannels == 3 ? 4
- : NewChannels == 5 ? 8
- : NewChannels);
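- // NewChannels values of 3 and 5 are rounded up to 4 and 8, the nearest
- // widths with suitable vector types; the extra elements are left unused.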
- SDVTList NewVTList =
- HasChain ? DAG.getVTList(ResultVT, MVT::Other) : DAG.getVTList(ResultVT);
-
- MachineSDNode *NewNode =
- DAG.getMachineNode(NewOpcode, SDLoc(Node), NewVTList, Ops);
-
- if (HasChain) {
- // Update chain.
- DAG.setNodeMemRefs(NewNode, Node->memoperands());
- DAG.ReplaceAllUsesOfValueWith(SDValue(Node, 1), SDValue(NewNode, 1));
- }
-
- if (NewChannels == 1) {
- assert(Node->hasNUsesOfValue(1, 0));
- SDNode *Copy =
- DAG.getMachineNode(TargetOpcode::COPY, SDLoc(Node),
- Users[Lane]->getValueType(0), SDValue(NewNode, 0));
- DAG.ReplaceAllUsesWith(Users[Lane], Copy);
- return nullptr;
- }
-
- // Update the users of the node with the new indices
- for (unsigned i = 0, Idx = AMDGPU::sub0; i < 5; ++i) {
- SDNode *User = Users[i];
- if (!User) {
- // Handle the special case of NoChannels. We set NewDmask to 1 above, but
- // Users[0] is still nullptr because channel 0 doesn't really have a use.
- if (i || !NoChannels)
- continue;
- } else {
- SDValue Op = DAG.getTargetConstant(Idx, SDLoc(User), MVT::i32);
- SDNode *NewUser = DAG.UpdateNodeOperands(User, SDValue(NewNode, 0), Op);
- if (NewUser != User) {
- DAG.ReplaceAllUsesWith(SDValue(User, 0), SDValue(NewUser, 0));
- DAG.RemoveDeadNode(User);
- }
- }
-
- switch (Idx) {
- default:
- break;
- case AMDGPU::sub0:
- Idx = AMDGPU::sub1;
- break;
- case AMDGPU::sub1:
- Idx = AMDGPU::sub2;
- break;
- case AMDGPU::sub2:
- Idx = AMDGPU::sub3;
- break;
- case AMDGPU::sub3:
- Idx = AMDGPU::sub4;
- break;
- }
- }
-
- DAG.RemoveDeadNode(Node);
- return nullptr;
-}
-
-static bool isFrameIndexOp(SDValue Op) {
- if (Op.getOpcode() == ISD::AssertZext)
- Op = Op.getOperand(0);
-
- return isa<FrameIndexSDNode>(Op);
-}
-
-/// Legalize target independent instructions (e.g. INSERT_SUBREG)
-/// with frame index operands.
-/// LLVM assumes that inputs to these instructions are registers.
-SDNode *
-SITargetLowering::legalizeTargetIndependentNode(SDNode *Node,
- SelectionDAG &DAG) const {
- if (Node->getOpcode() == ISD::CopyToReg) {
- RegisterSDNode *DestReg = cast<RegisterSDNode>(Node->getOperand(1));
- SDValue SrcVal = Node->getOperand(2);
-
- // Insert a copy to a VReg_1 virtual register so LowerI1Copies doesn't have
- // to try understanding copies to physical registers.
- if (SrcVal.getValueType() == MVT::i1 && DestReg->getReg().isPhysical()) {
- SDLoc SL(Node);
- MachineRegisterInfo &MRI = DAG.getMachineFunction().getRegInfo();
- SDValue VReg = DAG.getRegister(
- MRI.createVirtualRegister(&AMDGPU::VReg_1RegClass), MVT::i1);
-
- SDNode *Glued = Node->getGluedNode();
- SDValue ToVReg = DAG.getCopyToReg(
- Node->getOperand(0), SL, VReg, SrcVal,
- SDValue(Glued, Glued ? Glued->getNumValues() - 1 : 0));
- SDValue ToResultReg = DAG.getCopyToReg(ToVReg, SL, SDValue(DestReg, 0),
- VReg, ToVReg.getValue(1));
- DAG.ReplaceAllUsesWith(Node, ToResultReg.getNode());
- DAG.RemoveDeadNode(Node);
- return ToResultReg.getNode();
- }
- }
-
- SmallVector<SDValue, 8> Ops;
- for (unsigned i = 0; i < Node->getNumOperands(); ++i) {
- if (!isFrameIndexOp(Node->getOperand(i))) {
- Ops.push_back(Node->getOperand(i));
- continue;
- }
-
- SDLoc DL(Node);
- Ops.push_back(SDValue(DAG.getMachineNode(AMDGPU::S_MOV_B32, DL,
- Node->getOperand(i).getValueType(),
- Node->getOperand(i)),
- 0));
- }
-
- return DAG.UpdateNodeOperands(Node, Ops);
-}
-
-/// Fold the instructions after selecting them.
-/// Returns null if users were already updated.
-SDNode *SITargetLowering::PostISelFolding(MachineSDNode *Node,
- SelectionDAG &DAG) const {
- const SIInstrInfo *TII = getSubtarget()->getInstrInfo();
- unsigned Opcode = Node->getMachineOpcode();
-
- if (TII->isImage(Opcode) && !TII->get(Opcode).mayStore() &&
- !TII->isGather4(Opcode) &&
- AMDGPU::hasNamedOperand(Opcode, AMDGPU::OpName::dmask)) {
- return adjustWritemask(Node, DAG);
- }
-
- if (Opcode == AMDGPU::INSERT_SUBREG || Opcode == AMDGPU::REG_SEQUENCE) {
- legalizeTargetIndependentNode(Node, DAG);
- return Node;
- }
-
- switch (Opcode) {
- case AMDGPU::V_DIV_SCALE_F32_e64:
- case AMDGPU::V_DIV_SCALE_F64_e64: {
- // Satisfy the operand register constraint when one of the inputs is
- // undefined. Ordinarily each undef value will have its own implicit_def of
- // a vreg, so force these to use a single register.
- SDValue Src0 = Node->getOperand(1);
- SDValue Src1 = Node->getOperand(3);
- SDValue Src2 = Node->getOperand(5);
-
- if ((Src0.isMachineOpcode() &&
- Src0.getMachineOpcode() != AMDGPU::IMPLICIT_DEF) &&
- (Src0 == Src1 || Src0 == Src2))
- break;
-
- MVT VT = Src0.getValueType().getSimpleVT();
- const TargetRegisterClass *RC =
- getRegClassFor(VT, Src0.getNode()->isDivergent());
-
- MachineRegisterInfo &MRI = DAG.getMachineFunction().getRegInfo();
- SDValue UndefReg = DAG.getRegister(MRI.createVirtualRegister(RC), VT);
-
- SDValue ImpDef = DAG.getCopyToReg(DAG.getEntryNode(), SDLoc(Node), UndefReg,
- Src0, SDValue());
-
- // src0 must be the same register as src1 or src2, even if the value is
- // undefined, so make sure we don't violate this constraint.
- if (Src0.isMachineOpcode() &&
- Src0.getMachineOpcode() == AMDGPU::IMPLICIT_DEF) {
- if (Src1.isMachineOpcode() &&
- Src1.getMachineOpcode() != AMDGPU::IMPLICIT_DEF)
- Src0 = Src1;
- else if (Src2.isMachineOpcode() &&
- Src2.getMachineOpcode() != AMDGPU::IMPLICIT_DEF)
- Src0 = Src2;
- else {
- assert(Src1.getMachineOpcode() == AMDGPU::IMPLICIT_DEF);
- Src0 = UndefReg;
- Src1 = UndefReg;
- }
- } else
- break;
-
- SmallVector<SDValue, 9> Ops(Node->ops());
- Ops[1] = Src0;
- Ops[3] = Src1;
- Ops[5] = Src2;
- Ops.push_back(ImpDef.getValue(1));
- return DAG.getMachineNode(Opcode, SDLoc(Node), Node->getVTList(), Ops);
- }
- default:
- break;
- }
-
- return Node;
-}
-
-// Any MIMG instructions that use tfe or lwe require an initialization of the
-// result register that will be written in the case of a memory access failure.
-// The required code is also added to tie this init code to the result of the
-// image instruction.
-void SITargetLowering::AddMemOpInit(MachineInstr &MI) const {
- const SIInstrInfo *TII = getSubtarget()->getInstrInfo();
- const SIRegisterInfo &TRI = TII->getRegisterInfo();
- MachineRegisterInfo &MRI = MI.getMF()->getRegInfo();
- MachineBasicBlock &MBB = *MI.getParent();
-
- int DstIdx =
- AMDGPU::getNamedOperandIdx(MI.getOpcode(), AMDGPU::OpName::vdata);
- unsigned InitIdx = 0;
-
- if (TII->isImage(MI)) {
- MachineOperand *TFE = TII->getNamedOperand(MI, AMDGPU::OpName::tfe);
- MachineOperand *LWE = TII->getNamedOperand(MI, AMDGPU::OpName::lwe);
- MachineOperand *D16 = TII->getNamedOperand(MI, AMDGPU::OpName::d16);
-
- if (!TFE && !LWE) // intersect_ray
- return;
-
- unsigned TFEVal = TFE ? TFE->getImm() : 0;
- unsigned LWEVal = LWE ? LWE->getImm() : 0;
- unsigned D16Val = D16 ? D16->getImm() : 0;
-
- if (!TFEVal && !LWEVal)
- return;
-
- // At least one of TFE or LWE is non-zero, so we have to insert a suitable
- // initialization of the result value and tie it to the dest of the image
- // instruction.
-
- // Calculate which dword we have to initialize to 0.
- MachineOperand *MO_Dmask = TII->getNamedOperand(MI, AMDGPU::OpName::dmask);
-
- // Check that the dmask operand was found.
- assert(MO_Dmask && "Expected dmask operand in instruction");
-
- unsigned dmask = MO_Dmask->getImm();
- // Determine the number of active lanes taking into account the
- // Gather4 special case
- unsigned ActiveLanes = TII->isGather4(MI) ? 4 : llvm::popcount(dmask);
-
- bool Packed = !Subtarget->hasUnpackedD16VMem();
-
- InitIdx = D16Val && Packed ? ((ActiveLanes + 1) >> 1) + 1 : ActiveLanes + 1;
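- // For example (illustrative): with 4 active lanes, packed D16 needs
- // ((4 + 1) >> 1) + 1 == 3 dwords, while the unpacked case needs
- // 4 + 1 == 5 (four data dwords plus one TFE/LWE status dword).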
-
- // Abandon the attempt if the dst size isn't large enough; this is in fact
- // an error, but it is picked up elsewhere and reported correctly.
- uint32_t DstSize =
- TRI.getRegSizeInBits(*TII->getOpRegClass(MI, DstIdx)) / 32;
- if (DstSize < InitIdx)
- return;
- } else if (TII->isMUBUF(MI) && AMDGPU::getMUBUFTfe(MI.getOpcode())) {
- InitIdx = TRI.getRegSizeInBits(*TII->getOpRegClass(MI, DstIdx)) / 32;
- } else {
- return;
- }
-
- const DebugLoc &DL = MI.getDebugLoc();
-
- // Create a register for the initialization value.
- Register PrevDst = MRI.cloneVirtualRegister(MI.getOperand(DstIdx).getReg());
- unsigned NewDst = 0; // Final initialized value will be in here
-
- // If the PRTStrictNull feature is enabled (the default), then initialize
- // all the result registers to 0; otherwise initialize just the error
- // indication register (VGPRn+1).
- unsigned SizeLeft = Subtarget->usePRTStrictNull() ? InitIdx : 1;
- unsigned CurrIdx = Subtarget->usePRTStrictNull() ? 0 : (InitIdx - 1);
-
- BuildMI(MBB, MI, DL, TII->get(AMDGPU::IMPLICIT_DEF), PrevDst);
- for (; SizeLeft; SizeLeft--, CurrIdx++) {
- NewDst = MRI.createVirtualRegister(TII->getOpRegClass(MI, DstIdx));
- // Initialize dword
- Register SubReg = MRI.createVirtualRegister(&AMDGPU::VGPR_32RegClass);
- // clang-format off
- BuildMI(MBB, MI, DL, TII->get(AMDGPU::V_MOV_B32_e32), SubReg)
- .addImm(0);
- // clang-format on
- // Insert into the super-reg
- BuildMI(MBB, MI, DL, TII->get(TargetOpcode::INSERT_SUBREG), NewDst)
- .addReg(PrevDst)
- .addReg(SubReg)
- .addImm(SIRegisterInfo::getSubRegFromChannel(CurrIdx));
-
- PrevDst = NewDst;
- }
-
- // Add as an implicit operand
- MI.addOperand(MachineOperand::CreateReg(NewDst, false, true));
-
- // Tie the just added implicit operand to the dst
- MI.tieOperands(DstIdx, MI.getNumOperands() - 1);
-}
-
-/// Assign the register class depending on the number of
-/// bits set in the writemask
-void SITargetLowering::AdjustInstrPostInstrSelection(MachineInstr &MI,
- SDNode *Node) const {
- const SIInstrInfo *TII = getSubtarget()->getInstrInfo();
-
- MachineFunction *MF = MI.getParent()->getParent();
- MachineRegisterInfo &MRI = MF->getRegInfo();
- SIMachineFunctionInfo *Info = MF->getInfo<SIMachineFunctionInfo>();
-
- if (TII->isVOP3(MI.getOpcode())) {
- // Make sure constant bus requirements are respected.
- TII->legalizeOperandsVOP3(MRI, MI);
-
- // Prefer VGPRs over AGPRs in MAI instructions where possible.
- // This saves a chain-copy of registers and better balances register
- // use between VGPRs and AGPRs, as AGPR tuples tend to be big.
- if (!MI.getDesc().operands().empty()) {
- unsigned Opc = MI.getOpcode();
- bool HasAGPRs = Info->mayNeedAGPRs();
- const SIRegisterInfo *TRI = Subtarget->getRegisterInfo();
- int16_t Src2Idx = AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::src2);
- for (auto I :
- {AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::src0),
- AMDGPU::getNamedOperandIdx(Opc, AMDGPU::OpName::src1), Src2Idx}) {
- if (I == -1)
- break;
- if ((I == Src2Idx) && (HasAGPRs))
- break;
- MachineOperand &Op = MI.getOperand(I);
- if (!Op.isReg() || !Op.getReg().isVirtual())
- continue;
- auto *RC = TRI->getRegClassForReg(MRI, Op.getReg());
- if (!TRI->hasAGPRs(RC))
- continue;
- auto *Src = MRI.getUniqueVRegDef(Op.getReg());
- if (!Src || !Src->isCopy() ||
- !TRI->isSGPRReg(MRI, Src->getOperand(1).getReg()))
- continue;
- auto *NewRC = TRI->getEquivalentVGPRClass(RC);
- // All uses of agpr64 and agpr32 can also accept vgpr except for
- // v_accvgpr_read, but we do not produce agpr reads during selection,
- // so no use checks are needed.
- MRI.setRegClass(Op.getReg(), NewRC);
- }
-
- if (TII->isMAI(MI)) {
- // The ordinary src0, src1, src2 were legalized above.
- //
- // We have to also legalize the appended v_mfma_ld_scale_b32 operands,
- // as a separate instruction.
- int Src0Idx = AMDGPU::getNamedOperandIdx(MI.getOpcode(),
- AMDGPU::OpName::scale_src0);
- if (Src0Idx != -1) {
- int Src1Idx = AMDGPU::getNamedOperandIdx(MI.getOpcode(),
- AMDGPU::OpName::scale_src1);
- if (TII->usesConstantBus(MRI, MI, Src0Idx) &&
- TII->usesConstantBus(MRI, MI, Src1Idx))
- TII->legalizeOpWithMove(MI, Src1Idx);
- }
- }
-
- if (!HasAGPRs)
- return;
-
- // Resolve the rest of AV operands to AGPRs.
- if (auto *Src2 = TII->getNamedOperand(MI, AMDGPU::OpName::src2)) {
- if (Src2->isReg() && Src2->getReg().isVirtual()) {
- auto *RC = TRI->getRegClassForReg(MRI, Src2->getReg());
- if (TRI->isVectorSuperClass(RC)) {
- auto *NewRC = TRI->getEquivalentAGPRClass(RC);
- MRI.setRegClass(Src2->getReg(), NewRC);
- if (Src2->isTied())
- MRI.setRegClass(MI.getOperand(0).getReg(), NewRC);
- }
- }
- }
- }
-
- return;
- }
-
- if (TII->isImage(MI))
- TII->enforceOperandRCAlignment(MI, AMDGPU::OpName::vaddr);
-}
-
-static SDValue buildSMovImm32(SelectionDAG &DAG, const SDLoc &DL,
- uint64_t Val) {
- SDValue K = DAG.getTargetConstant(Val, DL, MVT::i32);
- return SDValue(DAG.getMachineNode(AMDGPU::S_MOV_B32, DL, MVT::i32, K), 0);
-}
-
-MachineSDNode *SITargetLowering::wrapAddr64Rsrc(SelectionDAG &DAG,
- const SDLoc &DL,
- SDValue Ptr) const {
- const SIInstrInfo *TII = getSubtarget()->getInstrInfo();
-
- // Build the half of the descriptor that holds the constants before
- // building the full 128-bit register. If we are building multiple
- // resource descriptors, this will allow CSEing of the 2-component
- // register.
- const SDValue Ops0[] = {
- DAG.getTargetConstant(AMDGPU::SGPR_64RegClassID, DL, MVT::i32),
- buildSMovImm32(DAG, DL, 0),
- DAG.getTargetConstant(AMDGPU::sub0, DL, MVT::i32),
- buildSMovImm32(DAG, DL, TII->getDefaultRsrcDataFormat() >> 32),
- DAG.getTargetConstant(AMDGPU::sub1, DL, MVT::i32)};
-
- SDValue SubRegHi = SDValue(
- DAG.getMachineNode(AMDGPU::REG_SEQUENCE, DL, MVT::v2i32, Ops0), 0);
-
- // Combine the constants and the pointer.
- const SDValue Ops1[] = {
- DAG.getTargetConstant(AMDGPU::SGPR_128RegClassID, DL, MVT::i32), Ptr,
- DAG.getTargetConstant(AMDGPU::sub0_sub1, DL, MVT::i32), SubRegHi,
- DAG.getTargetConstant(AMDGPU::sub2_sub3, DL, MVT::i32)};
-
- return DAG.getMachineNode(AMDGPU::REG_SEQUENCE, DL, MVT::v4i32, Ops1);
-}
-
-/// Return a resource descriptor with the 'Add TID' bit enabled.
-/// The TID (Thread ID) is multiplied by the stride value (bits [61:48]
-/// of the resource descriptor) to create an offset, which is added to
-/// the resource pointer.
-MachineSDNode *SITargetLowering::buildRSRC(SelectionDAG &DAG, const SDLoc &DL,
- SDValue Ptr, uint32_t RsrcDword1,
- uint64_t RsrcDword2And3) const {
- SDValue PtrLo = DAG.getTargetExtractSubreg(AMDGPU::sub0, DL, MVT::i32, Ptr);
- SDValue PtrHi = DAG.getTargetExtractSubreg(AMDGPU::sub1, DL, MVT::i32, Ptr);
- if (RsrcDword1) {
- PtrHi =
- SDValue(DAG.getMachineNode(AMDGPU::S_OR_B32, DL, MVT::i32, PtrHi,
- DAG.getConstant(RsrcDword1, DL, MVT::i32)),
- 0);
- }
-
- SDValue DataLo =
- buildSMovImm32(DAG, DL, RsrcDword2And3 & UINT64_C(0xFFFFFFFF));
- SDValue DataHi = buildSMovImm32(DAG, DL, RsrcDword2And3 >> 32);
-
- const SDValue Ops[] = {
- DAG.getTargetConstant(AMDGPU::SGPR_128RegClassID, DL, MVT::i32),
- PtrLo,
- DAG.getTargetConstant(AMDGPU::sub0, DL, MVT::i32),
- PtrHi,
- DAG.getTargetConstant(AMDGPU::sub1, DL, MVT::i32),
- DataLo,
- DAG.getTargetConstant(AMDGPU::sub2, DL, MVT::i32),
- DataHi,
- DAG.getTargetConstant(AMDGPU::sub3, DL, MVT::i32)};
-
- return DAG.getMachineNode(AMDGPU::REG_SEQUENCE, DL, MVT::v4i32, Ops);
-}
-
-//===----------------------------------------------------------------------===//
-// SI Inline Assembly Support
-//===----------------------------------------------------------------------===//
-
-std::pair<unsigned, const TargetRegisterClass *>
-SITargetLowering::getRegForInlineAsmConstraint(const TargetRegisterInfo *TRI_,
- StringRef Constraint,
- MVT VT) const {
- const SIRegisterInfo *TRI = static_cast<const SIRegisterInfo *>(TRI_);
-
- const TargetRegisterClass *RC = nullptr;
- if (Constraint.size() == 1) {
- const unsigned BitWidth = VT.getSizeInBits();
- switch (Constraint[0]) {
- default:
- return TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT);
- case 's':
- case 'r':
- switch (BitWidth) {
- case 16:
- RC = &AMDGPU::SReg_32RegClass;
- break;
- case 64:
- RC = &AMDGPU::SGPR_64RegClass;
- break;
- default:
- RC = SIRegisterInfo::getSGPRClassForBitWidth(BitWidth);
- if (!RC)
- return std::pair(0U, nullptr);
- break;
- }
- break;
- case 'v':
- switch (BitWidth) {
- case 16:
- RC = Subtarget->useRealTrue16Insts() ? &AMDGPU::VGPR_16RegClass
- : &AMDGPU::VGPR_32RegClass;
- break;
- default:
- RC = TRI->getVGPRClassForBitWidth(BitWidth);
- if (!RC)
- return std::pair(0U, nullptr);
- break;
- }
- break;
- case 'a':
- if (!Subtarget->hasMAIInsts())
- break;
- switch (BitWidth) {
- case 16:
- RC = &AMDGPU::AGPR_32RegClass;
- break;
- default:
- RC = TRI->getAGPRClassForBitWidth(BitWidth);
- if (!RC)
- return std::pair(0U, nullptr);
- break;
- }
- break;
- }
- // We actually support i128, i16, and f16 as inline parameters
- // even if they are not reported as legal.
- if (RC && (isTypeLegal(VT) || VT.SimpleTy == MVT::i128 ||
- VT.SimpleTy == MVT::i16 || VT.SimpleTy == MVT::f16))
- return std::pair(0U, RC);
- }
-
- if (Constraint.starts_with("{") && Constraint.ends_with("}")) {
- StringRef RegName(Constraint.data() + 1, Constraint.size() - 2);
- if (RegName.consume_front("v")) {
- RC = &AMDGPU::VGPR_32RegClass;
- } else if (RegName.consume_front("s")) {
- RC = &AMDGPU::SGPR_32RegClass;
- } else if (RegName.consume_front("a")) {
- RC = &AMDGPU::AGPR_32RegClass;
- }
-
- if (RC) {
- uint32_t Idx;
- if (RegName.consume_front("[")) {
- uint32_t End;
- bool Failed = RegName.consumeInteger(10, Idx);
- Failed |= !RegName.consume_front(":");
- Failed |= RegName.consumeInteger(10, End);
- Failed |= !RegName.consume_back("]");
- if (!Failed) {
- uint32_t Width = (End - Idx + 1) * 32;
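- // For example (illustrative): "{v[8:11]}" parses to Idx == 8 and
- // End == 11, giving Width == 128 bits.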
- // Prohibit constraints for register ranges with a width that does not
- // match the required type.
- if (VT.SimpleTy != MVT::Other && Width != VT.getSizeInBits())
- return std::pair(0U, nullptr);
- MCRegister Reg = RC->getRegister(Idx);
- if (SIRegisterInfo::isVGPRClass(RC))
- RC = TRI->getVGPRClassForBitWidth(Width);
- else if (SIRegisterInfo::isSGPRClass(RC))
- RC = TRI->getSGPRClassForBitWidth(Width);
- else if (SIRegisterInfo::isAGPRClass(RC))
- RC = TRI->getAGPRClassForBitWidth(Width);
- if (RC) {
- Reg = TRI->getMatchingSuperReg(Reg, AMDGPU::sub0, RC);
- if (!Reg) {
- // The register class does not contain the requested register,
- // e.g., because it is an SGPR pair that would violate alignment
- // requirements.
- return std::pair(0U, nullptr);
- }
- return std::pair(Reg, RC);
- }
- }
- } else {
- // Check for lossy scalar/vector conversions.
- if (VT.isVector() && VT.getSizeInBits() != 32)
- return std::pair(0U, nullptr);
- bool Failed = RegName.getAsInteger(10, Idx);
- if (!Failed && Idx < RC->getNumRegs())
- return std::pair(RC->getRegister(Idx), RC);
- }
- }
- }
-
- auto Ret = TargetLowering::getRegForInlineAsmConstraint(TRI, Constraint, VT);
- if (Ret.first)
- Ret.second = TRI->getPhysRegBaseClass(Ret.first);
-
- return Ret;
-}
-
-static bool isImmConstraint(StringRef Constraint) {
- if (Constraint.size() == 1) {
- switch (Constraint[0]) {
- default:
- break;
- case 'I':
- case 'J':
- case 'A':
- case 'B':
- case 'C':
- return true;
- }
- } else if (Constraint == "DA" || Constraint == "DB") {
- return true;
- }
- return false;
-}
-
-SITargetLowering::ConstraintType
-SITargetLowering::getConstraintType(StringRef Constraint) const {
- if (Constraint.size() == 1) {
- switch (Constraint[0]) {
- default:
- break;
- case 's':
- case 'v':
- case 'a':
- return C_RegisterClass;
- }
- }
- if (isImmConstraint(Constraint)) {
- return C_Other;
- }
- return TargetLowering::getConstraintType(Constraint);
-}
-
-static uint64_t clearUnusedBits(uint64_t Val, unsigned Size) {
- if (!AMDGPU::isInlinableIntLiteral(Val)) {
- Val = Val & maskTrailingOnes<uint64_t>(Size);
- }
- return Val;
-}
-
-void SITargetLowering::LowerAsmOperandForConstraint(SDValue Op,
- StringRef Constraint,
- std::vector<SDValue> &Ops,
- SelectionDAG &DAG) const {
- if (isImmConstraint(Constraint)) {
- uint64_t Val;
- if (getAsmOperandConstVal(Op, Val) &&
- checkAsmConstraintVal(Op, Constraint, Val)) {
- Val = clearUnusedBits(Val, Op.getScalarValueSizeInBits());
- Ops.push_back(DAG.getTargetConstant(Val, SDLoc(Op), MVT::i64));
- }
- } else {
- TargetLowering::LowerAsmOperandForConstraint(Op, Constraint, Ops, DAG);
- }
-}
-
-bool SITargetLowering::getAsmOperandConstVal(SDValue Op, uint64_t &Val) const {
- unsigned Size = Op.getScalarValueSizeInBits();
- if (Size > 64)
- return false;
-
- if (Size == 16 && !Subtarget->has16BitInsts())
- return false;
-
- if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(Op)) {
- Val = C->getSExtValue();
- return true;
- }
- if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Op)) {
- Val = C->getValueAPF().bitcastToAPInt().getSExtValue();
- return true;
- }
- if (BuildVectorSDNode *V = dyn_cast<BuildVectorSDNode>(Op)) {
- if (Size != 16 || Op.getNumOperands() != 2)
- return false;
- if (Op.getOperand(0).isUndef() || Op.getOperand(1).isUndef())
- return false;
- if (ConstantSDNode *C = V->getConstantSplatNode()) {
- Val = C->getSExtValue();
- return true;
- }
- if (ConstantFPSDNode *C = V->getConstantFPSplatNode()) {
- Val = C->getValueAPF().bitcastToAPInt().getSExtValue();
- return true;
- }
- }
-
- return false;
-}
-
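-// For example (illustrative): for the "DA" constraint, the 64-bit value
-// 0x3F80000000000040 is accepted because both halves (1.0f and 64) are
-// individually inlinable 32-bit literals.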
-bool SITargetLowering::checkAsmConstraintVal(SDValue Op, StringRef Constraint,
- uint64_t Val) const {
- if (Constraint.size() == 1) {
- switch (Constraint[0]) {
- case 'I':
- return AMDGPU::isInlinableIntLiteral(Val);
- case 'J':
- return isInt<16>(Val);
- case 'A':
- return checkAsmConstraintValA(Op, Val);
- case 'B':
- return isInt<32>(Val);
- case 'C':
- return isUInt<32>(clearUnusedBits(Val, Op.getScalarValueSizeInBits())) ||
- AMDGPU::isInlinableIntLiteral(Val);
- default:
- break;
- }
- } else if (Constraint.size() == 2) {
- if (Constraint == "DA") {
- int64_t HiBits = static_cast<int32_t>(Val >> 32);
- int64_t LoBits = static_cast<int32_t>(Val);
- return checkAsmConstraintValA(Op, HiBits, 32) &&
- checkAsmConstraintValA(Op, LoBits, 32);
- }
- if (Constraint == "DB") {
- return true;
- }
- }
- llvm_unreachable("Invalid asm constraint");
-}
-
-bool SITargetLowering::checkAsmConstraintValA(SDValue Op, uint64_t Val,
- unsigned MaxSize) const {
- unsigned Size = std::min<unsigned>(Op.getScalarValueSizeInBits(), MaxSize);
- bool HasInv2Pi = Subtarget->hasInv2PiInlineImm();
- if (Size == 16) {
- MVT VT = Op.getSimpleValueType();
- switch (VT.SimpleTy) {
- default:
- return false;
- case MVT::i16:
- return AMDGPU::isInlinableLiteralI16(Val, HasInv2Pi);
- case MVT::f16:
- return AMDGPU::isInlinableLiteralFP16(Val, HasInv2Pi);
- case MVT::bf16:
- return AMDGPU::isInlinableLiteralBF16(Val, HasInv2Pi);
- case MVT::v2i16:
- return AMDGPU::getInlineEncodingV2I16(Val).has_value();
- case MVT::v2f16:
- return AMDGPU::getInlineEncodingV2F16(Val).has_value();
- case MVT::v2bf16:
- return AMDGPU::getInlineEncodingV2BF16(Val).has_value();
- }
- }
- if ((Size == 32 && AMDGPU::isInlinableLiteral32(Val, HasInv2Pi)) ||
- (Size == 64 && AMDGPU::isInlinableLiteral64(Val, HasInv2Pi)))
- return true;
- return false;
-}
-
-static int getAlignedAGPRClassID(unsigned UnalignedClassID) {
- switch (UnalignedClassID) {
- case AMDGPU::VReg_64RegClassID:
- return AMDGPU::VReg_64_Align2RegClassID;
- case AMDGPU::VReg_96RegClassID:
- return AMDGPU::VReg_96_Align2RegClassID;
- case AMDGPU::VReg_128RegClassID:
- return AMDGPU::VReg_128_Align2RegClassID;
- case AMDGPU::VReg_160RegClassID:
- return AMDGPU::VReg_160_Align2RegClassID;
- case AMDGPU::VReg_192RegClassID:
- return AMDGPU::VReg_192_Align2RegClassID;
- case AMDGPU::VReg_224RegClassID:
- return AMDGPU::VReg_224_Align2RegClassID;
- case AMDGPU::VReg_256RegClassID:
- return AMDGPU::VReg_256_Align2RegClassID;
- case AMDGPU::VReg_288RegClassID:
- return AMDGPU::VReg_288_Align2RegClassID;
- case AMDGPU::VReg_320RegClassID:
- return AMDGPU::VReg_320_Align2RegClassID;
- case AMDGPU::VReg_352RegClassID:
- return AMDGPU::VReg_352_Align2RegClassID;
- case AMDGPU::VReg_384RegClassID:
- return AMDGPU::VReg_384_Align2RegClassID;
- case AMDGPU::VReg_512RegClassID:
- return AMDGPU::VReg_512_Align2RegClassID;
- case AMDGPU::VReg_1024RegClassID:
- return AMDGPU::VReg_1024_Align2RegClassID;
- case AMDGPU::AReg_64RegClassID:
- return AMDGPU::AReg_64_Align2RegClassID;
- case AMDGPU::AReg_96RegClassID:
- return AMDGPU::AReg_96_Align2RegClassID;
- case AMDGPU::AReg_128RegClassID:
- return AMDGPU::AReg_128_Align2RegClassID;
- case AMDGPU::AReg_160RegClassID:
- return AMDGPU::AReg_160_Align2RegClassID;
- case AMDGPU::AReg_192RegClassID:
- return AMDGPU::AReg_192_Align2RegClassID;
- case AMDGPU::AReg_256RegClassID:
- return AMDGPU::AReg_256_Align2RegClassID;
- case AMDGPU::AReg_512RegClassID:
- return AMDGPU::AReg_512_Align2RegClassID;
- case AMDGPU::AReg_1024RegClassID:
- return AMDGPU::AReg_1024_Align2RegClassID;
- default:
- return -1;
- }
-}
-
-// Figure out which registers should be reserved for stack access. Only after
-// the function is legalized do we know all of the non-spill stack objects or if
-// calls are present.
-void SITargetLowering::finalizeLowering(MachineFunction &MF) const {
- MachineRegisterInfo &MRI = MF.getRegInfo();
- SIMachineFunctionInfo *Info = MF.getInfo<SIMachineFunctionInfo>();
- const GCNSubtarget &ST = MF.getSubtarget<GCNSubtarget>();
- const SIRegisterInfo *TRI = Subtarget->getRegisterInfo();
- const SIInstrInfo *TII = ST.getInstrInfo();
-
- if (Info->isEntryFunction()) {
- // Callable functions have fixed registers used for stack access.
- reservePrivateMemoryRegs(getTargetMachine(), MF, *TRI, *Info);
- }
-
- // TODO: Move this logic to getReservedRegs()
- // Reserve the SGPR(s) to save/restore EXEC for WWM spill/copy handling.
- unsigned MaxNumSGPRs = ST.getMaxNumSGPRs(MF);
- Register SReg = ST.isWave32()
- ? AMDGPU::SGPR_32RegClass.getRegister(MaxNumSGPRs - 1)
- : TRI->getAlignedHighSGPRForRC(MF, /*Align=*/2,
- &AMDGPU::SGPR_64RegClass);
- Info->setSGPRForEXECCopy(SReg);
-
- assert(!TRI->isSubRegister(Info->getScratchRSrcReg(),
- Info->getStackPtrOffsetReg()));
- if (Info->getStackPtrOffsetReg() != AMDGPU::SP_REG)
- MRI.replaceRegWith(AMDGPU::SP_REG, Info->getStackPtrOffsetReg());
-
- // We need to worry about replacing the default register with itself in case
- // of MIR testcases missing the MFI.
- if (Info->getScratchRSrcReg() != AMDGPU::PRIVATE_RSRC_REG)
- MRI.replaceRegWith(AMDGPU::PRIVATE_RSRC_REG, Info->getScratchRSrcReg());
-
- if (Info->getFrameOffsetReg() != AMDGPU::FP_REG)
- MRI.replaceRegWith(AMDGPU::FP_REG, Info->getFrameOffsetReg());
-
- Info->limitOccupancy(MF);
-
- if (ST.isWave32() && !MF.empty()) {
- for (auto &MBB : MF) {
- for (auto &MI : MBB) {
- TII->fixImplicitOperands(MI);
- }
- }
- }
-
- // FIXME: This is a hack to fixup AGPR classes to use the properly aligned
- // classes if required. Ideally the register class constraints would differ
- // per-subtarget, but there's no easy way to achieve that right now. This is
- // not a problem for VGPRs because the correctly aligned VGPR class is implied
- // from using them as the register class for legal types.
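- //
- // For example (illustrative): a virtual register in VReg_64 is retyped to
- // VReg_64_Align2 so that it is only assigned even-aligned VGPR pairs.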
- if (ST.needsAlignedVGPRs()) {
- for (unsigned I = 0, E = MRI.getNumVirtRegs(); I != E; ++I) {
- const Register Reg = Register::index2VirtReg(I);
- const TargetRegisterClass *RC = MRI.getRegClassOrNull(Reg);
- if (!RC)
- continue;
- int NewClassID = getAlignedAGPRClassID(RC->getID());
- if (NewClassID != -1)
- MRI.setRegClass(Reg, TRI->getRegClass(NewClassID));
- }
- }
-
- TargetLoweringBase::finalizeLowering(MF);
-}
-
-void SITargetLowering::computeKnownBitsForTargetNode(const SDValue Op,
- KnownBits &Known,
- const APInt &DemandedElts,
- const SelectionDAG &DAG,
- unsigned Depth) const {
- Known.resetAll();
- unsigned Opc = Op.getOpcode();
- switch (Opc) {
- case ISD::INTRINSIC_WO_CHAIN: {
- unsigned IID = Op.getConstantOperandVal(0);
- switch (IID) {
- case Intrinsic::amdgcn_mbcnt_lo:
- case Intrinsic::amdgcn_mbcnt_hi: {
- const GCNSubtarget &ST =
- DAG.getMachineFunction().getSubtarget<GCNSubtarget>();
- // Wave64 mbcnt_lo returns at most 32 + src1. Otherwise these return at
- // most 31 + src1.
- Known.Zero.setBitsFrom(
- IID == Intrinsic::amdgcn_mbcnt_lo ? ST.getWavefrontSizeLog2() : 5);
- KnownBits Known2 = DAG.computeKnownBits(Op.getOperand(2), Depth + 1);
- Known = KnownBits::add(Known, Known2);
- return;
- }
- }
- break;
- }
- }
- return AMDGPUTargetLowering::computeKnownBitsForTargetNode(
- Op, Known, DemandedElts, DAG, Depth);
-}
-
-void SITargetLowering::computeKnownBitsForFrameIndex(
- const int FI, KnownBits &Known, const MachineFunction &MF) const {
- TargetLowering::computeKnownBitsForFrameIndex(FI, Known, MF);
-
- // Set the high bits to zero based on the maximum allowed scratch size per
- // wave. We can't use vaddr in MUBUF instructions if we don't know the address
- // calculation won't overflow, so assume the sign bit is never set.
- Known.Zero.setHighBits(getSubtarget()->getKnownHighZeroBitsForFrameIndex());
-}
-
-static void knownBitsForWorkitemID(const GCNSubtarget &ST,
- GISelValueTracking &VT, KnownBits &Known,
- unsigned Dim) {
- unsigned MaxValue =
- ST.getMaxWorkitemID(VT.getMachineFunction().getFunction(), Dim);
- Known.Zero.setHighBits(llvm::countl_zero(MaxValue));
-}
-
-void SITargetLowering::computeKnownBitsForTargetInstr(
- GISelValueTracking &VT, Register R, KnownBits &Known,
- const APInt &DemandedElts, const MachineRegisterInfo &MRI,
- unsigned Depth) const {
- const MachineInstr *MI = MRI.getVRegDef(R);
- switch (MI->getOpcode()) {
- case AMDGPU::G_INTRINSIC:
- case AMDGPU::G_INTRINSIC_CONVERGENT: {
- Intrinsic::ID IID = cast<GIntrinsic>(MI)->getIntrinsicID();
- switch (IID) {
- case Intrinsic::amdgcn_workitem_id_x:
- knownBitsForWorkitemID(*getSubtarget(), VT, Known, 0);
- break;
- case Intrinsic::amdgcn_workitem_id_y:
- knownBitsForWorkitemID(*getSubtarget(), VT, Known, 1);
- break;
- case Intrinsic::amdgcn_workitem_id_z:
- knownBitsForWorkitemID(*getSubtarget(), VT, Known, 2);
- break;
- case Intrinsic::amdgcn_mbcnt_lo:
- case Intrinsic::amdgcn_mbcnt_hi: {
- // Wave64 mbcnt_lo returns at most 32 + src1. Otherwise these return at
- // most 31 + src1.
- Known.Zero.setBitsFrom(IID == Intrinsic::amdgcn_mbcnt_lo
- ? getSubtarget()->getWavefrontSizeLog2()
- : 5);
- KnownBits Known2;
- VT.computeKnownBitsImpl(MI->getOperand(3).getReg(), Known2, DemandedElts,
- Depth + 1);
- Known = KnownBits::add(Known, Known2);
- break;
- }
- case Intrinsic::amdgcn_groupstaticsize: {
- // We can report every bit above the maximum addressable LDS size as 0. We
- // can't report based on the actual size because we don't know if it's
- // accurate at any given point.
- Known.Zero.setHighBits(
- llvm::countl_zero(getSubtarget()->getAddressableLocalMemorySize()));
- break;
- }
- }
- break;
- }
- case AMDGPU::G_AMDGPU_BUFFER_LOAD_UBYTE:
- Known.Zero.setHighBits(24);
- break;
- case AMDGPU::G_AMDGPU_BUFFER_LOAD_USHORT:
- Known.Zero.setHighBits(16);
- break;
- case AMDGPU::G_AMDGPU_SMED3:
- case AMDGPU::G_AMDGPU_UMED3: {
- auto [Dst, Src0, Src1, Src2] = MI->getFirst4Regs();
-
- KnownBits Known2;
- VT.computeKnownBitsImpl(Src2, Known2, DemandedElts, Depth + 1);
- if (Known2.isUnknown())
- break;
-
- KnownBits Known1;
- VT.computeKnownBitsImpl(Src1, Known1, DemandedElts, Depth + 1);
- if (Known1.isUnknown())
- break;
-
- KnownBits Known0;
- VT.computeKnownBitsImpl(Src0, Known0, DemandedElts, Depth + 1);
- if (Known0.isUnknown())
- break;
-
- // TODO: Handle LeadZero/LeadOne from UMIN/UMAX handling.
- Known.Zero = Known0.Zero & Known1.Zero & Known2.Zero;
- Known.One = Known0.One & Known1.One & Known2.One;
- break;
- }
- }
-}
-
-Align SITargetLowering::computeKnownAlignForTargetInstr(
- GISelValueTracking &VT, Register R, const MachineRegisterInfo &MRI,
- unsigned Depth) const {
- const MachineInstr *MI = MRI.getVRegDef(R);
- if (auto *GI = dyn_cast<GIntrinsic>(MI)) {
- // FIXME: Can this move to generic code? What about the case where the call
- // site specifies a lower alignment?
- Intrinsic::ID IID = GI->getIntrinsicID();
- LLVMContext &Ctx = VT.getMachineFunction().getFunction().getContext();
- AttributeList Attrs =
- Intrinsic::getAttributes(Ctx, IID, Intrinsic::getType(Ctx, IID));
- if (MaybeAlign RetAlign = Attrs.getRetAlignment())
- return *RetAlign;
- }
- return Align(1);
-}
-
-Align SITargetLowering::getPrefLoopAlignment(MachineLoop *ML) const {
- const Align PrefAlign = TargetLowering::getPrefLoopAlignment(ML);
- const Align CacheLineAlign = Align(64);
-
- // Pre-GFX10 targets did not benefit from loop alignment
- if (!ML || DisableLoopAlignment || !getSubtarget()->hasInstPrefetch() ||
- getSubtarget()->hasInstFwdPrefetchBug())
- return PrefAlign;
-
- // On GFX10 the I$ consists of 4 x 64-byte cache lines.
- // By default the prefetcher keeps one cache line behind and reads two ahead.
- // We can modify this with S_INST_PREFETCH so that larger loops keep two
- // lines behind and one ahead.
- // Therefore we can benefit from aligning loop headers if the loop fits in
- // 192 bytes.
- // If the loop fits in 64 bytes it always spans no more than two cache lines
- // and does not need alignment.
- // Otherwise, if the loop is at most 128 bytes we do not need to modify the
- // prefetch settings; if it is at most 192 bytes we need two lines behind.
-
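- // For example (illustrative): a 100-byte loop only gets its header aligned
- // to the cache line (at most 128 bytes, no prefetch change), while a
- // 160-byte loop is additionally bracketed with S_INST_PREFETCH in its
- // preheader and exit block.
-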
- const SIInstrInfo *TII = getSubtarget()->getInstrInfo();
- const MachineBasicBlock *Header = ML->getHeader();
- if (Header->getAlignment() != PrefAlign)
- return Header->getAlignment(); // Already processed.
-
- unsigned LoopSize = 0;
- for (const MachineBasicBlock *MBB : ML->blocks()) {
- // If an inner loop block is aligned, assume on average half of the
- // alignment size is added as nops.
- if (MBB != Header)
- LoopSize += MBB->getAlignment().value() / 2;
-
- for (const MachineInstr &MI : *MBB) {
- LoopSize += TII->getInstSizeInBytes(MI);
- if (LoopSize > 192)
- return PrefAlign;
- }
- }
-
- if (LoopSize <= 64)
- return PrefAlign;
-
- if (LoopSize <= 128)
- return CacheLineAlign;
-
- // If any of the parent loops is surrounded by prefetch instructions, do not
- // insert new ones for the inner loop; that would reset the parent's settings.
- for (MachineLoop *P = ML->getParentLoop(); P; P = P->getParentLoop()) {
- if (MachineBasicBlock *Exit = P->getExitBlock()) {
- auto I = Exit->getFirstNonDebugInstr();
- if (I != Exit->end() && I->getOpcode() == AMDGPU::S_INST_PREFETCH)
- return CacheLineAlign;
- }
- }
-
- MachineBasicBlock *Pre = ML->getLoopPreheader();
- MachineBasicBlock *Exit = ML->getExitBlock();
-
- if (Pre && Exit) {
- auto PreTerm = Pre->getFirstTerminator();
- if (PreTerm == Pre->begin() ||
- std::prev(PreTerm)->getOpcode() != AMDGPU::S_INST_PREFETCH)
- BuildMI(*Pre, PreTerm, DebugLoc(), TII->get(AMDGPU::S_INST_PREFETCH))
- .addImm(1); // prefetch 2 lines behind PC
-
- auto ExitHead = Exit->getFirstNonDebugInstr();
- if (ExitHead == Exit->end() ||
- ExitHead->getOpcode() != AMDGPU::S_INST_PREFETCH)
- BuildMI(*Exit, ExitHead, DebugLoc(), TII->get(AMDGPU::S_INST_PREFETCH))
- .addImm(2); // prefetch 1 line behind PC
- }
-
- return CacheLineAlign;
-}
-
-LLVM_ATTRIBUTE_UNUSED
-static bool isCopyFromRegOfInlineAsm(const SDNode *N) {
- assert(N->getOpcode() == ISD::CopyFromReg);
- do {
- // Follow the chain until we find an INLINEASM node.
- N = N->getOperand(0).getNode();
- if (N->getOpcode() == ISD::INLINEASM || N->getOpcode() == ISD::INLINEASM_BR)
- return true;
- } while (N->getOpcode() == ISD::CopyFromReg);
- return false;
-}
-
-bool SITargetLowering::isSDNodeSourceOfDivergence(const SDNode *N,
- FunctionLoweringInfo *FLI,
- UniformityInfo *UA) const {
- switch (N->getOpcode()) {
- case ISD::CopyFromReg: {
- const RegisterSDNode *R = cast<RegisterSDNode>(N->getOperand(1));
- const MachineRegisterInfo &MRI = FLI->MF->getRegInfo();
- const SIRegisterInfo *TRI = Subtarget->getRegisterInfo();
- Register Reg = R->getReg();
-
- // FIXME: Why does this need to consider isLiveIn?
- if (Reg.isPhysical() || MRI.isLiveIn(Reg))
- return !TRI->isSGPRReg(MRI, Reg);
-
- if (const Value *V = FLI->getValueFromVirtualReg(R->getReg()))
- return UA->isDivergent(V);
-
- assert(Reg == FLI->DemoteRegister || isCopyFromRegOfInlineAsm(N));
- return !TRI->isSGPRReg(MRI, Reg);
- }
- case ISD::LOAD: {
- const LoadSDNode *L = cast<LoadSDNode>(N);
- unsigned AS = L->getAddressSpace();
- // A flat load may access private memory.
- return AS == AMDGPUAS::PRIVATE_ADDRESS || AS == AMDGPUAS::FLAT_ADDRESS;
- }
- case ISD::CALLSEQ_END:
- return true;
- case ISD::INTRINSIC_WO_CHAIN:
- return AMDGPU::isIntrinsicSourceOfDivergence(N->getConstantOperandVal(0));
- case ISD::INTRINSIC_W_CHAIN:
- return AMDGPU::isIntrinsicSourceOfDivergence(N->getConstantOperandVal(1));
- case AMDGPUISD::ATOMIC_CMP_SWAP:
- case AMDGPUISD::BUFFER_ATOMIC_SWAP:
- case AMDGPUISD::BUFFER_ATOMIC_ADD:
- case AMDGPUISD::BUFFER_ATOMIC_SUB:
- case AMDGPUISD::BUFFER_ATOMIC_SMIN:
- case AMDGPUISD::BUFFER_ATOMIC_UMIN:
- case AMDGPUISD::BUFFER_ATOMIC_SMAX:
- case AMDGPUISD::BUFFER_ATOMIC_UMAX:
- case AMDGPUISD::BUFFER_ATOMIC_AND:
- case AMDGPUISD::BUFFER_ATOMIC_OR:
- case AMDGPUISD::BUFFER_ATOMIC_XOR:
- case AMDGPUISD::BUFFER_ATOMIC_INC:
- case AMDGPUISD::BUFFER_ATOMIC_DEC:
- case AMDGPUISD::BUFFER_ATOMIC_CMPSWAP:
- case AMDGPUISD::BUFFER_ATOMIC_CSUB:
- case AMDGPUISD::BUFFER_ATOMIC_FADD:
- case AMDGPUISD::BUFFER_ATOMIC_FMIN:
- case AMDGPUISD::BUFFER_ATOMIC_FMAX:
- // Target-specific read-modify-write atomics are sources of divergence.
- return true;
- default:
- if (auto *A = dyn_cast<AtomicSDNode>(N)) {
- // Generic read-modify-write atomics are sources of divergence.
- return A->readMem() && A->writeMem();
- }
- return false;
- }
-}
-
-bool SITargetLowering::denormalsEnabledForType(const SelectionDAG &DAG,
- EVT VT) const {
- switch (VT.getScalarType().getSimpleVT().SimpleTy) {
- case MVT::f32:
- return !denormalModeIsFlushAllF32(DAG.getMachineFunction());
- case MVT::f64:
- case MVT::f16:
- return !denormalModeIsFlushAllF64F16(DAG.getMachineFunction());
- default:
- return false;
- }
-}
-
-bool SITargetLowering::denormalsEnabledForType(
- LLT Ty, const MachineFunction &MF) const {
- switch (Ty.getScalarSizeInBits()) {
- case 32:
- return !denormalModeIsFlushAllF32(MF);
- case 64:
- case 16:
- return !denormalModeIsFlushAllF64F16(MF);
- default:
- return false;
- }
-}
-
-bool SITargetLowering::isKnownNeverNaNForTargetNode(SDValue Op,
- const APInt &DemandedElts,
- const SelectionDAG &DAG,
- bool SNaN,
- unsigned Depth) const {
- if (Op.getOpcode() == AMDGPUISD::CLAMP) {
- const MachineFunction &MF = DAG.getMachineFunction();
- const SIMachineFunctionInfo *Info = MF.getInfo<SIMachineFunctionInfo>();
-
- if (Info->getMode().DX10Clamp)
- return true; // Clamped to 0.
- return DAG.isKnownNeverNaN(Op.getOperand(0), SNaN, Depth + 1);
- }
-
- return AMDGPUTargetLowering::isKnownNeverNaNForTargetNode(Op, DemandedElts,
- DAG, SNaN, Depth);
-}
-
-// On older subtargets, global FP atomic instructions have a hardcoded FP mode;
-// they do not support FP32 denormals and only support v2f16/f64 denormals.
-static bool atomicIgnoresDenormalModeOrFPModeIsFTZ(const AtomicRMWInst *RMW) {
- if (RMW->hasMetadata("amdgpu.ignore.denormal.mode"))
- return true;
-
- const fltSemantics &Flt = RMW->getType()->getScalarType()->getFltSemantics();
- auto DenormMode = RMW->getFunction()->getDenormalMode(Flt);
- if (DenormMode == DenormalMode::getPreserveSign())
- return true;
-
- // TODO: Remove this.
- return RMW->getFunction()
- ->getFnAttribute("amdgpu-unsafe-fp-atomics")
- .getValueAsBool();
-}
-
-static OptimizationRemark emitAtomicRMWLegalRemark(const AtomicRMWInst *RMW) {
- LLVMContext &Ctx = RMW->getContext();
- StringRef MemScope =
- Ctx.getSyncScopeName(RMW->getSyncScopeID()).value_or("system");
-
- return OptimizationRemark(DEBUG_TYPE, "Passed", RMW)
- << "Hardware instruction generated for atomic "
- << RMW->getOperationName(RMW->getOperation())
- << " operation at memory scope " << MemScope;
-}
-
-static bool isV2F16OrV2BF16(Type *Ty) {
- if (auto *VT = dyn_cast<FixedVectorType>(Ty)) {
- Type *EltTy = VT->getElementType();
- return VT->getNumElements() == 2 &&
- (EltTy->isHalfTy() || EltTy->isBFloatTy());
- }
-
- return false;
-}
-
-static bool isV2F16(Type *Ty) {
- FixedVectorType *VT = dyn_cast<FixedVectorType>(Ty);
- return VT && VT->getNumElements() == 2 && VT->getElementType()->isHalfTy();
-}
-
-static bool isV2BF16(Type *Ty) {
- FixedVectorType *VT = dyn_cast<FixedVectorType>(Ty);
- return VT && VT->getNumElements() == 2 && VT->getElementType()->isBFloatTy();
-}
-
-/// \return true if atomicrmw integer ops work for the type.
-static bool isAtomicRMWLegalIntTy(Type *Ty) {
- if (auto *IT = dyn_cast<IntegerType>(Ty)) {
- unsigned BW = IT->getBitWidth();
- return BW == 32 || BW == 64;
- }
-
- return false;
-}
-
-/// \return true if this atomicrmw xchg type can be selected.
-static bool isAtomicRMWLegalXChgTy(const AtomicRMWInst *RMW) {
- Type *Ty = RMW->getType();
- if (isAtomicRMWLegalIntTy(Ty))
- return true;
-
- if (PointerType *PT = dyn_cast<PointerType>(Ty)) {
- const DataLayout &DL = RMW->getFunction()->getParent()->getDataLayout();
- unsigned BW = DL.getPointerSizeInBits(PT->getAddressSpace());
- return BW == 32 || BW == 64;
- }
-
- if (Ty->isFloatTy() || Ty->isDoubleTy())
- return true;
-
- if (FixedVectorType *VT = dyn_cast<FixedVectorType>(Ty)) {
- return VT->getNumElements() == 2 &&
- VT->getElementType()->getPrimitiveSizeInBits() == 16;
- }
-
- return false;
-}
-
-/// \returns true if it's valid to emit a native instruction for \p RMW, based
-/// on the properties of the target memory.
-static bool globalMemoryFPAtomicIsLegal(const GCNSubtarget &Subtarget,
- const AtomicRMWInst *RMW,
- bool HasSystemScope) {
- // The remote/fine-grained access logic is different from the integer
- // atomics. Without AgentScopeFineGrainedRemoteMemoryAtomics support,
- // fine-grained access does not work, even for a device local allocation.
- //
- // With AgentScopeFineGrainedRemoteMemoryAtomics, system scoped device local
- // allocations work.
- if (HasSystemScope) {
- if (Subtarget.supportsAgentScopeFineGrainedRemoteMemoryAtomics() &&
- RMW->hasMetadata("amdgpu.no.remote.memory"))
- return true;
- } else if (Subtarget.supportsAgentScopeFineGrainedRemoteMemoryAtomics())
- return true;
-
- return RMW->hasMetadata("amdgpu.no.fine.grained.memory");
-}
-
-/// \return Action to perform on AtomicRMWInsts for integer operations.
-static TargetLowering::AtomicExpansionKind
-atomicSupportedIfLegalIntType(const AtomicRMWInst *RMW) {
- return isAtomicRMWLegalIntTy(RMW->getType())
- ? TargetLowering::AtomicExpansionKind::None
- : TargetLowering::AtomicExpansionKind::CmpXChg;
-}
-
-/// Return if a flat address space atomicrmw can access private memory.
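-/// For example (illustrative), an instruction tagged with
-/// !noalias.addrspace !{i32 5, i32 6} is known not to access the private
-/// address space (AS 5), so this returns false for it.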
-static bool flatInstrMayAccessPrivate(const Instruction *I) {
- const MDNode *NoaliasAddrSpaceMD =
- I->getMetadata(LLVMContext::MD_noalias_addrspace);
- if (!NoaliasAddrSpaceMD)
- return true;
-
- for (unsigned I = 0, E = NoaliasAddrSpaceMD->getNumOperands() / 2; I != E;
- ++I) {
- auto *Low = mdconst::extract<ConstantInt>(
- NoaliasAddrSpaceMD->getOperand(2 * I + 0));
- if (Low->getValue().uge(AMDGPUAS::PRIVATE_ADDRESS)) {
- auto *High = mdconst::extract<ConstantInt>(
- NoaliasAddrSpaceMD->getOperand(2 * I + 1));
- return High->getValue().ule(AMDGPUAS::PRIVATE_ADDRESS);
- }
- }
-
- return true;
-}
-
-TargetLowering::AtomicExpansionKind
-SITargetLowering::shouldExpandAtomicRMWInIR(AtomicRMWInst *RMW) const {
- unsigned AS = RMW->getPointerAddressSpace();
- if (AS == AMDGPUAS::PRIVATE_ADDRESS)
- return AtomicExpansionKind::NotAtomic;
-
- // 64-bit flat atomics that dynamically reside in private memory will silently
- // be dropped.
- //
- // Note that we will emit a new copy of the original atomic in the expansion,
- // which will be incrementally relegalized.
- const DataLayout &DL = RMW->getFunction()->getDataLayout();
- if (AS == AMDGPUAS::FLAT_ADDRESS &&
- DL.getTypeSizeInBits(RMW->getType()) == 64 &&
- flatInstrMayAccessPrivate(RMW))
- return AtomicExpansionKind::Expand;
-
- auto ReportUnsafeHWInst = [=](TargetLowering::AtomicExpansionKind Kind) {
- OptimizationRemarkEmitter ORE(RMW->getFunction());
- ORE.emit([=]() {
- return emitAtomicRMWLegalRemark(RMW) << " due to an unsafe request.";
- });
- return Kind;
- };
-
- auto SSID = RMW->getSyncScopeID();
- bool HasSystemScope =
- SSID == SyncScope::System ||
- SSID == RMW->getContext().getOrInsertSyncScopeID("one-as");
-
- auto Op = RMW->getOperation();
- switch (Op) {
- case AtomicRMWInst::Xchg: {
- // PCIe supports add and xchg for system atomics.
- return isAtomicRMWLegalXChgTy(RMW)
- ? TargetLowering::AtomicExpansionKind::None
- : TargetLowering::AtomicExpansionKind::CmpXChg;
- }
- case AtomicRMWInst::Add:
- case AtomicRMWInst::And:
- case AtomicRMWInst::UIncWrap:
- case AtomicRMWInst::UDecWrap:
- return atomicSupportedIfLegalIntType(RMW);
- case AtomicRMWInst::Sub:
- case AtomicRMWInst::Or:
- case AtomicRMWInst::Xor: {
- // Atomic sub/or/xor do not work over PCI express, but atomic add
- // does. InstCombine transforms these with 0 to or, so undo that.
- if (HasSystemScope && AMDGPU::isFlatGlobalAddrSpace(AS)) {
- if (Constant *ConstVal = dyn_cast<Constant>(RMW->getValOperand());
- ConstVal && ConstVal->isNullValue())
- return AtomicExpansionKind::Expand;
- }
-
- return atomicSupportedIfLegalIntType(RMW);
- }
- case AtomicRMWInst::FAdd: {
- Type *Ty = RMW->getType();
-
- // TODO: Handle REGION_ADDRESS
- if (AS == AMDGPUAS::LOCAL_ADDRESS) {
- // DS F32 FP atomics do respect the denormal mode, but the rounding mode
- // is fixed to round-to-nearest-even.
- //
- // F64 / PK_F16 / PK_BF16 never flush and are also fixed to
- // round-to-nearest-even.
- //
- // We ignore the rounding mode problem, even in strictfp. The C++ standard
- // suggests it is OK if the floating-point mode may not match the calling
- // thread.
- if (Ty->isFloatTy()) {
- return Subtarget->hasLDSFPAtomicAddF32() ? AtomicExpansionKind::None
- : AtomicExpansionKind::CmpXChg;
- }
-
- if (Ty->isDoubleTy()) {
- // Ignores denormal mode, but we don't consider flushing mandatory.
- return Subtarget->hasLDSFPAtomicAddF64() ? AtomicExpansionKind::None
- : AtomicExpansionKind::CmpXChg;
- }
-
- if (Subtarget->hasAtomicDsPkAdd16Insts() && isV2F16OrV2BF16(Ty))
- return AtomicExpansionKind::None;
-
- return AtomicExpansionKind::CmpXChg;
- }
-
- // LDS atomics respect the denormal mode from the mode register.
- //
- // Traditionally f32 global/buffer memory atomics would unconditionally
- // flush denormals, but newer targets do not flush. f64/f16/bf16 cases never
- // flush.
- //
- // On targets with flat atomic fadd, denormals would flush depending on
- // whether the target address resides in LDS or global memory. We consider
- // this flat-maybe-flush as will-flush.
- if (Ty->isFloatTy() &&
- !Subtarget->hasMemoryAtomicFaddF32DenormalSupport() &&
- !atomicIgnoresDenormalModeOrFPModeIsFTZ(RMW))
- return AtomicExpansionKind::CmpXChg;
-
- // FIXME: These ReportUnsafeHWInsts are imprecise. Some of these cases are
- // safe. The message phrasing also should be better.
- if (globalMemoryFPAtomicIsLegal(*Subtarget, RMW, HasSystemScope)) {
- if (AS == AMDGPUAS::FLAT_ADDRESS) {
- // gfx942, gfx12
- if (Subtarget->hasAtomicFlatPkAdd16Insts() && isV2F16OrV2BF16(Ty))
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- } else if (AMDGPU::isExtendedGlobalAddrSpace(AS)) {
- // gfx90a, gfx942, gfx12
- if (Subtarget->hasAtomicBufferGlobalPkAddF16Insts() && isV2F16(Ty))
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
-
- // gfx942, gfx12
- if (Subtarget->hasAtomicGlobalPkAddBF16Inst() && isV2BF16(Ty))
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- } else if (AS == AMDGPUAS::BUFFER_FAT_POINTER) {
- // gfx90a, gfx942, gfx12
- if (Subtarget->hasAtomicBufferGlobalPkAddF16Insts() && isV2F16(Ty))
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
-
- // While gfx90a/gfx942 supports v2bf16 for global/flat, it does not for
- // buffer. gfx12 does have the buffer version.
- if (Subtarget->hasAtomicBufferPkAddBF16Inst() && isV2BF16(Ty))
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- }
-
- // global and flat atomic fadd f64: gfx90a, gfx942.
- if (Subtarget->hasFlatBufferGlobalAtomicFaddF64Inst() && Ty->isDoubleTy())
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
-
- if (AS != AMDGPUAS::FLAT_ADDRESS) {
- if (Ty->isFloatTy()) {
- // global/buffer atomic fadd f32 no-rtn: gfx908, gfx90a, gfx942,
- // gfx11+.
- if (RMW->use_empty() && Subtarget->hasAtomicFaddNoRtnInsts())
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- // global/buffer atomic fadd f32 rtn: gfx90a, gfx942, gfx11+.
- if (!RMW->use_empty() && Subtarget->hasAtomicFaddRtnInsts())
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- } else {
- // gfx908
- if (RMW->use_empty() &&
- Subtarget->hasAtomicBufferGlobalPkAddF16NoRtnInsts() &&
- isV2F16(Ty))
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- }
- }
-
- // flat atomic fadd f32: gfx942, gfx11+.
- if (AS == AMDGPUAS::FLAT_ADDRESS && Ty->isFloatTy()) {
- if (Subtarget->hasFlatAtomicFaddF32Inst())
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
-
- // If the atomic is in the flat address space and the type is float, we try
- // to expand it if the target supports both global and LDS atomic fadd. We
- // need both because the expansion emits an address-space check: if the
- // address is in the global address space, we emit the global atomic fadd;
- // if it is in the shared address space, we emit the LDS atomic fadd.
- if (Subtarget->hasLDSFPAtomicAddF32()) {
- if (RMW->use_empty() && Subtarget->hasAtomicFaddNoRtnInsts())
- return AtomicExpansionKind::Expand;
- if (!RMW->use_empty() && Subtarget->hasAtomicFaddRtnInsts())
- return AtomicExpansionKind::Expand;
- }
- }
- }
-
- return AtomicExpansionKind::CmpXChg;
- }
- case AtomicRMWInst::FMin:
- case AtomicRMWInst::FMax: {
- Type *Ty = RMW->getType();
-
- // LDS float and double fmin/fmax were always supported.
- if (AS == AMDGPUAS::LOCAL_ADDRESS) {
- return Ty->isFloatTy() || Ty->isDoubleTy() ? AtomicExpansionKind::None
- : AtomicExpansionKind::CmpXChg;
- }
-
- if (globalMemoryFPAtomicIsLegal(*Subtarget, RMW, HasSystemScope)) {
- // For flat and global cases:
- // float, double in gfx7. Manual claims denormal support.
- // Removed in gfx8.
- // float, double restored in gfx10.
- // double removed again in gfx11, so only f32 for gfx11/gfx12.
- //
- // For gfx9, gfx90a and gfx942 support f64 for global (same as fadd), but
- // no f32.
- if (AS == AMDGPUAS::FLAT_ADDRESS) {
- if (Subtarget->hasAtomicFMinFMaxF32FlatInsts() && Ty->isFloatTy())
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- if (Subtarget->hasAtomicFMinFMaxF64FlatInsts() && Ty->isDoubleTy())
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- } else if (AMDGPU::isExtendedGlobalAddrSpace(AS) ||
- AS == AMDGPUAS::BUFFER_FAT_POINTER) {
- if (Subtarget->hasAtomicFMinFMaxF32GlobalInsts() && Ty->isFloatTy())
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- if (Subtarget->hasAtomicFMinFMaxF64GlobalInsts() && Ty->isDoubleTy())
- return ReportUnsafeHWInst(AtomicExpansionKind::None);
- }
- }
-
- return AtomicExpansionKind::CmpXChg;
- }
- case AtomicRMWInst::Min:
- case AtomicRMWInst::Max:
- case AtomicRMWInst::UMin:
- case AtomicRMWInst::UMax: {
- if (AMDGPU::isFlatGlobalAddrSpace(AS) ||
- AS == AMDGPUAS::BUFFER_FAT_POINTER) {
- // Always expand system scope min/max atomics.
- if (HasSystemScope)
- return AtomicExpansionKind::CmpXChg;
- }
-
- return atomicSupportedIfLegalIntType(RMW);
- }
- case AtomicRMWInst::Nand:
- case AtomicRMWInst::FSub:
- default:
- return AtomicExpansionKind::CmpXChg;
- }
-
- llvm_unreachable("covered atomicrmw op switch");
-}
-
-TargetLowering::AtomicExpansionKind
-SITargetLowering::shouldExpandAtomicLoadInIR(LoadInst *LI) const {
- return LI->getPointerAddressSpace() == AMDGPUAS::PRIVATE_ADDRESS
- ? AtomicExpansionKind::NotAtomic
- : AtomicExpansionKind::None;
-}
-
-TargetLowering::AtomicExpansionKind
-SITargetLowering::shouldExpandAtomicStoreInIR(StoreInst *SI) const {
- return SI->getPointerAddressSpace() == AMDGPUAS::PRIVATE_ADDRESS
- ? AtomicExpansionKind::NotAtomic
- : AtomicExpansionKind::None;
-}
-
-TargetLowering::AtomicExpansionKind
-SITargetLowering::shouldExpandAtomicCmpXchgInIR(AtomicCmpXchgInst *CmpX) const {
- unsigned AddrSpace = CmpX->getPointerAddressSpace();
- if (AddrSpace == AMDGPUAS::PRIVATE_ADDRESS)
- return AtomicExpansionKind::NotAtomic;
-
- if (AddrSpace != AMDGPUAS::FLAT_ADDRESS || !flatInstrMayAccessPrivate(CmpX))
- return AtomicExpansionKind::None;
-
- const DataLayout &DL = CmpX->getDataLayout();
-
- Type *ValTy = CmpX->getNewValOperand()->getType();
-
- // If a 64-bit flat atomic may alias private, we need to avoid using the
- // atomic in the private case.
- return DL.getTypeSizeInBits(ValTy) == 64 ? AtomicExpansionKind::Expand
- : AtomicExpansionKind::None;
-}
-
-const TargetRegisterClass *
-SITargetLowering::getRegClassFor(MVT VT, bool isDivergent) const {
- const TargetRegisterClass *RC = TargetLoweringBase::getRegClassFor(VT, false);
- const SIRegisterInfo *TRI = Subtarget->getRegisterInfo();
- if (RC == &AMDGPU::VReg_1RegClass && !isDivergent)
- return Subtarget->isWave64() ? &AMDGPU::SReg_64RegClass
- : &AMDGPU::SReg_32RegClass;
- if (!TRI->isSGPRClass(RC) && !isDivergent)
- return TRI->getEquivalentSGPRClass(RC);
- if (TRI->isSGPRClass(RC) && isDivergent)
- return TRI->getEquivalentVGPRClass(RC);
-
- return RC;
-}
-
-// FIXME: This is a workaround for DivergenceAnalysis not understanding always
-// uniform values (as produced by the mask results of control flow intrinsics)
-// used outside of divergent blocks. The phi users need to also be treated as
-// always uniform.
-//
-// FIXME: DA is no longer in-use. Does this still apply to UniformityAnalysis?
-static bool hasCFUser(const Value *V, SmallPtrSet<const Value *, 16> &Visited,
- unsigned WaveSize) {
- // FIXME: We assume we never cast the mask results of a control flow
- // intrinsic.
- // As a compile-time hack, exit early if the type cannot be a consistent
- // wave mask.
- IntegerType *IT = dyn_cast<IntegerType>(V->getType());
- if (!IT || IT->getBitWidth() != WaveSize)
- return false;
-
- if (!isa<Instruction>(V))
- return false;
- if (!Visited.insert(V).second)
- return false;
- bool Result = false;
- for (const auto *U : V->users()) {
- if (const IntrinsicInst *Intrinsic = dyn_cast<IntrinsicInst>(U)) {
- if (V == U->getOperand(1)) {
- switch (Intrinsic->getIntrinsicID()) {
- default:
- Result = false;
- break;
- case Intrinsic::amdgcn_if_break:
- case Intrinsic::amdgcn_if:
- case Intrinsic::amdgcn_else:
- Result = true;
- break;
- }
- }
- if (V == U->getOperand(0)) {
- switch (Intrinsic->getIntrinsicID()) {
- default:
- Result = false;
- break;
- case Intrinsic::amdgcn_end_cf:
- case Intrinsic::amdgcn_loop:
- Result = true;
- break;
- }
- }
- } else {
- Result = hasCFUser(U, Visited, WaveSize);
- }
- if (Result)
- break;
- }
- return Result;
-}
-
-bool SITargetLowering::requiresUniformRegister(MachineFunction &MF,
- const Value *V) const {
- if (const CallInst *CI = dyn_cast<CallInst>(V)) {
- if (CI->isInlineAsm()) {
- // FIXME: This cannot give a correct answer. This should only trigger in
- // the case where inline asm returns mixed SGPR and VGPR results, used
- // outside the defining block. We don't have a specific result to
- // consider, so this assumes if any value is SGPR, the overall register
- // also needs to be SGPR.
- const SIRegisterInfo *SIRI = Subtarget->getRegisterInfo();
- TargetLowering::AsmOperandInfoVector TargetConstraints = ParseConstraints(
- MF.getDataLayout(), Subtarget->getRegisterInfo(), *CI);
- for (auto &TC : TargetConstraints) {
- if (TC.Type == InlineAsm::isOutput) {
- ComputeConstraintToUse(TC, SDValue());
- const TargetRegisterClass *RC =
- getRegForInlineAsmConstraint(SIRI, TC.ConstraintCode,
- TC.ConstraintVT)
- .second;
- if (RC && SIRI->isSGPRClass(RC))
- return true;
- }
- }
- }
- }
- SmallPtrSet<const Value *, 16> Visited;
- return hasCFUser(V, Visited, Subtarget->getWavefrontSize());
-}
-
-bool SITargetLowering::hasMemSDNodeUser(SDNode *N) const {
- for (SDUse &Use : N->uses()) {
- if (MemSDNode *M = dyn_cast<MemSDNode>(Use.getUser())) {
- if (getBasePtrIndex(M) == Use.getOperandNo())
- return true;
- }
- }
- return false;
-}
-
-bool SITargetLowering::isReassocProfitable(SelectionDAG &DAG, SDValue N0,
- SDValue N1) const {
- if (!N0.hasOneUse())
- return false;
- // Preserve the opportunity to keep N0 uniform
- if (N0->isDivergent() || !N1->isDivergent())
- return true;
- // Check if we have a good chance to form the memory access pattern with the
- // base and offset
- return (DAG.isBaseWithConstantOffset(N0) &&
- hasMemSDNodeUser(*N0->user_begin()));
-}
-
-bool SITargetLowering::isReassocProfitable(MachineRegisterInfo &MRI,
- Register N0, Register N1) const {
- return MRI.hasOneNonDBGUse(N0); // FIXME: handle regbanks
-}
-
-MachineMemOperand::Flags
-SITargetLowering::getTargetMMOFlags(const Instruction &I) const {
- // Propagate metadata set by AMDGPUAnnotateUniformValues to the MMO of a load.
- MachineMemOperand::Flags Flags = MachineMemOperand::MONone;
- if (I.getMetadata("amdgpu.noclobber"))
- Flags |= MONoClobber;
- if (I.getMetadata("amdgpu.last.use"))
- Flags |= MOLastUse;
- return Flags;
-}
-
-bool SITargetLowering::checkForPhysRegDependency(
- SDNode *Def, SDNode *User, unsigned Op, const TargetRegisterInfo *TRI,
- const TargetInstrInfo *TII, MCRegister &PhysReg, int &Cost) const {
- if (User->getOpcode() != ISD::CopyToReg)
- return false;
- if (!Def->isMachineOpcode())
- return false;
- MachineSDNode *MDef = dyn_cast<MachineSDNode>(Def);
- if (!MDef)
- return false;
-
- unsigned ResNo = User->getOperand(Op).getResNo();
- if (User->getOperand(Op)->getValueType(ResNo) != MVT::i1)
- return false;
- const MCInstrDesc &II = TII->get(MDef->getMachineOpcode());
- if (II.isCompare() && II.hasImplicitDefOfPhysReg(AMDGPU::SCC)) {
- PhysReg = AMDGPU::SCC;
- const TargetRegisterClass *RC =
- TRI->getMinimalPhysRegClass(PhysReg, Def->getSimpleValueType(ResNo));
- Cost = RC->getCopyCost();
- return true;
- }
- return false;
-}
-
-/// Check if it is profitable to hoist an instruction in the then/else blocks
-/// up into the if block.
-bool SITargetLowering::isProfitableToHoist(Instruction *I) const {
- if (!I->hasOneUse())
- return true;
-
- Instruction *User = I->user_back();
- // TODO: Add more patterns that are not profitable to hoist and
- // handle modifiers such as fabs and fneg
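- //
- // For example (illustrative): an fmul whose single user is an fadd is kept
- // in its block when the pair could form an FMA, so that hoisting does not
- // separate the contraction candidates.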
- switch (I->getOpcode()) {
- case Instruction::FMul: {
- if (User->getOpcode() != Instruction::FSub &&
- User->getOpcode() != Instruction::FAdd)
- return true;
-
- const TargetOptions &Options = getTargetMachine().Options;
-
- return ((!I->hasAllowContract() || !User->hasAllowContract()) &&
- Options.AllowFPOpFusion != FPOpFusion::Fast &&
- !Options.UnsafeFPMath) ||
- !isFMAFasterThanFMulAndFAdd(*I->getFunction(), User->getType());
- }
- default:
- return true;
- }
- return true;
-}
-
-void SITargetLowering::emitExpandAtomicAddrSpacePredicate(
- Instruction *AI) const {
- // Given: atomicrmw fadd ptr %addr, float %val ordering
- //
- // With this expansion we produce the following code:
- // [...]
- // %is.shared = call i1 @llvm.amdgcn.is.shared(ptr %addr)
- // br i1 %is.shared, label %atomicrmw.shared, label %atomicrmw.check.private
- //
- // atomicrmw.shared:
- // %cast.shared = addrspacecast ptr %addr to ptr addrspace(3)
- // %loaded.shared = atomicrmw fadd ptr addrspace(3) %cast.shared,
- // float %val ordering
- // br label %atomicrmw.phi
- //
- // atomicrmw.check.private:
- // %is.private = call i1 @llvm.amdgcn.is.private(ptr %int8ptr)
- // br i1 %is.private, label %atomicrmw.private, label %atomicrmw.global
- //
- // atomicrmw.private:
- // %cast.private = addrspacecast ptr %addr to ptr addrspace(5)
- // %loaded.private = load float, ptr addrspace(5) %cast.private
- // %val.new = fadd float %loaded.private, %val
- // store float %val.new, ptr addrspace(5) %cast.private
- // br label %atomicrmw.phi
- //
- // atomicrmw.global:
- // %cast.global = addrspacecast ptr %addr to ptr addrspace(1)
- // %loaded.global = atomicrmw fadd ptr addrspace(1) %cast.global,
- // float %val ordering
- // br label %atomicrmw.phi
- //
- // atomicrmw.phi:
- // %loaded.phi = phi float [ %loaded.shared, %atomicrmw.shared ],
- // [ %loaded.private, %atomicrmw.private ],
- // [ %loaded.global, %atomicrmw.global ]
- // br label %atomicrmw.end
- //
- // atomicrmw.end:
- // [...]
- //
- //
- // For 64-bit atomics which may reside in private memory, we perform a simpler
- // version that only inserts the private check, and uses the flat operation.
-
- IRBuilder<> Builder(AI);
- LLVMContext &Ctx = Builder.getContext();
-
- auto *RMW = dyn_cast<AtomicRMWInst>(AI);
- const unsigned PtrOpIdx = RMW ? AtomicRMWInst::getPointerOperandIndex()
- : AtomicCmpXchgInst::getPointerOperandIndex();
- Value *Addr = AI->getOperand(PtrOpIdx);
-
- /// TODO: Only need to check private, then emit flat-known-not private (no
- /// need for shared block, or cast to global).
- AtomicCmpXchgInst *CX = dyn_cast<AtomicCmpXchgInst>(AI);
-
- Align Alignment;
- if (RMW)
- Alignment = RMW->getAlign();
- else if (CX)
- Alignment = CX->getAlign();
- else
- llvm_unreachable("unhandled atomic operation");
-
- // FullFlatEmulation is true if we need to issue the private, shared, and
- // global cases.
- //
- // If this is false, we are only dealing with the flat-targeting-private case,
- // where we only insert a check for private and still use the flat instruction
- // for global and shared.
-
- bool FullFlatEmulation =
- RMW && RMW->getOperation() == AtomicRMWInst::FAdd &&
- ((Subtarget->hasAtomicFaddInsts() && RMW->getType()->isFloatTy()) ||
- (Subtarget->hasFlatBufferGlobalAtomicFaddF64Inst() &&
- RMW->getType()->isDoubleTy()));
-
- // If the return value isn't used, do not introduce a false use in the phi.
- bool ReturnValueIsUsed = !AI->use_empty();
-
- BasicBlock *BB = Builder.GetInsertBlock();
- Function *F = BB->getParent();
- BasicBlock *ExitBB =
- BB->splitBasicBlock(Builder.GetInsertPoint(), "atomicrmw.end");
- BasicBlock *SharedBB = nullptr;
-
- BasicBlock *CheckPrivateBB = BB;
- if (FullFlatEmulation) {
- SharedBB = BasicBlock::Create(Ctx, "atomicrmw.shared", F, ExitBB);
- CheckPrivateBB =
- BasicBlock::Create(Ctx, "atomicrmw.check.private", F, ExitBB);
- }
-
- BasicBlock *PrivateBB =
- BasicBlock::Create(Ctx, "atomicrmw.private", F, ExitBB);
- BasicBlock *GlobalBB = BasicBlock::Create(Ctx, "atomicrmw.global", F, ExitBB);
- BasicBlock *PhiBB = BasicBlock::Create(Ctx, "atomicrmw.phi", F, ExitBB);
-
- std::prev(BB->end())->eraseFromParent();
- Builder.SetInsertPoint(BB);
-
- Value *LoadedShared = nullptr;
- if (FullFlatEmulation) {
- CallInst *IsShared = Builder.CreateIntrinsic(Intrinsic::amdgcn_is_shared,
- {Addr}, nullptr, "is.shared");
- Builder.CreateCondBr(IsShared, SharedBB, CheckPrivateBB);
- Builder.SetInsertPoint(SharedBB);
- Value *CastToLocal = Builder.CreateAddrSpaceCast(
- Addr, PointerType::get(Ctx, AMDGPUAS::LOCAL_ADDRESS));
-
- Instruction *Clone = AI->clone();
- Clone->insertInto(SharedBB, SharedBB->end());
- Clone->getOperandUse(PtrOpIdx).set(CastToLocal);
- LoadedShared = Clone;
-
- Builder.CreateBr(PhiBB);
- Builder.SetInsertPoint(CheckPrivateBB);
- }
-
- CallInst *IsPrivate = Builder.CreateIntrinsic(Intrinsic::amdgcn_is_private,
- {Addr}, nullptr, "is.private");
- Builder.CreateCondBr(IsPrivate, PrivateBB, GlobalBB);
-
- Builder.SetInsertPoint(PrivateBB);
-
- Value *CastToPrivate = Builder.CreateAddrSpaceCast(
- Addr, PointerType::get(Ctx, AMDGPUAS::PRIVATE_ADDRESS));
-
- Value *LoadedPrivate;
- if (RMW) {
- LoadedPrivate = Builder.CreateAlignedLoad(
- RMW->getType(), CastToPrivate, RMW->getAlign(), "loaded.private");
-
- Value *NewVal = buildAtomicRMWValue(RMW->getOperation(), Builder,
- LoadedPrivate, RMW->getValOperand());
-
- Builder.CreateAlignedStore(NewVal, CastToPrivate, RMW->getAlign());
- } else {
- auto [ResultLoad, Equal] =
- buildCmpXchgValue(Builder, CastToPrivate, CX->getCompareOperand(),
- CX->getNewValOperand(), CX->getAlign());
-
- Value *Insert = Builder.CreateInsertValue(PoisonValue::get(CX->getType()),
- ResultLoad, 0);
- LoadedPrivate = Builder.CreateInsertValue(Insert, Equal, 1);
- }
-
- Builder.CreateBr(PhiBB);
-
- Builder.SetInsertPoint(GlobalBB);
-
- // Continue using a flat instruction if we only emitted the check for private.
- Instruction *LoadedGlobal = AI;
- if (FullFlatEmulation) {
- Value *CastToGlobal = Builder.CreateAddrSpaceCast(
- Addr, PointerType::get(Ctx, AMDGPUAS::GLOBAL_ADDRESS));
- AI->getOperandUse(PtrOpIdx).set(CastToGlobal);
- }
-
- AI->removeFromParent();
- AI->insertInto(GlobalBB, GlobalBB->end());
-
- // The new atomicrmw may go through another round of legalization later.
- if (!FullFlatEmulation) {
- // We inserted the runtime check already, make sure we do not try to
- // re-expand this.
- // TODO: Should union with any existing metadata.
- MDBuilder MDB(F->getContext());
- MDNode *RangeNotPrivate =
- MDB.createRange(APInt(32, AMDGPUAS::PRIVATE_ADDRESS),
- APInt(32, AMDGPUAS::PRIVATE_ADDRESS + 1));
- LoadedGlobal->setMetadata(LLVMContext::MD_noalias_addrspace,
- RangeNotPrivate);
- }
-
- Builder.CreateBr(PhiBB);
-
- Builder.SetInsertPoint(PhiBB);
-
- if (ReturnValueIsUsed) {
- PHINode *Loaded = Builder.CreatePHI(AI->getType(), 3);
- AI->replaceAllUsesWith(Loaded);
- if (FullFlatEmulation)
- Loaded->addIncoming(LoadedShared, SharedBB);
- Loaded->addIncoming(LoadedPrivate, PrivateBB);
- Loaded->addIncoming(LoadedGlobal, GlobalBB);
- Loaded->takeName(AI);
- }
-
- Builder.CreateBr(ExitBB);
-}
-
-void SITargetLowering::emitExpandAtomicRMW(AtomicRMWInst *AI) const {
- AtomicRMWInst::BinOp Op = AI->getOperation();
-
- if (Op == AtomicRMWInst::Sub || Op == AtomicRMWInst::Or ||
- Op == AtomicRMWInst::Xor) {
- if (const auto *ConstVal = dyn_cast<Constant>(AI->getValOperand());
- ConstVal && ConstVal->isNullValue()) {
- // atomicrmw or %ptr, 0 -> atomicrmw add %ptr, 0
- AI->setOperation(AtomicRMWInst::Add);
-
- // We may still need the private-alias-flat handling below.
-
- // TODO: Skip this for cases where we cannot access remote memory.
- }
- }
-
- // The non-flat expansions should only perform the de-canonicalization of
- // identity values.
- if (AI->getPointerAddressSpace() != AMDGPUAS::FLAT_ADDRESS)
- return;
-
- emitExpandAtomicAddrSpacePredicate(AI);
-}
-
-void SITargetLowering::emitExpandAtomicCmpXchg(AtomicCmpXchgInst *CI) const {
- emitExpandAtomicAddrSpacePredicate(CI);
-}
-
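-// For example (illustrative): an idempotent 'atomicrmw or ptr %p, i32 0
-// acquire' becomes 'load atomic i32, ptr %p acquire', since with a zero
-// operand the store side is a no-op.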
-LoadInst *
-SITargetLowering::lowerIdempotentRMWIntoFencedLoad(AtomicRMWInst *AI) const {
- IRBuilder<> Builder(AI);
- auto Order = AI->getOrdering();
-
- // The optimization removes the store aspect of the atomicrmw. Therefore,
- // the cache must be flushed if the atomic ordering had release semantics.
- // This does not necessarily require a fence; a release fence just happens
- // to perform that flush. Avoid replacing an atomicrmw that has release
- // semantics.
- if (isReleaseOrStronger(Order))
- return nullptr;
-
- LoadInst *LI = Builder.CreateAlignedLoad(
- AI->getType(), AI->getPointerOperand(), AI->getAlign());
- LI->setAtomic(Order, AI->getSyncScopeID());
- LI->copyMetadata(*AI);
- LI->takeName(AI);
- AI->replaceAllUsesWith(LI);
- AI->eraseFromParent();
- return LI;
-}
+ VTSign->getVT() == M
\ No newline at end of file