[llvm] bc20bdb - AMDGPU/GlobalISel: Start rewriting load/store legality rules
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Sat Jun 6 07:00:00 PDT 2020
Author: Matt Arsenault
Date: 2020-06-06T09:59:46-04:00
New Revision: bc20bdb9f968122ef7801ac9124d68953fa0dd5e
URL: https://github.com/llvm/llvm-project/commit/bc20bdb9f968122ef7801ac9124d68953fa0dd5e
DIFF: https://github.com/llvm/llvm-project/commit/bc20bdb9f968122ef7801ac9124d68953fa0dd5e.diff
LOG: AMDGPU/GlobalISel: Start rewriting load/store legality rules
The current set is an incomprehensible mess riddled with ordering
hacks for various limitations in the legalizer at the time of writing,
many of which have been fixed. This takes a very small step in
correcting this.
The core first change is to start checking for fully legal cases
first, rather than trying to figure out all of the actions that could
need to be performed. It's recommended to check the legal cases first
for faster legality checks in the common case. This still has a table
listing some common cases, but it needs measuring whether this really
helps or not.
More significantly, stop trying to allow any arbitrary type with a
legal bitwidth as a legal memory type, and start using the bitcast
legalize action for them. Allowing loads of these weird vector types
produced new burdens we don't need for handling all of the
legalization artifacts. Unlike the SelectionDAG handling, this is
still not casting 64 or 16-bit element vectors to 32-bit
vectors. These cases should still be handled by increasing/decreasing
the number of 16-bit elements. This is primarily to fix 8-bit element
vectors.
Another change is to stop trying to handle the load-widening based on
a higher alignment. We should still do this, but the way it was
handled wasn't really correct. We really need to modify the MMO's size
at the same time, and not just increase the result type. The
LegalizerHelper does not do this, and I think this would really
require a separate WidenMemory action (or to add a memory action
payload to the LegalizeMutation). These will now fail to legalize.
The structure of the legalizer rules makes writing concise rules here
difficult. It would be easier if the same function could answer the
query the query, and report the action to perform at the same
time. Instead these two are split into distinct predicate and action
functions. This is mostly tolerable for other cases, but the
load/store rules get pretty complicated so it's difficult to keep two
versions of these functions in sync.
Added:
Modified:
llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sextload-global.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-zextload-global.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect.mir
Removed:
################################################################################
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h b/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
index b3c7b6842258..cc549c4e9440 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
@@ -578,6 +578,15 @@ class LegalizeRuleSet {
return actionIf(LegalizeAction::Legal, always);
}
+ /// The specified type index is coerced if predicate is true.
+ LegalizeRuleSet &bitcastIf(LegalityPredicate Predicate,
+ LegalizeMutation Mutation) {
+ // We have no choice but conservatively assume that lowering with a
+ // free-form user provided Predicate properly handles all type indices:
+ markAllIdxsAsCovered();
+ return actionIf(LegalizeAction::Bitcast, Predicate, Mutation);
+ }
+
/// The instruction is lowered.
LegalizeRuleSet &lower() {
using namespace LegalizeMutations;
diff --git a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
index 678124c9fcc3..c1e9e2254692 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
@@ -114,6 +114,23 @@ static LegalizeMutation moreEltsToNext32Bit(unsigned TypeIdx) {
};
}
+static LegalizeMutation bitcastToRegisterType(unsigned TypeIdx) {
+ return [=](const LegalityQuery &Query) {
+ const LLT Ty = Query.Types[TypeIdx];
+ unsigned Size = Ty.getSizeInBits();
+
+ LLT CoercedTy;
+ if (Size < 32) {
+ // <2 x s8> -> s16
+ assert(Size == 16);
+ CoercedTy = LLT::scalar(16);
+ } else
+ CoercedTy = LLT::scalarOrVector(Size / 32, 32);
+
+ return std::make_pair(TypeIdx, CoercedTy);
+ };
+}
+
static LegalityPredicate vectorSmallerThan(unsigned TypeIdx, unsigned Size) {
return [=](const LegalityQuery &Query) {
const LLT QueryTy = Query.Types[TypeIdx];
@@ -135,19 +152,37 @@ static LegalityPredicate numElementsNotEven(unsigned TypeIdx) {
};
}
+static bool isRegisterSize(unsigned Size) {
+ return Size % 32 == 0 && Size <= 1024;
+}
+
+static bool isRegisterVectorElementType(LLT EltTy) {
+ const int EltSize = EltTy.getSizeInBits();
+ return EltSize == 16 || EltSize % 32 == 0;
+}
+
+static bool isRegisterVectorType(LLT Ty) {
+ const int EltSize = Ty.getElementType().getSizeInBits();
+ return EltSize == 32 || EltSize == 64 ||
+ (EltSize == 16 && Ty.getNumElements() % 2 == 0) ||
+ EltSize == 128 || EltSize == 256;
+}
+
+static bool isRegisterType(LLT Ty) {
+ if (!isRegisterSize(Ty.getSizeInBits()))
+ return false;
+
+ if (Ty.isVector())
+ return isRegisterVectorType(Ty);
+
+ return true;
+}
+
// Any combination of 32 or 64-bit elements up to 1024 bits, and multiples of
// v2s16.
static LegalityPredicate isRegisterType(unsigned TypeIdx) {
return [=](const LegalityQuery &Query) {
- const LLT Ty = Query.Types[TypeIdx];
- if (Ty.isVector()) {
- const int EltSize = Ty.getElementType().getSizeInBits();
- return EltSize == 32 || EltSize == 64 ||
- (EltSize == 16 && Ty.getNumElements() % 2 == 0) ||
- EltSize == 128 || EltSize == 256;
- }
-
- return Ty.getSizeInBits() % 32 == 0 && Ty.getSizeInBits() <= 1024;
+ return isRegisterType(Query.Types[TypeIdx]);
};
}
@@ -169,6 +204,102 @@ static LegalityPredicate isWideScalarTruncStore(unsigned TypeIdx) {
};
}
+// TODO: Should load to s16 be legal? Most loads extend to 32-bits, but we
+// handle some operations by just promoting the register during
+// selection. There are also d16 loads on GFX9+ which preserve the high bits.
+static unsigned maxSizeForAddrSpace(const GCNSubtarget &ST, unsigned AS,
+ bool IsLoad) {
+ switch (AS) {
+ case AMDGPUAS::PRIVATE_ADDRESS:
+ // FIXME: Private element size.
+ return 32;
+ case AMDGPUAS::LOCAL_ADDRESS:
+ return ST.useDS128() ? 128 : 64;
+ case AMDGPUAS::GLOBAL_ADDRESS:
+ case AMDGPUAS::CONSTANT_ADDRESS:
+ case AMDGPUAS::CONSTANT_ADDRESS_32BIT:
+ // Treat constant and global as identical. SMRD loads are sometimes usable for
+ // global loads (ideally constant address space should be eliminated)
+ // depending on the context. Legality cannot be context dependent, but
+ // RegBankSelect can split the load as necessary depending on the pointer
+ // register bank/uniformity and if the memory is invariant or not written in a
+ // kernel.
+ return IsLoad ? 512 : 128;
+ default:
+ // Flat addresses may contextually need to be split to 32-bit parts if they
+ // may alias scratch depending on the subtarget.
+ return 128;
+ }
+}
+
+static bool isLoadStoreSizeLegal(const GCNSubtarget &ST,
+ const LegalityQuery &Query,
+ unsigned Opcode) {
+ const LLT Ty = Query.Types[0];
+
+ // Handle G_LOAD, G_ZEXTLOAD, G_SEXTLOAD
+ const bool IsLoad = Opcode != AMDGPU::G_STORE;
+
+ unsigned RegSize = Ty.getSizeInBits();
+ unsigned MemSize = Query.MMODescrs[0].SizeInBits;
+ unsigned Align = Query.MMODescrs[0].AlignInBits;
+ unsigned AS = Query.Types[1].getAddressSpace();
+
+ // All of these need to be custom lowered to cast the pointer operand.
+ if (AS == AMDGPUAS::CONSTANT_ADDRESS_32BIT)
+ return false;
+
+ // TODO: We should be able to widen loads if the alignment is high enough, but
+ // we also need to modify the memory access size.
+#if 0
+ // Accept widening loads based on alignment.
+ if (IsLoad && MemSize < Size)
+ MemSize = std::max(MemSize, Align);
+#endif
+
+ // Only 1-byte and 2-byte to 32-bit extloads are valid.
+ if (MemSize != RegSize && RegSize != 32)
+ return false;
+
+ if (MemSize > maxSizeForAddrSpace(ST, AS, IsLoad))
+ return false;
+
+ switch (MemSize) {
+ case 8:
+ case 16:
+ case 32:
+ case 64:
+ case 128:
+ break;
+ case 96:
+ if (!ST.hasDwordx3LoadStores())
+ return false;
+ break;
+ case 256:
+ case 512:
+ // These may contextually need to be broken down.
+ break;
+ default:
+ return false;
+ }
+
+ assert(RegSize >= MemSize);
+
+ if (Align < MemSize) {
+ const SITargetLowering *TLI = ST.getTargetLowering();
+ if (!TLI->allowsMisalignedMemoryAccessesImpl(MemSize, AS, Align / 8))
+ return false;
+ }
+
+ return true;
+}
+
+static bool isLoadStoreLegal(const GCNSubtarget &ST, const LegalityQuery &Query,
+ unsigned Opcode) {
+ const LLT Ty = Query.Types[0];
+ return isRegisterType(Ty) && isLoadStoreSizeLegal(ST, Query, Opcode);
+}
+
AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
const GCNTargetMachine &TM)
: ST(ST_) {
@@ -711,32 +842,6 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
.scalarize(0)
.custom();
- // TODO: Should load to s16 be legal? Most loads extend to 32-bits, but we
- // handle some operations by just promoting the register during
- // selection. There are also d16 loads on GFX9+ which preserve the high bits.
- auto maxSizeForAddrSpace = [this](unsigned AS, bool IsLoad) -> unsigned {
- switch (AS) {
- // FIXME: Private element size.
- case AMDGPUAS::PRIVATE_ADDRESS:
- return 32;
- // FIXME: Check subtarget
- case AMDGPUAS::LOCAL_ADDRESS:
- return ST.useDS128() ? 128 : 64;
-
- // Treat constant and global as identical. SMRD loads are sometimes usable
- // for global loads (ideally constant address space should be eliminated)
- // depending on the context. Legality cannot be context dependent, but
- // RegBankSelect can split the load as necessary depending on the pointer
- // register bank/uniformity and if the memory is invariant or not written in
- // a kernel.
- case AMDGPUAS::CONSTANT_ADDRESS:
- case AMDGPUAS::GLOBAL_ADDRESS:
- return IsLoad ? 512 : 128;
- default:
- return 128;
- }
- };
-
const auto needToSplitMemOp = [=](const LegalityQuery &Query,
bool IsLoad) -> bool {
const LLT DstTy = Query.Types[0];
@@ -753,7 +858,7 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
const LLT PtrTy = Query.Types[1];
unsigned AS = PtrTy.getAddressSpace();
- if (MemSize > maxSizeForAddrSpace(AS, IsLoad))
+ if (MemSize > maxSizeForAddrSpace(ST, AS, IsLoad))
return true;
// Catch weird sized loads that don't evenly divide into the access sizes
@@ -776,7 +881,8 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
return false;
};
- const auto shouldWidenLoadResult = [=](const LegalityQuery &Query) -> bool {
+ const auto shouldWidenLoadResult = [=](const LegalityQuery &Query,
+ unsigned Opc) -> bool {
unsigned Size = Query.Types[0].getSizeInBits();
if (isPowerOf2_32(Size))
return false;
@@ -785,7 +891,7 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
return false;
unsigned AddrSpace = Query.Types[1].getAddressSpace();
- if (Size >= maxSizeForAddrSpace(AddrSpace, true))
+ if (Size >= maxSizeForAddrSpace(ST, AddrSpace, Opc))
return false;
unsigned Align = Query.MMODescrs[0].AlignInBits;
@@ -805,12 +911,11 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
const bool IsStore = Op == G_STORE;
auto &Actions = getActionDefinitionsBuilder(Op);
- // Whitelist the common cases.
- // TODO: Loads to s16 on gfx9
+ // Whitelist some common cases.
+ // TODO: Does this help compile time at all?
Actions.legalForTypesWithMemDesc({{S32, GlobalPtr, 32, GlobalAlign32},
{V2S32, GlobalPtr, 64, GlobalAlign32},
{V4S32, GlobalPtr, 128, GlobalAlign32},
- {S128, GlobalPtr, 128, GlobalAlign32},
{S64, GlobalPtr, 64, GlobalAlign32},
{V2S64, GlobalPtr, 128, GlobalAlign32},
{V2S16, GlobalPtr, 32, GlobalAlign32},
@@ -829,29 +934,49 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
{S32, PrivatePtr, 16, 16},
{V2S16, PrivatePtr, 32, 32},
- {S32, FlatPtr, 32, GlobalAlign32},
- {S32, FlatPtr, 16, GlobalAlign16},
- {S32, FlatPtr, 8, GlobalAlign8},
- {V2S16, FlatPtr, 32, GlobalAlign32},
-
{S32, ConstantPtr, 32, GlobalAlign32},
{V2S32, ConstantPtr, 64, GlobalAlign32},
{V4S32, ConstantPtr, 128, GlobalAlign32},
{S64, ConstantPtr, 64, GlobalAlign32},
- {S128, ConstantPtr, 128, GlobalAlign32},
{V2S32, ConstantPtr, 32, GlobalAlign32}});
+ Actions.legalIf(
+ [=](const LegalityQuery &Query) -> bool {
+ return isLoadStoreLegal(ST, Query, Op);
+ });
+
+ // Constant 32-bit is handled by addrspacecasting the 32-bit pointer to
+ // 64-bits.
+ //
+ // TODO: Should generalize bitcast action into coerce, which will also cover
+ // inserting addrspacecasts.
+ Actions.customIf(typeIs(1, Constant32Ptr));
+
+ // Turn any illegal element vectors into something easier to deal
+ // with. These will ultimately produce 32-bit scalar shifts to extract the
+ // parts anyway.
+ //
+ // For odd 16-bit element vectors, prefer to split those into pieces with
+ // 16-bit vector parts.
+ Actions.bitcastIf(
+ [=](const LegalityQuery &Query) -> bool {
+ LLT Ty = Query.Types[0];
+ return Ty.isVector() &&
+ isRegisterSize(Ty.getSizeInBits()) &&
+ !isRegisterVectorElementType(Ty.getElementType());
+ }, bitcastToRegisterType(0));
+
Actions
.customIf(typeIs(1, Constant32Ptr))
// Widen suitably aligned loads by loading extra elements.
.moreElementsIf([=](const LegalityQuery &Query) {
const LLT Ty = Query.Types[0];
return Op == G_LOAD && Ty.isVector() &&
- shouldWidenLoadResult(Query);
+ shouldWidenLoadResult(Query, Op);
}, moreElementsToNextPow2(0))
.widenScalarIf([=](const LegalityQuery &Query) {
const LLT Ty = Query.Types[0];
return Op == G_LOAD && !Ty.isVector() &&
- shouldWidenLoadResult(Query);
+ shouldWidenLoadResult(Query, Op);
}, widenScalarOrEltToNextPow2(0))
.narrowScalarIf(
[=](const LegalityQuery &Query) -> bool {
@@ -883,7 +1008,8 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
return std::make_pair(0, LLT::scalar(32 * (DstSize / 32)));
}
- unsigned MaxSize = maxSizeForAddrSpace(PtrTy.getAddressSpace(),
+ unsigned MaxSize = maxSizeForAddrSpace(ST,
+ PtrTy.getAddressSpace(),
Op == G_LOAD);
if (MemSize > MaxSize)
return std::make_pair(0, LLT::scalar(MaxSize));
@@ -901,7 +1027,8 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
const LLT PtrTy = Query.Types[1];
LLT EltTy = DstTy.getElementType();
- unsigned MaxSize = maxSizeForAddrSpace(PtrTy.getAddressSpace(),
+ unsigned MaxSize = maxSizeForAddrSpace(ST,
+ PtrTy.getAddressSpace(),
Op == G_LOAD);
// FIXME: Handle widened to power of 2 results better. This ends
@@ -963,37 +1090,6 @@ AMDGPULegalizerInfo::AMDGPULegalizerInfo(const GCNSubtarget &ST_,
// TODO: Need a bitcast lower option?
Actions
- .legalIf([=](const LegalityQuery &Query) {
- const LLT Ty0 = Query.Types[0];
- unsigned Size = Ty0.getSizeInBits();
- unsigned MemSize = Query.MMODescrs[0].SizeInBits;
- unsigned Align = Query.MMODescrs[0].AlignInBits;
-
- // FIXME: Widening store from alignment not valid.
- if (MemSize < Size)
- MemSize = std::max(MemSize, Align);
-
- // No extending vector loads.
- if (Size > MemSize && Ty0.isVector())
- return false;
-
- switch (MemSize) {
- case 8:
- case 16:
- return Size == 32;
- case 32:
- case 64:
- case 128:
- return true;
- case 96:
- return ST.hasDwordx3LoadStores();
- case 256:
- case 512:
- return true;
- default:
- return false;
- }
- })
.widenScalarToNextPow2(0)
.moreElementsIf(vectorSmallerThan(0, 32), moreEltsToNext32Bit(0));
}
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir
index 82b9531e8e4f..f0a1e2170668 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-constant.mir
@@ -635,38 +635,33 @@ body: |
; CI-LABEL: name: test_load_constant_s48_align8
; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
; CI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p4) :: (load 6, align 8, addrspace 4)
- ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; CI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; CI: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; CI: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; CI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; CI: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; CI: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; VI-LABEL: name: test_load_constant_s48_align8
; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
; VI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p4) :: (load 6, align 8, addrspace 4)
- ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; VI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; VI: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; VI: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; VI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; VI: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; GFX9-LABEL: name: test_load_constant_s48_align8
; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
; GFX9: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p4) :: (load 6, align 8, addrspace 4)
- ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; GFX9: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; GFX9: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; GFX9: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; GFX9: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; GFX9: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; CI-MESA-LABEL: name: test_load_constant_s48_align8
; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
; CI-MESA: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p4) :: (load 6, align 8, addrspace 4)
- ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; CI-MESA: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; CI-MESA: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; CI-MESA: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; CI-MESA: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; GFX9-MESA-LABEL: name: test_load_constant_s48_align8
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p4) :: (load 6, align 8, addrspace 4)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; GFX9-MESA: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
%0:_(p4) = COPY $vgpr0_vgpr1
%1:_(s48) = G_LOAD %0 :: (load 6, align 8, addrspace 4)
%2:_(s64) = G_ZEXT %1
@@ -4134,63 +4129,127 @@ body: |
; CI-LABEL: name: test_load_constant_v2s8_align4
; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
- ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; CI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
- ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
- ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C]](s32)
- ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
- ; CI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC]]
+ ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI: $vgpr0 = COPY [[ANYEXT]](s32)
; VI-LABEL: name: test_load_constant_v2s8_align4
; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
- ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; VI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; VI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; VI: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; VI: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; VI: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; VI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-LABEL: name: test_load_constant_v2s8_align4
; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
- ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; GFX9: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; GFX9: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; GFX9: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; GFX9: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; GFX9: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; GFX9: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-MESA-LABEL: name: test_load_constant_v2s8_align4
; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
- ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; CI-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
- ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
- ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C]](s32)
- ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
- ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC]]
+ ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-MESA-LABEL: name: test_load_constant_v2s8_align4
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
- ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, align 4, addrspace 4)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; GFX9-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
@@ -4209,43 +4268,129 @@ body: |
; CI-LABEL: name: test_load_constant_v2s8_align2
; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
- ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI: $vgpr0 = COPY [[ANYEXT]](s32)
; VI-LABEL: name: test_load_constant_v2s8_align2
; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
- ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; VI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; VI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; VI: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; VI: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; VI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-LABEL: name: test_load_constant_v2s8_align2
; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
- ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-MESA-LABEL: name: test_load_constant_v2s8_align2
; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
- ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI-MESA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-MESA-LABEL: name: test_load_constant_v2s8_align2
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
- ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9-MESA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
%0:_(p4) = COPY $vgpr0_vgpr1
%1:_(<2 x s8>) = G_LOAD %0 :: (load 2, align 2, addrspace 4)
@@ -4465,24 +4610,88 @@ body: |
; CI-LABEL: name: test_load_constant_v4s8_align4
; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
- ; CI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; VI-LABEL: name: test_load_constant_v4s8_align4
; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
- ; VI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-LABEL: name: test_load_constant_v4s8_align4
; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
- ; GFX9: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-MESA-LABEL: name: test_load_constant_v4s8_align4
; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
- ; CI-MESA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-MESA-LABEL: name: test_load_constant_v4s8_align4
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
- ; GFX9-MESA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 4, addrspace 4)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
%0:_(p4) = COPY $vgpr0_vgpr1
%1:_(<4 x s8>) = G_LOAD %0 :: (load 4, align 4, addrspace 4)
$vgpr0 = COPY %1
@@ -4496,99 +4705,114 @@ body: |
; CI-LABEL: name: test_load_constant_v4s8_align2
; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, align 2, addrspace 4)
- ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CI: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 1 + 1, addrspace 4)
- ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; CI: [[PTR_ADD1:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p4) :: (load 1 + 2, align 2, addrspace 4)
- ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; CI: [[PTR_ADD2:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p4) :: (load 1 + 3, addrspace 4)
+ ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 2 + 2, addrspace 4)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; CI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; CI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; CI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; VI-LABEL: name: test_load_constant_v4s8_align2
; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, align 2, addrspace 4)
- ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; VI: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 1 + 1, addrspace 4)
- ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; VI: [[PTR_ADD1:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p4) :: (load 1 + 2, align 2, addrspace 4)
- ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; VI: [[PTR_ADD2:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p4) :: (load 1 + 3, addrspace 4)
+ ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 2 + 2, addrspace 4)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; VI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; VI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; VI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-LABEL: name: test_load_constant_v4s8_align2
; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, align 2, addrspace 4)
- ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; GFX9: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 1 + 1, addrspace 4)
- ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; GFX9: [[PTR_ADD1:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p4) :: (load 1 + 2, align 2, addrspace 4)
- ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; GFX9: [[PTR_ADD2:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p4) :: (load 1 + 3, addrspace 4)
+ ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 2 + 2, addrspace 4)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; GFX9: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; GFX9: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
- ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
; GFX9: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-MESA-LABEL: name: test_load_constant_v4s8_align2
; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, align 2, addrspace 4)
- ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CI-MESA: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 1 + 1, addrspace 4)
- ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; CI-MESA: [[PTR_ADD1:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p4) :: (load 1 + 2, align 2, addrspace 4)
- ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; CI-MESA: [[PTR_ADD2:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p4) :: (load 1 + 3, addrspace 4)
+ ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 2 + 2, addrspace 4)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; CI-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; CI-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; CI-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-MESA-LABEL: name: test_load_constant_v4s8_align2
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 1, align 2, addrspace 4)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p4) :: (load 2, addrspace 4)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; GFX9-MESA: [[PTR_ADD:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 1 + 1, addrspace 4)
- ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; GFX9-MESA: [[PTR_ADD1:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p4) :: (load 1 + 2, align 2, addrspace 4)
- ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; GFX9-MESA: [[PTR_ADD2:%[0-9]+]]:_(p4) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p4) :: (load 1 + 3, addrspace 4)
+ ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p4) :: (load 2 + 2, addrspace 4)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; GFX9-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; GFX9-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
- ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
@@ -4716,24 +4940,29 @@ body: |
; CI-LABEL: name: test_load_constant_v8s8_align8
; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
- ; CI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; CI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; VI-LABEL: name: test_load_constant_v8s8_align8
; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
- ; VI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; VI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; GFX9-LABEL: name: test_load_constant_v8s8_align8
; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
- ; GFX9: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; GFX9: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; CI-MESA-LABEL: name: test_load_constant_v8s8_align8
; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
- ; CI-MESA: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; GFX9-MESA-LABEL: name: test_load_constant_v8s8_align8
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
- ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p4) :: (load 8, addrspace 4)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
%0:_(p4) = COPY $vgpr0_vgpr1
%1:_(<8 x s8>) = G_LOAD %0 :: (load 8, align 8, addrspace 4)
$vgpr0_vgpr1 = COPY %1
@@ -4747,24 +4976,29 @@ body: |
; CI-LABEL: name: test_load_constant_v16s8_align16
; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
- ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; VI-LABEL: name: test_load_constant_v16s8_align16
; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
- ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; GFX9-LABEL: name: test_load_constant_v16s8_align16
; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
- ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; CI-MESA-LABEL: name: test_load_constant_v16s8_align16
; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
- ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; GFX9-MESA-LABEL: name: test_load_constant_v16s8_align16
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
- ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p4) :: (load 16, addrspace 4)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
%0:_(p4) = COPY $vgpr0_vgpr1
%1:_(<16 x s8>) = G_LOAD %0 :: (load 16, align 16, addrspace 4)
$vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
@@ -4778,24 +5012,29 @@ body: |
; CI-LABEL: name: test_load_constant_v32s8_align32
; CI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
- ; CI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; CI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; VI-LABEL: name: test_load_constant_v32s8_align32
; VI: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
- ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; GFX9-LABEL: name: test_load_constant_v32s8_align32
; GFX9: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
- ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; CI-MESA-LABEL: name: test_load_constant_v32s8_align32
; CI-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
- ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; GFX9-MESA-LABEL: name: test_load_constant_v32s8_align32
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p4) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
- ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p4) :: (load 32, addrspace 4)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
%0:_(p4) = COPY $vgpr0_vgpr1
%1:_(<32 x s8>) = G_LOAD %0 :: (load 32, align 32, addrspace 4)
$vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY %1
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir
index 34c5e56e2c56..73d2c9f9aa23 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-flat.mir
@@ -635,38 +635,33 @@ body: |
; CI-LABEL: name: test_load_flat_s48_align8
; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
; CI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p0) :: (load 6, align 8)
- ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; CI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; CI: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; CI: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; CI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; CI: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; CI: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; VI-LABEL: name: test_load_flat_s48_align8
; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
; VI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p0) :: (load 6, align 8)
- ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; VI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; VI: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; VI: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; VI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; VI: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; GFX9-LABEL: name: test_load_flat_s48_align8
; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
; GFX9: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p0) :: (load 6, align 8)
- ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; GFX9: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; GFX9: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; GFX9: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; GFX9: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; GFX9: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; CI-MESA-LABEL: name: test_load_flat_s48_align8
; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
; CI-MESA: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p0) :: (load 6, align 8)
- ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; CI-MESA: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; CI-MESA: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; CI-MESA: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; CI-MESA: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; GFX9-MESA-LABEL: name: test_load_flat_s48_align8
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p0) :: (load 6, align 8)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; GFX9-MESA: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
%0:_(p0) = COPY $vgpr0_vgpr1
%1:_(s48) = G_LOAD %0 :: (load 6, align 8, addrspace 0)
%2:_(s64) = G_ZEXT %1
@@ -4154,63 +4149,127 @@ body: |
; CI-LABEL: name: test_load_flat_v2s8_align4
; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
- ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; CI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
- ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
- ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C]](s32)
- ; CI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
- ; CI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC]]
+ ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI: $vgpr0 = COPY [[ANYEXT]](s32)
; VI-LABEL: name: test_load_flat_v2s8_align4
; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
- ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; VI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; VI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; VI: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; VI: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; VI: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; VI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-LABEL: name: test_load_flat_v2s8_align4
; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
- ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; GFX9: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; GFX9: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; GFX9: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; GFX9: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; GFX9: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; GFX9: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-MESA-LABEL: name: test_load_flat_v2s8_align4
; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
- ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; CI-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
- ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
- ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C]](s32)
- ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
- ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC]]
+ ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-MESA-LABEL: name: test_load_flat_v2s8_align4
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
- ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2, align 4)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; GFX9-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
@@ -4229,43 +4288,129 @@ body: |
; CI-LABEL: name: test_load_flat_v2s8_align2
; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2)
- ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI: $vgpr0 = COPY [[ANYEXT]](s32)
; VI-LABEL: name: test_load_flat_v2s8_align2
; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2)
- ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; VI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; VI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; VI: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; VI: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; VI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-LABEL: name: test_load_flat_v2s8_align2
; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2)
- ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-MESA-LABEL: name: test_load_flat_v2s8_align2
; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2)
- ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI-MESA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-MESA-LABEL: name: test_load_flat_v2s8_align2
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 2)
- ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9-MESA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
%0:_(p0) = COPY $vgpr0_vgpr1
%1:_(<2 x s8>) = G_LOAD %0 :: (load 2, align 2, addrspace 0)
@@ -4485,24 +4630,88 @@ body: |
; CI-LABEL: name: test_load_flat_v4s8_align4
; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 4)
- ; CI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 4)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; VI-LABEL: name: test_load_flat_v4s8_align4
; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 4)
- ; VI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 4)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-LABEL: name: test_load_flat_v4s8_align4
; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 4)
- ; GFX9: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 4)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-MESA-LABEL: name: test_load_flat_v4s8_align4
; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 4)
- ; CI-MESA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 4)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-MESA-LABEL: name: test_load_flat_v4s8_align4
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p0) :: (load 4)
- ; GFX9-MESA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 4)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
%0:_(p0) = COPY $vgpr0_vgpr1
%1:_(<4 x s8>) = G_LOAD %0 :: (load 4, align 4, addrspace 0)
$vgpr0 = COPY %1
@@ -4516,99 +4725,114 @@ body: |
; CI-LABEL: name: test_load_flat_v4s8_align2
; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1, align 2)
- ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CI: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 1 + 1)
- ; CI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; CI: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p0) :: (load 1 + 2, align 2)
- ; CI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; CI: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load 1 + 3)
+ ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 2 + 2)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; CI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; CI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; CI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; VI-LABEL: name: test_load_flat_v4s8_align2
; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1, align 2)
- ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; VI: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 1 + 1)
- ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; VI: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p0) :: (load 1 + 2, align 2)
- ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; VI: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load 1 + 3)
+ ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 2 + 2)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; VI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; VI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; VI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-LABEL: name: test_load_flat_v4s8_align2
; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1, align 2)
- ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; GFX9: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 1 + 1)
- ; GFX9: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; GFX9: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p0) :: (load 1 + 2, align 2)
- ; GFX9: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; GFX9: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load 1 + 3)
+ ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 2 + 2)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; GFX9: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; GFX9: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
- ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
; GFX9: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-MESA-LABEL: name: test_load_flat_v4s8_align2
; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1, align 2)
- ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CI-MESA: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 1 + 1)
- ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; CI-MESA: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p0) :: (load 1 + 2, align 2)
- ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; CI-MESA: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load 1 + 3)
+ ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 2 + 2)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; CI-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; CI-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; CI-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-MESA-LABEL: name: test_load_flat_v4s8_align2
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 1, align 2)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p0) :: (load 2)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; GFX9-MESA: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 1 + 1)
- ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; GFX9-MESA: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p0) :: (load 1 + 2, align 2)
- ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; GFX9-MESA: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p0) :: (load 1 + 3)
+ ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p0) :: (load 2 + 2)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; GFX9-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; GFX9-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
- ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
@@ -4736,24 +4960,29 @@ body: |
; CI-LABEL: name: test_load_flat_v8s8_align8
; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p0) :: (load 8)
- ; CI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p0) :: (load 8)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; CI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; VI-LABEL: name: test_load_flat_v8s8_align8
; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p0) :: (load 8)
- ; VI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p0) :: (load 8)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; VI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; GFX9-LABEL: name: test_load_flat_v8s8_align8
; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p0) :: (load 8)
- ; GFX9: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p0) :: (load 8)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; GFX9: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; CI-MESA-LABEL: name: test_load_flat_v8s8_align8
; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p0) :: (load 8)
- ; CI-MESA: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p0) :: (load 8)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; GFX9-MESA-LABEL: name: test_load_flat_v8s8_align8
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p0) :: (load 8)
- ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p0) :: (load 8)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
%0:_(p0) = COPY $vgpr0_vgpr1
%1:_(<8 x s8>) = G_LOAD %0 :: (load 8, align 8, addrspace 0)
$vgpr0_vgpr1 = COPY %1
@@ -4767,24 +4996,29 @@ body: |
; CI-LABEL: name: test_load_flat_v16s8_align16
; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16)
- ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; VI-LABEL: name: test_load_flat_v16s8_align16
; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16)
- ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; GFX9-LABEL: name: test_load_flat_v16s8_align16
; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16)
- ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; CI-MESA-LABEL: name: test_load_flat_v16s8_align16
; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16)
- ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; GFX9-MESA-LABEL: name: test_load_flat_v16s8_align16
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16)
- ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
%0:_(p0) = COPY $vgpr0_vgpr1
%1:_(<16 x s8>) = G_LOAD %0 :: (load 16, align 16, addrspace 0)
$vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
@@ -4798,44 +5032,49 @@ body: |
; CI-LABEL: name: test_load_flat_v32s8_align32
; CI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
+ ; CI: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
; CI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
; CI: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; CI: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
- ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<32 x s8>) = G_CONCAT_VECTORS [[LOAD]](<16 x s8>), [[LOAD1]](<16 x s8>)
- ; CI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<32 x s8>)
+ ; CI: [[LOAD1:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
+ ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s32>) = G_CONCAT_VECTORS [[LOAD]](<4 x s32>), [[LOAD1]](<4 x s32>)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[CONCAT_VECTORS]](<8 x s32>)
+ ; CI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; VI-LABEL: name: test_load_flat_v32s8_align32
; VI: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
+ ; VI: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
; VI: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; VI: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
- ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<32 x s8>) = G_CONCAT_VECTORS [[LOAD]](<16 x s8>), [[LOAD1]](<16 x s8>)
- ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<32 x s8>)
+ ; VI: [[LOAD1:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
+ ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s32>) = G_CONCAT_VECTORS [[LOAD]](<4 x s32>), [[LOAD1]](<4 x s32>)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[CONCAT_VECTORS]](<8 x s32>)
+ ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; GFX9-LABEL: name: test_load_flat_v32s8_align32
; GFX9: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
; GFX9: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
; GFX9: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; GFX9: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
- ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<32 x s8>) = G_CONCAT_VECTORS [[LOAD]](<16 x s8>), [[LOAD1]](<16 x s8>)
- ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<32 x s8>)
+ ; GFX9: [[LOAD1:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s32>) = G_CONCAT_VECTORS [[LOAD]](<4 x s32>), [[LOAD1]](<4 x s32>)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[CONCAT_VECTORS]](<8 x s32>)
+ ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; CI-MESA-LABEL: name: test_load_flat_v32s8_align32
; CI-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
; CI-MESA: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; CI-MESA: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
- ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<32 x s8>) = G_CONCAT_VECTORS [[LOAD]](<16 x s8>), [[LOAD1]](<16 x s8>)
- ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<32 x s8>)
+ ; CI-MESA: [[LOAD1:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
+ ; CI-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s32>) = G_CONCAT_VECTORS [[LOAD]](<4 x s32>), [[LOAD1]](<4 x s32>)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[CONCAT_VECTORS]](<8 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; GFX9-MESA-LABEL: name: test_load_flat_v32s8_align32
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p0) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p0) :: (load 16, align 32)
; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 16
; GFX9-MESA: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
- ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<32 x s8>) = G_CONCAT_VECTORS [[LOAD]](<16 x s8>), [[LOAD1]](<16 x s8>)
- ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[CONCAT_VECTORS]](<32 x s8>)
+ ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[PTR_ADD]](p0) :: (load 16 + 16)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s32>) = G_CONCAT_VECTORS [[LOAD]](<4 x s32>), [[LOAD1]](<4 x s32>)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[CONCAT_VECTORS]](<8 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
%0:_(p0) = COPY $vgpr0_vgpr1
%1:_(<32 x s8>) = G_LOAD %0 :: (load 32, align 32, addrspace 0)
$vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY %1
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
index 5c3ee75cc4dc..ee36dcad9bbd 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-global.mir
@@ -675,45 +675,39 @@ body: |
; SI-LABEL: name: test_load_global_s48_align8
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; SI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p1) :: (load 6, align 8, addrspace 1)
- ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; SI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; SI: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; SI: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; SI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; SI: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; SI: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; CI-HSA-LABEL: name: test_load_global_s48_align8
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; CI-HSA: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p1) :: (load 6, align 8, addrspace 1)
- ; CI-HSA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; CI-HSA: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; CI-HSA: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; CI-HSA: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; CI-HSA: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; CI-HSA: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; CI-MESA-LABEL: name: test_load_global_s48_align8
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; CI-MESA: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p1) :: (load 6, align 8, addrspace 1)
- ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; CI-MESA: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; CI-MESA: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; CI-MESA: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; CI-MESA: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; VI-LABEL: name: test_load_global_s48_align8
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; VI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p1) :: (load 6, align 8, addrspace 1)
- ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; VI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; VI: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; VI: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; VI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; VI: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; GFX9-HSA-LABEL: name: test_load_global_s48_align8
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p1) :: (load 6, align 8, addrspace 1)
- ; GFX9-HSA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; GFX9-HSA: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; GFX9-HSA: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; GFX9-HSA: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; GFX9-HSA: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
; GFX9-MESA-LABEL: name: test_load_global_s48_align8
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p1) :: (load 6, align 8, addrspace 1)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 281474976710655
- ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; GFX9-MESA: [[AND:%[0-9]+]]:_(s64) = G_AND [[COPY1]], [[C]]
- ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[AND]](s64)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[TRUNC]](s48)
+ ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s48) = G_LOAD %0 :: (load 6, align 8, addrspace 1)
%2:_(s64) = G_ZEXT %1
@@ -3835,76 +3829,152 @@ body: |
; SI-LABEL: name: test_load_global_v2s8_align4
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; SI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
- ; SI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; SI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; SI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
- ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
- ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C]](s32)
- ; SI: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
- ; SI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC]]
+ ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; SI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; SI: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-HSA-LABEL: name: test_load_global_v2s8_align4
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
- ; CI-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
+ ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-HSA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; CI-HSA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
- ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
; CI-HSA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
- ; CI-HSA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C]](s32)
- ; CI-HSA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
- ; CI-HSA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC]]
+ ; CI-HSA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-HSA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-HSA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
; CI-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-HSA: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-MESA-LABEL: name: test_load_global_v2s8_align4
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
- ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; CI-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
- ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
- ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[C]](s32)
- ; CI-MESA: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
- ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC]]
+ ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
; VI-LABEL: name: test_load_global_v2s8_align4
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
- ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; VI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; VI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; VI: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; VI: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; VI: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; VI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-HSA-LABEL: name: test_load_global_v2s8_align4
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
- ; GFX9-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
+ ; GFX9-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-HSA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; GFX9-HSA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; GFX9-HSA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; GFX9-HSA: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; GFX9-HSA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; GFX9-HSA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-HSA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; GFX9-HSA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; GFX9-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-HSA: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-MESA-LABEL: name: test_load_global_v2s8_align4
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
- ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 4, addrspace 1)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
- ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C]](s16)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
; GFX9-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
@@ -3923,51 +3993,154 @@ body: |
; SI-LABEL: name: test_load_global_v2s8_align2
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; SI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; SI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; SI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; SI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; SI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; SI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; SI: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-HSA-LABEL: name: test_load_global_v2s8_align2
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; CI-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-HSA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI-HSA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI-HSA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI-HSA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI-HSA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI-HSA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-HSA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-HSA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-HSA: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-MESA-LABEL: name: test_load_global_v2s8_align2
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI-MESA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI-MESA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI-MESA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-MESA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
; VI-LABEL: name: test_load_global_v2s8_align2
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; VI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; VI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; VI: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; VI: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; VI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-HSA-LABEL: name: test_load_global_v2s8_align2
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; GFX9-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX9-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-HSA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9-HSA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9-HSA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9-HSA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9-HSA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-HSA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9-HSA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-HSA: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-MESA-LABEL: name: test_load_global_v2s8_align2
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-MESA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9-MESA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9-MESA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9-MESA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-MESA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9-MESA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9-MESA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-MESA: $vgpr0 = COPY [[ANYEXT]](s32)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(<2 x s8>) = G_LOAD %0 :: (load 2, align 2, addrspace 1)
@@ -4002,11 +4175,28 @@ body: |
; SI: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-HSA-LABEL: name: test_load_global_v2s8_align1
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, align 1, addrspace 1)
- ; CI-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 1, addrspace 1)
+ ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-HSA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI-HSA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI-HSA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI-HSA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI-HSA: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI-HSA: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-HSA: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-HSA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-HSA: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-MESA-LABEL: name: test_load_global_v2s8_align1
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
@@ -4044,11 +4234,29 @@ body: |
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-HSA-LABEL: name: test_load_global_v2s8_align1
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 2, align 1, addrspace 1)
- ; GFX9-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, align 1, addrspace 1)
+ ; GFX9-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-HSA: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9-HSA: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9-HSA: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9-HSA: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9-HSA: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9-HSA: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9-HSA: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9-HSA: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9-HSA: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9-HSA: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-MESA-LABEL: name: test_load_global_v2s8_align1
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
@@ -4204,28 +4412,104 @@ body: |
; SI-LABEL: name: test_load_global_v4s8_align4
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; SI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
- ; SI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-HSA-LABEL: name: test_load_global_v4s8_align4
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
- ; CI-HSA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-HSA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-MESA-LABEL: name: test_load_global_v4s8_align4
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
- ; CI-MESA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; VI-LABEL: name: test_load_global_v4s8_align4
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
- ; VI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-HSA-LABEL: name: test_load_global_v4s8_align4
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
- ; GFX9-HSA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX9-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-HSA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-MESA-LABEL: name: test_load_global_v4s8_align4
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
- ; GFX9-MESA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(<4 x s8>) = G_LOAD %0 :: (load 4, align 4, addrspace 1)
$vgpr0 = COPY %1
@@ -4258,86 +4542,124 @@ body: |
; CI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; SI-LABEL: name: test_load_global_v4s8_align2
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, align 2, addrspace 1)
- ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; SI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; SI: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
- ; SI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; SI: [[PTR_ADD1:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p1) :: (load 1 + 2, align 2, addrspace 1)
- ; SI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; SI: [[PTR_ADD2:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p1) :: (load 1 + 3, addrspace 1)
+ ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 2 + 2, addrspace 1)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; SI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; SI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; SI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; SI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-HSA-LABEL: name: test_load_global_v4s8_align2
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, align 2, addrspace 1)
- ; CI-HSA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, align 2, addrspace 1)
+ ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-HSA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-MESA-LABEL: name: test_load_global_v4s8_align2
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, align 2, addrspace 1)
- ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CI-MESA: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
- ; CI-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; CI-MESA: [[PTR_ADD1:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; CI-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p1) :: (load 1 + 2, align 2, addrspace 1)
- ; CI-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; CI-MESA: [[PTR_ADD2:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; CI-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p1) :: (load 1 + 3, addrspace 1)
+ ; CI-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 2 + 2, addrspace 1)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; CI-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; CI-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; CI-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; CI-MESA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; VI-LABEL: name: test_load_global_v4s8_align2
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, align 2, addrspace 1)
- ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; VI: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; VI: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
- ; VI: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; VI: [[PTR_ADD1:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p1) :: (load 1 + 2, align 2, addrspace 1)
- ; VI: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; VI: [[PTR_ADD2:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p1) :: (load 1 + 3, addrspace 1)
+ ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 2 + 2, addrspace 1)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; VI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; VI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; VI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-HSA-LABEL: name: test_load_global_v4s8_align2
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, align 2, addrspace 1)
- ; GFX9-HSA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, align 2, addrspace 1)
+ ; GFX9-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-HSA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-MESA-LABEL: name: test_load_global_v4s8_align2
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, align 2, addrspace 1)
- ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; GFX9-MESA: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
- ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
- ; GFX9-MESA: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
- ; GFX9-MESA: [[PTR_ADD1:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C1]](s64)
- ; GFX9-MESA: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p1) :: (load 1 + 2, align 2, addrspace 1)
- ; GFX9-MESA: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
- ; GFX9-MESA: [[PTR_ADD2:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C2]](s64)
- ; GFX9-MESA: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p1) :: (load 1 + 3, addrspace 1)
+ ; GFX9-MESA: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 2 + 2, addrspace 1)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C1]](s32)
+ ; GFX9-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C2]](s32)
+ ; GFX9-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD1]], [[C3]](s32)
; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
- ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
@@ -4374,8 +4696,20 @@ body: |
; SI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-HSA-LABEL: name: test_load_global_v4s8_align1
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, align 1, addrspace 1)
- ; CI-HSA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, align 1, addrspace 1)
+ ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-HSA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-MESA-LABEL: name: test_load_global_v4s8_align1
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
@@ -4416,8 +4750,22 @@ body: |
; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-HSA-LABEL: name: test_load_global_v4s8_align1
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p1) :: (load 4, align 1, addrspace 1)
- ; GFX9-HSA: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, align 1, addrspace 1)
+ ; GFX9-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9-HSA: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-MESA-LABEL: name: test_load_global_v4s8_align1
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
@@ -4452,28 +4800,34 @@ body: |
; SI-LABEL: name: test_load_global_v8s8_align8
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; SI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
- ; SI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; SI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; SI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; SI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; CI-HSA-LABEL: name: test_load_global_v8s8_align8
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
- ; CI-HSA: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; CI-HSA: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; CI-HSA: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; CI-MESA-LABEL: name: test_load_global_v8s8_align8
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
- ; CI-MESA: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; VI-LABEL: name: test_load_global_v8s8_align8
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
- ; VI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; VI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; GFX9-HSA-LABEL: name: test_load_global_v8s8_align8
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
- ; GFX9-HSA: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; GFX9-HSA: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; GFX9-HSA: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; GFX9-MESA-LABEL: name: test_load_global_v8s8_align8
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
- ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(<8 x s8>) = G_LOAD %0 :: (load 8, align 8, addrspace 1)
$vgpr0_vgpr1 = COPY %1
@@ -4487,28 +4841,34 @@ body: |
; SI-LABEL: name: test_load_global_v16s8_align16
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; SI: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
- ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; SI: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
+ ; SI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; CI-HSA-LABEL: name: test_load_global_v16s8_align16
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
- ; CI-HSA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
+ ; CI-HSA: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; CI-HSA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; CI-MESA-LABEL: name: test_load_global_v16s8_align16
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
- ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; VI-LABEL: name: test_load_global_v16s8_align16
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
- ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; GFX9-HSA-LABEL: name: test_load_global_v16s8_align16
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
- ; GFX9-HSA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
+ ; GFX9-HSA: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; GFX9-HSA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; GFX9-MESA-LABEL: name: test_load_global_v16s8_align16
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<16 x s8>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
- ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[LOAD]](<16 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[COPY]](p1) :: (load 16, addrspace 1)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[LOAD]](<4 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(<16 x s8>) = G_LOAD %0 :: (load 16, align 16, addrspace 1)
$vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
@@ -4522,28 +4882,34 @@ body: |
; SI-LABEL: name: test_load_global_v32s8_align32
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; SI: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
- ; SI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; SI: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
+ ; SI: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; SI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; CI-HSA-LABEL: name: test_load_global_v32s8_align32
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-HSA: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
- ; CI-HSA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
+ ; CI-HSA: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; CI-HSA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; CI-MESA-LABEL: name: test_load_global_v32s8_align32
; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CI-MESA: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
- ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
+ ; CI-MESA: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; CI-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; VI-LABEL: name: test_load_global_v32s8_align32
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; VI: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
- ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; VI: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; GFX9-HSA-LABEL: name: test_load_global_v32s8_align32
; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
- ; GFX9-HSA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
+ ; GFX9-HSA: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; GFX9-HSA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
; GFX9-MESA-LABEL: name: test_load_global_v32s8_align32
; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<32 x s8>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
- ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[LOAD]](<32 x s8>)
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(<8 x s32>) = G_LOAD [[COPY]](p1) :: (load 32, addrspace 1)
+ ; GFX9-MESA: [[BITCAST:%[0-9]+]]:_(<32 x s8>) = G_BITCAST [[LOAD]](<8 x s32>)
+ ; GFX9-MESA: $vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY [[BITCAST]](<32 x s8>)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(<32 x s8>) = G_LOAD %0 :: (load 32, align 32, addrspace 1)
$vgpr0_vgpr1_vgpr2_vgpr3_vgpr4_vgpr5_vgpr6_vgpr7 = COPY %1
@@ -11622,10 +11988,11 @@ body: |
; SI: [[DEF:%[0-9]+]]:_(s96) = G_IMPLICIT_DEF
; SI: [[INSERT:%[0-9]+]]:_(s96) = G_INSERT [[DEF]], [[LOAD1]](s64), 0
; SI: [[INSERT1:%[0-9]+]]:_(s96) = G_INSERT [[INSERT]], [[LOAD2]](s32), 64
- ; SI: [[COPY1:%[0-9]+]]:_(s96) = COPY [[TRUNC]](s96)
- ; SI: [[COPY2:%[0-9]+]]:_(s96) = COPY [[INSERT1]](s96)
- ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[COPY1]](s96)
- ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[COPY2]](s96)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s96>) = G_BUILD_VECTOR [[TRUNC]](s96), [[INSERT1]](s96)
+ ; SI: [[EXTRACT:%[0-9]+]]:_(s96) = G_EXTRACT [[BUILD_VECTOR]](<2 x s96>), 0
+ ; SI: [[EXTRACT1:%[0-9]+]]:_(s96) = G_EXTRACT [[BUILD_VECTOR]](<2 x s96>), 96
+ ; SI: $vgpr0_vgpr1_vgpr2 = COPY [[EXTRACT]](s96)
+ ; SI: $vgpr3_vgpr4_vgpr5 = COPY [[EXTRACT1]](s96)
; CI-HSA-LABEL: name: test_extload_global_v2s96_from_24_align16
; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; CI-HSA: [[LOAD:%[0-9]+]]:_(s96) = G_LOAD [[COPY]](p1) :: (load 12, align 16, addrspace 1)
@@ -11683,3 +12050,833 @@ body: |
$vgpr0_vgpr1_vgpr2 = COPY %2
$vgpr3_vgpr4_vgpr5 = COPY %3
...
+
+---
+name: test_load_global_v32s1_align4
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; SI-LABEL: name: test_load_global_v32s1_align4
+ ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; SI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
+ ; SI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
+ ; SI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
+ ; SI: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR7:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C7]](s32)
+ ; SI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
+ ; SI: [[LSHR8:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C8]](s32)
+ ; SI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
+ ; SI: [[LSHR9:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C9]](s32)
+ ; SI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
+ ; SI: [[LSHR10:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C10]](s32)
+ ; SI: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; SI: [[LSHR11:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C11]](s32)
+ ; SI: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
+ ; SI: [[LSHR12:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C12]](s32)
+ ; SI: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
+ ; SI: [[LSHR13:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C13]](s32)
+ ; SI: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
+ ; SI: [[LSHR14:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C14]](s32)
+ ; SI: [[C15:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR15:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C15]](s32)
+ ; SI: [[C16:%[0-9]+]]:_(s32) = G_CONSTANT i32 17
+ ; SI: [[LSHR16:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C16]](s32)
+ ; SI: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 18
+ ; SI: [[LSHR17:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C17]](s32)
+ ; SI: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 19
+ ; SI: [[LSHR18:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C18]](s32)
+ ; SI: [[C19:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; SI: [[LSHR19:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C19]](s32)
+ ; SI: [[C20:%[0-9]+]]:_(s32) = G_CONSTANT i32 21
+ ; SI: [[LSHR20:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C20]](s32)
+ ; SI: [[C21:%[0-9]+]]:_(s32) = G_CONSTANT i32 22
+ ; SI: [[LSHR21:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C21]](s32)
+ ; SI: [[C22:%[0-9]+]]:_(s32) = G_CONSTANT i32 23
+ ; SI: [[LSHR22:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C22]](s32)
+ ; SI: [[C23:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR23:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C23]](s32)
+ ; SI: [[C24:%[0-9]+]]:_(s32) = G_CONSTANT i32 25
+ ; SI: [[LSHR24:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C24]](s32)
+ ; SI: [[C25:%[0-9]+]]:_(s32) = G_CONSTANT i32 26
+ ; SI: [[LSHR25:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C25]](s32)
+ ; SI: [[C26:%[0-9]+]]:_(s32) = G_CONSTANT i32 27
+ ; SI: [[LSHR26:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C26]](s32)
+ ; SI: [[C27:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; SI: [[LSHR27:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C27]](s32)
+ ; SI: [[C28:%[0-9]+]]:_(s32) = G_CONSTANT i32 29
+ ; SI: [[LSHR28:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C28]](s32)
+ ; SI: [[C29:%[0-9]+]]:_(s32) = G_CONSTANT i32 30
+ ; SI: [[LSHR29:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C29]](s32)
+ ; SI: [[C30:%[0-9]+]]:_(s32) = G_CONSTANT i32 31
+ ; SI: [[LSHR30:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C30]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR7]](s32)
+ ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LSHR8]](s32)
+ ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LSHR9]](s32)
+ ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LSHR10]](s32)
+ ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LSHR11]](s32)
+ ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LSHR12]](s32)
+ ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LSHR13]](s32)
+ ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LSHR14]](s32)
+ ; SI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LSHR15]](s32)
+ ; SI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LSHR16]](s32)
+ ; SI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LSHR17]](s32)
+ ; SI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LSHR18]](s32)
+ ; SI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LSHR19]](s32)
+ ; SI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LSHR20]](s32)
+ ; SI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LSHR21]](s32)
+ ; SI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LSHR22]](s32)
+ ; SI: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LSHR23]](s32)
+ ; SI: [[COPY26:%[0-9]+]]:_(s32) = COPY [[LSHR24]](s32)
+ ; SI: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LSHR25]](s32)
+ ; SI: [[COPY28:%[0-9]+]]:_(s32) = COPY [[LSHR26]](s32)
+ ; SI: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LSHR27]](s32)
+ ; SI: [[COPY30:%[0-9]+]]:_(s32) = COPY [[LSHR28]](s32)
+ ; SI: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LSHR29]](s32)
+ ; SI: [[COPY32:%[0-9]+]]:_(s32) = COPY [[LSHR30]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<32 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32), [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32), [[COPY17]](s32), [[COPY18]](s32), [[COPY19]](s32), [[COPY20]](s32), [[COPY21]](s32), [[COPY22]](s32), [[COPY23]](s32), [[COPY24]](s32), [[COPY25]](s32), [[COPY26]](s32), [[COPY27]](s32), [[COPY28]](s32), [[COPY29]](s32), [[COPY30]](s32), [[COPY31]](s32), [[COPY32]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<32 x s1>) = G_TRUNC [[BUILD_VECTOR]](<32 x s32>)
+ ; SI: $vgpr0 = COPY [[TRUNC]](<32 x s1>)
+ ; CI-HSA-LABEL: name: test_load_global_v32s1_align4
+ ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; CI-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
+ ; CI-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
+ ; CI-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-HSA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; CI-HSA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI-HSA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
+ ; CI-HSA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; CI-HSA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
+ ; CI-HSA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; CI-HSA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
+ ; CI-HSA: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; CI-HSA: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[LSHR7:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C7]](s32)
+ ; CI-HSA: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
+ ; CI-HSA: [[LSHR8:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C8]](s32)
+ ; CI-HSA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
+ ; CI-HSA: [[LSHR9:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C9]](s32)
+ ; CI-HSA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
+ ; CI-HSA: [[LSHR10:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C10]](s32)
+ ; CI-HSA: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; CI-HSA: [[LSHR11:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C11]](s32)
+ ; CI-HSA: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
+ ; CI-HSA: [[LSHR12:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C12]](s32)
+ ; CI-HSA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
+ ; CI-HSA: [[LSHR13:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C13]](s32)
+ ; CI-HSA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
+ ; CI-HSA: [[LSHR14:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C14]](s32)
+ ; CI-HSA: [[C15:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-HSA: [[LSHR15:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C15]](s32)
+ ; CI-HSA: [[C16:%[0-9]+]]:_(s32) = G_CONSTANT i32 17
+ ; CI-HSA: [[LSHR16:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C16]](s32)
+ ; CI-HSA: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 18
+ ; CI-HSA: [[LSHR17:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C17]](s32)
+ ; CI-HSA: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 19
+ ; CI-HSA: [[LSHR18:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C18]](s32)
+ ; CI-HSA: [[C19:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; CI-HSA: [[LSHR19:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C19]](s32)
+ ; CI-HSA: [[C20:%[0-9]+]]:_(s32) = G_CONSTANT i32 21
+ ; CI-HSA: [[LSHR20:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C20]](s32)
+ ; CI-HSA: [[C21:%[0-9]+]]:_(s32) = G_CONSTANT i32 22
+ ; CI-HSA: [[LSHR21:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C21]](s32)
+ ; CI-HSA: [[C22:%[0-9]+]]:_(s32) = G_CONSTANT i32 23
+ ; CI-HSA: [[LSHR22:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C22]](s32)
+ ; CI-HSA: [[C23:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-HSA: [[LSHR23:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C23]](s32)
+ ; CI-HSA: [[C24:%[0-9]+]]:_(s32) = G_CONSTANT i32 25
+ ; CI-HSA: [[LSHR24:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C24]](s32)
+ ; CI-HSA: [[C25:%[0-9]+]]:_(s32) = G_CONSTANT i32 26
+ ; CI-HSA: [[LSHR25:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C25]](s32)
+ ; CI-HSA: [[C26:%[0-9]+]]:_(s32) = G_CONSTANT i32 27
+ ; CI-HSA: [[LSHR26:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C26]](s32)
+ ; CI-HSA: [[C27:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; CI-HSA: [[LSHR27:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C27]](s32)
+ ; CI-HSA: [[C28:%[0-9]+]]:_(s32) = G_CONSTANT i32 29
+ ; CI-HSA: [[LSHR28:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C28]](s32)
+ ; CI-HSA: [[C29:%[0-9]+]]:_(s32) = G_CONSTANT i32 30
+ ; CI-HSA: [[LSHR29:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C29]](s32)
+ ; CI-HSA: [[C30:%[0-9]+]]:_(s32) = G_CONSTANT i32 31
+ ; CI-HSA: [[LSHR30:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C30]](s32)
+ ; CI-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-HSA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; CI-HSA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; CI-HSA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; CI-HSA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; CI-HSA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR7]](s32)
+ ; CI-HSA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LSHR8]](s32)
+ ; CI-HSA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LSHR9]](s32)
+ ; CI-HSA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LSHR10]](s32)
+ ; CI-HSA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LSHR11]](s32)
+ ; CI-HSA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LSHR12]](s32)
+ ; CI-HSA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LSHR13]](s32)
+ ; CI-HSA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LSHR14]](s32)
+ ; CI-HSA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LSHR15]](s32)
+ ; CI-HSA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LSHR16]](s32)
+ ; CI-HSA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LSHR17]](s32)
+ ; CI-HSA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LSHR18]](s32)
+ ; CI-HSA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LSHR19]](s32)
+ ; CI-HSA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LSHR20]](s32)
+ ; CI-HSA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LSHR21]](s32)
+ ; CI-HSA: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LSHR22]](s32)
+ ; CI-HSA: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LSHR23]](s32)
+ ; CI-HSA: [[COPY26:%[0-9]+]]:_(s32) = COPY [[LSHR24]](s32)
+ ; CI-HSA: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LSHR25]](s32)
+ ; CI-HSA: [[COPY28:%[0-9]+]]:_(s32) = COPY [[LSHR26]](s32)
+ ; CI-HSA: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LSHR27]](s32)
+ ; CI-HSA: [[COPY30:%[0-9]+]]:_(s32) = COPY [[LSHR28]](s32)
+ ; CI-HSA: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LSHR29]](s32)
+ ; CI-HSA: [[COPY32:%[0-9]+]]:_(s32) = COPY [[LSHR30]](s32)
+ ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<32 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32), [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32), [[COPY17]](s32), [[COPY18]](s32), [[COPY19]](s32), [[COPY20]](s32), [[COPY21]](s32), [[COPY22]](s32), [[COPY23]](s32), [[COPY24]](s32), [[COPY25]](s32), [[COPY26]](s32), [[COPY27]](s32), [[COPY28]](s32), [[COPY29]](s32), [[COPY30]](s32), [[COPY31]](s32), [[COPY32]](s32)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(<32 x s1>) = G_TRUNC [[BUILD_VECTOR]](<32 x s32>)
+ ; CI-HSA: $vgpr0 = COPY [[TRUNC]](<32 x s1>)
+ ; CI-MESA-LABEL: name: test_load_global_v32s1_align4
+ ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; CI-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
+ ; CI-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; CI-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
+ ; CI-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
+ ; CI-MESA: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; CI-MESA: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR7:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C7]](s32)
+ ; CI-MESA: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
+ ; CI-MESA: [[LSHR8:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C8]](s32)
+ ; CI-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
+ ; CI-MESA: [[LSHR9:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C9]](s32)
+ ; CI-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
+ ; CI-MESA: [[LSHR10:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C10]](s32)
+ ; CI-MESA: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; CI-MESA: [[LSHR11:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C11]](s32)
+ ; CI-MESA: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
+ ; CI-MESA: [[LSHR12:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C12]](s32)
+ ; CI-MESA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
+ ; CI-MESA: [[LSHR13:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C13]](s32)
+ ; CI-MESA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
+ ; CI-MESA: [[LSHR14:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C14]](s32)
+ ; CI-MESA: [[C15:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR15:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C15]](s32)
+ ; CI-MESA: [[C16:%[0-9]+]]:_(s32) = G_CONSTANT i32 17
+ ; CI-MESA: [[LSHR16:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C16]](s32)
+ ; CI-MESA: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 18
+ ; CI-MESA: [[LSHR17:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C17]](s32)
+ ; CI-MESA: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 19
+ ; CI-MESA: [[LSHR18:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C18]](s32)
+ ; CI-MESA: [[C19:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; CI-MESA: [[LSHR19:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C19]](s32)
+ ; CI-MESA: [[C20:%[0-9]+]]:_(s32) = G_CONSTANT i32 21
+ ; CI-MESA: [[LSHR20:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C20]](s32)
+ ; CI-MESA: [[C21:%[0-9]+]]:_(s32) = G_CONSTANT i32 22
+ ; CI-MESA: [[LSHR21:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C21]](s32)
+ ; CI-MESA: [[C22:%[0-9]+]]:_(s32) = G_CONSTANT i32 23
+ ; CI-MESA: [[LSHR22:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C22]](s32)
+ ; CI-MESA: [[C23:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR23:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C23]](s32)
+ ; CI-MESA: [[C24:%[0-9]+]]:_(s32) = G_CONSTANT i32 25
+ ; CI-MESA: [[LSHR24:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C24]](s32)
+ ; CI-MESA: [[C25:%[0-9]+]]:_(s32) = G_CONSTANT i32 26
+ ; CI-MESA: [[LSHR25:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C25]](s32)
+ ; CI-MESA: [[C26:%[0-9]+]]:_(s32) = G_CONSTANT i32 27
+ ; CI-MESA: [[LSHR26:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C26]](s32)
+ ; CI-MESA: [[C27:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; CI-MESA: [[LSHR27:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C27]](s32)
+ ; CI-MESA: [[C28:%[0-9]+]]:_(s32) = G_CONSTANT i32 29
+ ; CI-MESA: [[LSHR28:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C28]](s32)
+ ; CI-MESA: [[C29:%[0-9]+]]:_(s32) = G_CONSTANT i32 30
+ ; CI-MESA: [[LSHR29:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C29]](s32)
+ ; CI-MESA: [[C30:%[0-9]+]]:_(s32) = G_CONSTANT i32 31
+ ; CI-MESA: [[LSHR30:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C30]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; CI-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR7]](s32)
+ ; CI-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LSHR8]](s32)
+ ; CI-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LSHR9]](s32)
+ ; CI-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LSHR10]](s32)
+ ; CI-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LSHR11]](s32)
+ ; CI-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LSHR12]](s32)
+ ; CI-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LSHR13]](s32)
+ ; CI-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LSHR14]](s32)
+ ; CI-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LSHR15]](s32)
+ ; CI-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LSHR16]](s32)
+ ; CI-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LSHR17]](s32)
+ ; CI-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LSHR18]](s32)
+ ; CI-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LSHR19]](s32)
+ ; CI-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LSHR20]](s32)
+ ; CI-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LSHR21]](s32)
+ ; CI-MESA: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LSHR22]](s32)
+ ; CI-MESA: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LSHR23]](s32)
+ ; CI-MESA: [[COPY26:%[0-9]+]]:_(s32) = COPY [[LSHR24]](s32)
+ ; CI-MESA: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LSHR25]](s32)
+ ; CI-MESA: [[COPY28:%[0-9]+]]:_(s32) = COPY [[LSHR26]](s32)
+ ; CI-MESA: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LSHR27]](s32)
+ ; CI-MESA: [[COPY30:%[0-9]+]]:_(s32) = COPY [[LSHR28]](s32)
+ ; CI-MESA: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LSHR29]](s32)
+ ; CI-MESA: [[COPY32:%[0-9]+]]:_(s32) = COPY [[LSHR30]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<32 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32), [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32), [[COPY17]](s32), [[COPY18]](s32), [[COPY19]](s32), [[COPY20]](s32), [[COPY21]](s32), [[COPY22]](s32), [[COPY23]](s32), [[COPY24]](s32), [[COPY25]](s32), [[COPY26]](s32), [[COPY27]](s32), [[COPY28]](s32), [[COPY29]](s32), [[COPY30]](s32), [[COPY31]](s32), [[COPY32]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<32 x s1>) = G_TRUNC [[BUILD_VECTOR]](<32 x s32>)
+ ; CI-MESA: $vgpr0 = COPY [[TRUNC]](<32 x s1>)
+ ; VI-LABEL: name: test_load_global_v32s1_align4
+ ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; VI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
+ ; VI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
+ ; VI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
+ ; VI: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR7:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C7]](s32)
+ ; VI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
+ ; VI: [[LSHR8:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C8]](s32)
+ ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
+ ; VI: [[LSHR9:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C9]](s32)
+ ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
+ ; VI: [[LSHR10:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C10]](s32)
+ ; VI: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; VI: [[LSHR11:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C11]](s32)
+ ; VI: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
+ ; VI: [[LSHR12:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C12]](s32)
+ ; VI: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
+ ; VI: [[LSHR13:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C13]](s32)
+ ; VI: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
+ ; VI: [[LSHR14:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C14]](s32)
+ ; VI: [[C15:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR15:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C15]](s32)
+ ; VI: [[C16:%[0-9]+]]:_(s32) = G_CONSTANT i32 17
+ ; VI: [[LSHR16:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C16]](s32)
+ ; VI: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 18
+ ; VI: [[LSHR17:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C17]](s32)
+ ; VI: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 19
+ ; VI: [[LSHR18:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C18]](s32)
+ ; VI: [[C19:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; VI: [[LSHR19:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C19]](s32)
+ ; VI: [[C20:%[0-9]+]]:_(s32) = G_CONSTANT i32 21
+ ; VI: [[LSHR20:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C20]](s32)
+ ; VI: [[C21:%[0-9]+]]:_(s32) = G_CONSTANT i32 22
+ ; VI: [[LSHR21:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C21]](s32)
+ ; VI: [[C22:%[0-9]+]]:_(s32) = G_CONSTANT i32 23
+ ; VI: [[LSHR22:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C22]](s32)
+ ; VI: [[C23:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR23:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C23]](s32)
+ ; VI: [[C24:%[0-9]+]]:_(s32) = G_CONSTANT i32 25
+ ; VI: [[LSHR24:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C24]](s32)
+ ; VI: [[C25:%[0-9]+]]:_(s32) = G_CONSTANT i32 26
+ ; VI: [[LSHR25:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C25]](s32)
+ ; VI: [[C26:%[0-9]+]]:_(s32) = G_CONSTANT i32 27
+ ; VI: [[LSHR26:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C26]](s32)
+ ; VI: [[C27:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; VI: [[LSHR27:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C27]](s32)
+ ; VI: [[C28:%[0-9]+]]:_(s32) = G_CONSTANT i32 29
+ ; VI: [[LSHR28:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C28]](s32)
+ ; VI: [[C29:%[0-9]+]]:_(s32) = G_CONSTANT i32 30
+ ; VI: [[LSHR29:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C29]](s32)
+ ; VI: [[C30:%[0-9]+]]:_(s32) = G_CONSTANT i32 31
+ ; VI: [[LSHR30:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C30]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; VI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR7]](s32)
+ ; VI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LSHR8]](s32)
+ ; VI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LSHR9]](s32)
+ ; VI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LSHR10]](s32)
+ ; VI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LSHR11]](s32)
+ ; VI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LSHR12]](s32)
+ ; VI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LSHR13]](s32)
+ ; VI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LSHR14]](s32)
+ ; VI: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LSHR15]](s32)
+ ; VI: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LSHR16]](s32)
+ ; VI: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LSHR17]](s32)
+ ; VI: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LSHR18]](s32)
+ ; VI: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LSHR19]](s32)
+ ; VI: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LSHR20]](s32)
+ ; VI: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LSHR21]](s32)
+ ; VI: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LSHR22]](s32)
+ ; VI: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LSHR23]](s32)
+ ; VI: [[COPY26:%[0-9]+]]:_(s32) = COPY [[LSHR24]](s32)
+ ; VI: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LSHR25]](s32)
+ ; VI: [[COPY28:%[0-9]+]]:_(s32) = COPY [[LSHR26]](s32)
+ ; VI: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LSHR27]](s32)
+ ; VI: [[COPY30:%[0-9]+]]:_(s32) = COPY [[LSHR28]](s32)
+ ; VI: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LSHR29]](s32)
+ ; VI: [[COPY32:%[0-9]+]]:_(s32) = COPY [[LSHR30]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<32 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32), [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32), [[COPY17]](s32), [[COPY18]](s32), [[COPY19]](s32), [[COPY20]](s32), [[COPY21]](s32), [[COPY22]](s32), [[COPY23]](s32), [[COPY24]](s32), [[COPY25]](s32), [[COPY26]](s32), [[COPY27]](s32), [[COPY28]](s32), [[COPY29]](s32), [[COPY30]](s32), [[COPY31]](s32), [[COPY32]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<32 x s1>) = G_TRUNC [[BUILD_VECTOR]](<32 x s32>)
+ ; VI: $vgpr0 = COPY [[TRUNC]](<32 x s1>)
+ ; GFX9-HSA-LABEL: name: test_load_global_v32s1_align4
+ ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX9-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; GFX9-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
+ ; GFX9-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
+ ; GFX9-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-HSA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; GFX9-HSA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9-HSA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
+ ; GFX9-HSA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; GFX9-HSA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
+ ; GFX9-HSA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; GFX9-HSA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
+ ; GFX9-HSA: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; GFX9-HSA: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-HSA: [[LSHR7:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C7]](s32)
+ ; GFX9-HSA: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
+ ; GFX9-HSA: [[LSHR8:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C8]](s32)
+ ; GFX9-HSA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
+ ; GFX9-HSA: [[LSHR9:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C9]](s32)
+ ; GFX9-HSA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
+ ; GFX9-HSA: [[LSHR10:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C10]](s32)
+ ; GFX9-HSA: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; GFX9-HSA: [[LSHR11:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C11]](s32)
+ ; GFX9-HSA: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
+ ; GFX9-HSA: [[LSHR12:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C12]](s32)
+ ; GFX9-HSA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
+ ; GFX9-HSA: [[LSHR13:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C13]](s32)
+ ; GFX9-HSA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
+ ; GFX9-HSA: [[LSHR14:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C14]](s32)
+ ; GFX9-HSA: [[C15:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-HSA: [[LSHR15:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C15]](s32)
+ ; GFX9-HSA: [[C16:%[0-9]+]]:_(s32) = G_CONSTANT i32 17
+ ; GFX9-HSA: [[LSHR16:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C16]](s32)
+ ; GFX9-HSA: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 18
+ ; GFX9-HSA: [[LSHR17:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C17]](s32)
+ ; GFX9-HSA: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 19
+ ; GFX9-HSA: [[LSHR18:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C18]](s32)
+ ; GFX9-HSA: [[C19:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; GFX9-HSA: [[LSHR19:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C19]](s32)
+ ; GFX9-HSA: [[C20:%[0-9]+]]:_(s32) = G_CONSTANT i32 21
+ ; GFX9-HSA: [[LSHR20:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C20]](s32)
+ ; GFX9-HSA: [[C21:%[0-9]+]]:_(s32) = G_CONSTANT i32 22
+ ; GFX9-HSA: [[LSHR21:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C21]](s32)
+ ; GFX9-HSA: [[C22:%[0-9]+]]:_(s32) = G_CONSTANT i32 23
+ ; GFX9-HSA: [[LSHR22:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C22]](s32)
+ ; GFX9-HSA: [[C23:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-HSA: [[LSHR23:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C23]](s32)
+ ; GFX9-HSA: [[C24:%[0-9]+]]:_(s32) = G_CONSTANT i32 25
+ ; GFX9-HSA: [[LSHR24:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C24]](s32)
+ ; GFX9-HSA: [[C25:%[0-9]+]]:_(s32) = G_CONSTANT i32 26
+ ; GFX9-HSA: [[LSHR25:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C25]](s32)
+ ; GFX9-HSA: [[C26:%[0-9]+]]:_(s32) = G_CONSTANT i32 27
+ ; GFX9-HSA: [[LSHR26:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C26]](s32)
+ ; GFX9-HSA: [[C27:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; GFX9-HSA: [[LSHR27:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C27]](s32)
+ ; GFX9-HSA: [[C28:%[0-9]+]]:_(s32) = G_CONSTANT i32 29
+ ; GFX9-HSA: [[LSHR28:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C28]](s32)
+ ; GFX9-HSA: [[C29:%[0-9]+]]:_(s32) = G_CONSTANT i32 30
+ ; GFX9-HSA: [[LSHR29:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C29]](s32)
+ ; GFX9-HSA: [[C30:%[0-9]+]]:_(s32) = G_CONSTANT i32 31
+ ; GFX9-HSA: [[LSHR30:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C30]](s32)
+ ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-HSA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; GFX9-HSA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY5]](s32), [[COPY6]](s32)
+ ; GFX9-HSA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; GFX9-HSA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY7]](s32), [[COPY8]](s32)
+ ; GFX9-HSA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR7]](s32)
+ ; GFX9-HSA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LSHR8]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC4:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY9]](s32), [[COPY10]](s32)
+ ; GFX9-HSA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LSHR9]](s32)
+ ; GFX9-HSA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LSHR10]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC5:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY11]](s32), [[COPY12]](s32)
+ ; GFX9-HSA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LSHR11]](s32)
+ ; GFX9-HSA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LSHR12]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC6:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY13]](s32), [[COPY14]](s32)
+ ; GFX9-HSA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LSHR13]](s32)
+ ; GFX9-HSA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LSHR14]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC7:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY15]](s32), [[COPY16]](s32)
+ ; GFX9-HSA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LSHR15]](s32)
+ ; GFX9-HSA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LSHR16]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC8:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY17]](s32), [[COPY18]](s32)
+ ; GFX9-HSA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LSHR17]](s32)
+ ; GFX9-HSA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LSHR18]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC9:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY19]](s32), [[COPY20]](s32)
+ ; GFX9-HSA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LSHR19]](s32)
+ ; GFX9-HSA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LSHR20]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC10:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY21]](s32), [[COPY22]](s32)
+ ; GFX9-HSA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LSHR21]](s32)
+ ; GFX9-HSA: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LSHR22]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC11:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY23]](s32), [[COPY24]](s32)
+ ; GFX9-HSA: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LSHR23]](s32)
+ ; GFX9-HSA: [[COPY26:%[0-9]+]]:_(s32) = COPY [[LSHR24]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC12:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY25]](s32), [[COPY26]](s32)
+ ; GFX9-HSA: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LSHR25]](s32)
+ ; GFX9-HSA: [[COPY28:%[0-9]+]]:_(s32) = COPY [[LSHR26]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC13:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY27]](s32), [[COPY28]](s32)
+ ; GFX9-HSA: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LSHR27]](s32)
+ ; GFX9-HSA: [[COPY30:%[0-9]+]]:_(s32) = COPY [[LSHR28]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC14:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY29]](s32), [[COPY30]](s32)
+ ; GFX9-HSA: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LSHR29]](s32)
+ ; GFX9-HSA: [[COPY32:%[0-9]+]]:_(s32) = COPY [[LSHR30]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC15:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY31]](s32), [[COPY32]](s32)
+ ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<32 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>), [[BUILD_VECTOR_TRUNC2]](<2 x s16>), [[BUILD_VECTOR_TRUNC3]](<2 x s16>), [[BUILD_VECTOR_TRUNC4]](<2 x s16>), [[BUILD_VECTOR_TRUNC5]](<2 x s16>), [[BUILD_VECTOR_TRUNC6]](<2 x s16>), [[BUILD_VECTOR_TRUNC7]](<2 x s16>), [[BUILD_VECTOR_TRUNC8]](<2 x s16>), [[BUILD_VECTOR_TRUNC9]](<2 x s16>), [[BUILD_VECTOR_TRUNC10]](<2 x s16>), [[BUILD_VECTOR_TRUNC11]](<2 x s16>), [[BUILD_VECTOR_TRUNC12]](<2 x s16>), [[BUILD_VECTOR_TRUNC13]](<2 x s16>), [[BUILD_VECTOR_TRUNC14]](<2 x s16>), [[BUILD_VECTOR_TRUNC15]](<2 x s16>)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(<32 x s1>) = G_TRUNC [[CONCAT_VECTORS]](<32 x s16>)
+ ; GFX9-HSA: $vgpr0 = COPY [[TRUNC]](<32 x s1>)
+ ; GFX9-MESA-LABEL: name: test_load_global_v32s1_align4
+ ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; GFX9-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
+ ; GFX9-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
+ ; GFX9-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; GFX9-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
+ ; GFX9-MESA: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; GFX9-MESA: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR7:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C7]](s32)
+ ; GFX9-MESA: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
+ ; GFX9-MESA: [[LSHR8:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C8]](s32)
+ ; GFX9-MESA: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
+ ; GFX9-MESA: [[LSHR9:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C9]](s32)
+ ; GFX9-MESA: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
+ ; GFX9-MESA: [[LSHR10:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C10]](s32)
+ ; GFX9-MESA: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; GFX9-MESA: [[LSHR11:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C11]](s32)
+ ; GFX9-MESA: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
+ ; GFX9-MESA: [[LSHR12:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C12]](s32)
+ ; GFX9-MESA: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
+ ; GFX9-MESA: [[LSHR13:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C13]](s32)
+ ; GFX9-MESA: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
+ ; GFX9-MESA: [[LSHR14:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C14]](s32)
+ ; GFX9-MESA: [[C15:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR15:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C15]](s32)
+ ; GFX9-MESA: [[C16:%[0-9]+]]:_(s32) = G_CONSTANT i32 17
+ ; GFX9-MESA: [[LSHR16:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C16]](s32)
+ ; GFX9-MESA: [[C17:%[0-9]+]]:_(s32) = G_CONSTANT i32 18
+ ; GFX9-MESA: [[LSHR17:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C17]](s32)
+ ; GFX9-MESA: [[C18:%[0-9]+]]:_(s32) = G_CONSTANT i32 19
+ ; GFX9-MESA: [[LSHR18:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C18]](s32)
+ ; GFX9-MESA: [[C19:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; GFX9-MESA: [[LSHR19:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C19]](s32)
+ ; GFX9-MESA: [[C20:%[0-9]+]]:_(s32) = G_CONSTANT i32 21
+ ; GFX9-MESA: [[LSHR20:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C20]](s32)
+ ; GFX9-MESA: [[C21:%[0-9]+]]:_(s32) = G_CONSTANT i32 22
+ ; GFX9-MESA: [[LSHR21:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C21]](s32)
+ ; GFX9-MESA: [[C22:%[0-9]+]]:_(s32) = G_CONSTANT i32 23
+ ; GFX9-MESA: [[LSHR22:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C22]](s32)
+ ; GFX9-MESA: [[C23:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR23:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C23]](s32)
+ ; GFX9-MESA: [[C24:%[0-9]+]]:_(s32) = G_CONSTANT i32 25
+ ; GFX9-MESA: [[LSHR24:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C24]](s32)
+ ; GFX9-MESA: [[C25:%[0-9]+]]:_(s32) = G_CONSTANT i32 26
+ ; GFX9-MESA: [[LSHR25:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C25]](s32)
+ ; GFX9-MESA: [[C26:%[0-9]+]]:_(s32) = G_CONSTANT i32 27
+ ; GFX9-MESA: [[LSHR26:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C26]](s32)
+ ; GFX9-MESA: [[C27:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; GFX9-MESA: [[LSHR27:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C27]](s32)
+ ; GFX9-MESA: [[C28:%[0-9]+]]:_(s32) = G_CONSTANT i32 29
+ ; GFX9-MESA: [[LSHR28:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C28]](s32)
+ ; GFX9-MESA: [[C29:%[0-9]+]]:_(s32) = G_CONSTANT i32 30
+ ; GFX9-MESA: [[LSHR29:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C29]](s32)
+ ; GFX9-MESA: [[C30:%[0-9]+]]:_(s32) = G_CONSTANT i32 31
+ ; GFX9-MESA: [[LSHR30:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C30]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY5]](s32), [[COPY6]](s32)
+ ; GFX9-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; GFX9-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY7]](s32), [[COPY8]](s32)
+ ; GFX9-MESA: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LSHR7]](s32)
+ ; GFX9-MESA: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LSHR8]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC4:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY9]](s32), [[COPY10]](s32)
+ ; GFX9-MESA: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LSHR9]](s32)
+ ; GFX9-MESA: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LSHR10]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC5:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY11]](s32), [[COPY12]](s32)
+ ; GFX9-MESA: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LSHR11]](s32)
+ ; GFX9-MESA: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LSHR12]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC6:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY13]](s32), [[COPY14]](s32)
+ ; GFX9-MESA: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LSHR13]](s32)
+ ; GFX9-MESA: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LSHR14]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC7:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY15]](s32), [[COPY16]](s32)
+ ; GFX9-MESA: [[COPY17:%[0-9]+]]:_(s32) = COPY [[LSHR15]](s32)
+ ; GFX9-MESA: [[COPY18:%[0-9]+]]:_(s32) = COPY [[LSHR16]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC8:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY17]](s32), [[COPY18]](s32)
+ ; GFX9-MESA: [[COPY19:%[0-9]+]]:_(s32) = COPY [[LSHR17]](s32)
+ ; GFX9-MESA: [[COPY20:%[0-9]+]]:_(s32) = COPY [[LSHR18]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC9:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY19]](s32), [[COPY20]](s32)
+ ; GFX9-MESA: [[COPY21:%[0-9]+]]:_(s32) = COPY [[LSHR19]](s32)
+ ; GFX9-MESA: [[COPY22:%[0-9]+]]:_(s32) = COPY [[LSHR20]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC10:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY21]](s32), [[COPY22]](s32)
+ ; GFX9-MESA: [[COPY23:%[0-9]+]]:_(s32) = COPY [[LSHR21]](s32)
+ ; GFX9-MESA: [[COPY24:%[0-9]+]]:_(s32) = COPY [[LSHR22]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC11:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY23]](s32), [[COPY24]](s32)
+ ; GFX9-MESA: [[COPY25:%[0-9]+]]:_(s32) = COPY [[LSHR23]](s32)
+ ; GFX9-MESA: [[COPY26:%[0-9]+]]:_(s32) = COPY [[LSHR24]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC12:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY25]](s32), [[COPY26]](s32)
+ ; GFX9-MESA: [[COPY27:%[0-9]+]]:_(s32) = COPY [[LSHR25]](s32)
+ ; GFX9-MESA: [[COPY28:%[0-9]+]]:_(s32) = COPY [[LSHR26]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC13:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY27]](s32), [[COPY28]](s32)
+ ; GFX9-MESA: [[COPY29:%[0-9]+]]:_(s32) = COPY [[LSHR27]](s32)
+ ; GFX9-MESA: [[COPY30:%[0-9]+]]:_(s32) = COPY [[LSHR28]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC14:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY29]](s32), [[COPY30]](s32)
+ ; GFX9-MESA: [[COPY31:%[0-9]+]]:_(s32) = COPY [[LSHR29]](s32)
+ ; GFX9-MESA: [[COPY32:%[0-9]+]]:_(s32) = COPY [[LSHR30]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC15:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY31]](s32), [[COPY32]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<32 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>), [[BUILD_VECTOR_TRUNC2]](<2 x s16>), [[BUILD_VECTOR_TRUNC3]](<2 x s16>), [[BUILD_VECTOR_TRUNC4]](<2 x s16>), [[BUILD_VECTOR_TRUNC5]](<2 x s16>), [[BUILD_VECTOR_TRUNC6]](<2 x s16>), [[BUILD_VECTOR_TRUNC7]](<2 x s16>), [[BUILD_VECTOR_TRUNC8]](<2 x s16>), [[BUILD_VECTOR_TRUNC9]](<2 x s16>), [[BUILD_VECTOR_TRUNC10]](<2 x s16>), [[BUILD_VECTOR_TRUNC11]](<2 x s16>), [[BUILD_VECTOR_TRUNC12]](<2 x s16>), [[BUILD_VECTOR_TRUNC13]](<2 x s16>), [[BUILD_VECTOR_TRUNC14]](<2 x s16>), [[BUILD_VECTOR_TRUNC15]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<32 x s1>) = G_TRUNC [[CONCAT_VECTORS]](<32 x s16>)
+ ; GFX9-MESA: $vgpr0 = COPY [[TRUNC]](<32 x s1>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<32 x s1>) = G_LOAD %0 :: (load 4, align 4, addrspace 1)
+ $vgpr0 = COPY %1
+...
+
+---
+name: test_load_global_v8s4_align4
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; SI-LABEL: name: test_load_global_v8s4_align4
+ ; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; SI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; SI: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<8 x s4>) = G_TRUNC [[BUILD_VECTOR]](<8 x s32>)
+ ; SI: $vgpr0 = COPY [[TRUNC]](<8 x s4>)
+ ; CI-HSA-LABEL: name: test_load_global_v8s4_align4
+ ; CI-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; CI-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; CI-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; CI-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; CI-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-HSA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-HSA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI-HSA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; CI-HSA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; CI-HSA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-HSA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; CI-HSA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; CI-HSA: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; CI-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-HSA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; CI-HSA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; CI-HSA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; CI-HSA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; CI-HSA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
+ ; CI-HSA: [[TRUNC:%[0-9]+]]:_(<8 x s4>) = G_TRUNC [[BUILD_VECTOR]](<8 x s32>)
+ ; CI-HSA: $vgpr0 = COPY [[TRUNC]](<8 x s4>)
+ ; CI-MESA-LABEL: name: test_load_global_v8s4_align4
+ ; CI-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; CI-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; CI-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; CI-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; CI-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; CI-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; CI-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; CI-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; CI-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; CI-MESA: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; CI-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; CI-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; CI-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; CI-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; CI-MESA: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
+ ; CI-MESA: [[TRUNC:%[0-9]+]]:_(<8 x s4>) = G_TRUNC [[BUILD_VECTOR]](<8 x s32>)
+ ; CI-MESA: $vgpr0 = COPY [[TRUNC]](<8 x s4>)
+ ; VI-LABEL: name: test_load_global_v8s4_align4
+ ; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; VI: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; VI: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<8 x s4>) = G_TRUNC [[BUILD_VECTOR]](<8 x s32>)
+ ; VI: $vgpr0 = COPY [[TRUNC]](<8 x s4>)
+ ; GFX9-HSA-LABEL: name: test_load_global_v8s4_align4
+ ; GFX9-HSA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX9-HSA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX9-HSA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; GFX9-HSA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-HSA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-HSA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-HSA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; GFX9-HSA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-HSA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-HSA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9-HSA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; GFX9-HSA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; GFX9-HSA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-HSA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; GFX9-HSA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; GFX9-HSA: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; GFX9-HSA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-HSA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-HSA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-HSA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-HSA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; GFX9-HSA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY5]](s32), [[COPY6]](s32)
+ ; GFX9-HSA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; GFX9-HSA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; GFX9-HSA: [[BUILD_VECTOR_TRUNC3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY7]](s32), [[COPY8]](s32)
+ ; GFX9-HSA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>), [[BUILD_VECTOR_TRUNC2]](<2 x s16>), [[BUILD_VECTOR_TRUNC3]](<2 x s16>)
+ ; GFX9-HSA: [[TRUNC:%[0-9]+]]:_(<8 x s4>) = G_TRUNC [[CONCAT_VECTORS]](<8 x s16>)
+ ; GFX9-HSA: $vgpr0 = COPY [[TRUNC]](<8 x s4>)
+ ; GFX9-MESA-LABEL: name: test_load_global_v8s4_align4
+ ; GFX9-MESA: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX9-MESA: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX9-MESA: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
+ ; GFX9-MESA: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9-MESA: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9-MESA: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9-MESA: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; GFX9-MESA: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9-MESA: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9-MESA: [[LSHR3:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C3]](s32)
+ ; GFX9-MESA: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 20
+ ; GFX9-MESA: [[LSHR4:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C4]](s32)
+ ; GFX9-MESA: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9-MESA: [[LSHR5:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C5]](s32)
+ ; GFX9-MESA: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 28
+ ; GFX9-MESA: [[LSHR6:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C6]](s32)
+ ; GFX9-MESA: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9-MESA: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9-MESA: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9-MESA: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9-MESA: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LSHR3]](s32)
+ ; GFX9-MESA: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LSHR4]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY5]](s32), [[COPY6]](s32)
+ ; GFX9-MESA: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LSHR5]](s32)
+ ; GFX9-MESA: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LSHR6]](s32)
+ ; GFX9-MESA: [[BUILD_VECTOR_TRUNC3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY7]](s32), [[COPY8]](s32)
+ ; GFX9-MESA: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>), [[BUILD_VECTOR_TRUNC2]](<2 x s16>), [[BUILD_VECTOR_TRUNC3]](<2 x s16>)
+ ; GFX9-MESA: [[TRUNC:%[0-9]+]]:_(<8 x s4>) = G_TRUNC [[CONCAT_VECTORS]](<8 x s16>)
+ ; GFX9-MESA: $vgpr0 = COPY [[TRUNC]](<8 x s4>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<8 x s4>) = G_LOAD %0 :: (load 4, align 4, addrspace 1)
+ $vgpr0 = COPY %1
+...
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
index a4359d2d37ec..0164bcc2ca73 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-local.mir
@@ -637,28 +637,33 @@ body: |
; SI-LABEL: name: test_load_local_s48_align8
; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; SI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p3) :: (load 6, align 8, addrspace 3)
- ; SI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; SI: $vgpr0_vgpr1 = COPY [[COPY1]](s64)
+ ; SI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; SI: [[ANYEXT:%[0-9]+]]:_(s64) = G_ANYEXT [[TRUNC]](s48)
+ ; SI: $vgpr0_vgpr1 = COPY [[ANYEXT]](s64)
; CI-LABEL: name: test_load_local_s48_align8
; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; CI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p3) :: (load 6, align 8, addrspace 3)
- ; CI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; CI: $vgpr0_vgpr1 = COPY [[COPY1]](s64)
+ ; CI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; CI: [[ANYEXT:%[0-9]+]]:_(s64) = G_ANYEXT [[TRUNC]](s48)
+ ; CI: $vgpr0_vgpr1 = COPY [[ANYEXT]](s64)
; CI-DS128-LABEL: name: test_load_local_s48_align8
; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; CI-DS128: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p3) :: (load 6, align 8, addrspace 3)
- ; CI-DS128: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; CI-DS128: $vgpr0_vgpr1 = COPY [[COPY1]](s64)
+ ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; CI-DS128: [[ANYEXT:%[0-9]+]]:_(s64) = G_ANYEXT [[TRUNC]](s48)
+ ; CI-DS128: $vgpr0_vgpr1 = COPY [[ANYEXT]](s64)
; VI-LABEL: name: test_load_local_s48_align8
; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; VI: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p3) :: (load 6, align 8, addrspace 3)
- ; VI: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; VI: $vgpr0_vgpr1 = COPY [[COPY1]](s64)
+ ; VI: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; VI: [[ANYEXT:%[0-9]+]]:_(s64) = G_ANYEXT [[TRUNC]](s48)
+ ; VI: $vgpr0_vgpr1 = COPY [[ANYEXT]](s64)
; GFX9-LABEL: name: test_load_local_s48_align8
; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; GFX9: [[LOAD:%[0-9]+]]:_(s64) = G_LOAD [[COPY]](p3) :: (load 6, align 8, addrspace 3)
- ; GFX9: [[COPY1:%[0-9]+]]:_(s64) = COPY [[LOAD]](s64)
- ; GFX9: $vgpr0_vgpr1 = COPY [[COPY1]](s64)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(s48) = G_TRUNC [[LOAD]](s64)
+ ; GFX9: [[ANYEXT:%[0-9]+]]:_(s64) = G_ANYEXT [[TRUNC]](s48)
+ ; GFX9: $vgpr0_vgpr1 = COPY [[ANYEXT]](s64)
%0:_(p3) = COPY $vgpr0
%1:_(s48) = G_LOAD %0 :: (load 6, align 8, addrspace 3)
%2:_(s64) = G_ANYEXT %1
@@ -5241,43 +5246,128 @@ body: |
; SI-LABEL: name: test_load_local_v2s8_align2
; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; SI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
- ; SI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; SI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; SI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; SI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; SI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; SI: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-LABEL: name: test_load_local_v2s8_align2
; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
- ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-DS128-LABEL: name: test_load_local_v2s8_align2
; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; CI-DS128: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
- ; CI-DS128: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
+ ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-DS128: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-DS128: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-DS128: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-DS128: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-DS128: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI-DS128: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI-DS128: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI-DS128: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI-DS128: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI-DS128: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI-DS128: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI-DS128: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI-DS128: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI-DS128: $vgpr0 = COPY [[ANYEXT]](s32)
; VI-LABEL: name: test_load_local_v2s8_align2
; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
- ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; VI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; VI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; VI: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; VI: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; VI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-LABEL: name: test_load_local_v2s8_align2
; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
- ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 2, addrspace 3)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9: $vgpr0 = COPY [[ANYEXT]](s32)
%0:_(p3) = COPY $vgpr0
%1:_(<2 x s8>) = G_LOAD %0 :: (load 2, align 2, addrspace 3)
@@ -5470,24 +5560,86 @@ body: |
; SI-LABEL: name: test_load_local_v4s8_align4
; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; SI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
- ; SI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-LABEL: name: test_load_local_v4s8_align4
; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
- ; CI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-DS128-LABEL: name: test_load_local_v4s8_align4
; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; CI-DS128: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
- ; CI-DS128: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
+ ; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-DS128: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI-DS128: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI-DS128: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI-DS128: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-DS128: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; VI-LABEL: name: test_load_local_v4s8_align4
; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
- ; VI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-LABEL: name: test_load_local_v4s8_align4
; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
- ; GFX9: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 4, addrspace 3)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
%0:_(p3) = COPY $vgpr0
%1:_(<4 x s8>) = G_LOAD %0 :: (load 4, align 4, addrspace 3)
$vgpr0 = COPY %1
@@ -5501,24 +5653,29 @@ body: |
; SI-LABEL: name: test_load_local_v8s8_align8
; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; SI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
- ; SI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; SI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
+ ; SI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; SI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; CI-LABEL: name: test_load_local_v8s8_align8
; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; CI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
- ; CI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; CI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; CI-DS128-LABEL: name: test_load_local_v8s8_align8
; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; CI-DS128: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
- ; CI-DS128: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; CI-DS128: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
+ ; CI-DS128: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; CI-DS128: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; VI-LABEL: name: test_load_local_v8s8_align8
; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; VI: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
- ; VI: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; VI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; GFX9-LABEL: name: test_load_local_v8s8_align8
; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
- ; GFX9: [[LOAD:%[0-9]+]]:_(<8 x s8>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
- ; GFX9: $vgpr0_vgpr1 = COPY [[LOAD]](<8 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(<2 x s32>) = G_LOAD [[COPY]](p3) :: (load 8, addrspace 3)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[LOAD]](<2 x s32>)
+ ; GFX9: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
%0:_(p3) = COPY $vgpr0
%1:_(<8 x s8>) = G_LOAD %0 :: (load 8, align 8, addrspace 3)
$vgpr0_vgpr1 = COPY %1
@@ -5533,338 +5690,315 @@ body: |
; SI-LABEL: name: test_load_local_v16s8_align16
; SI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3)
+ ; SI: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; SI: [[PTR_ADD:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C]](s32)
; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p3) :: (load 1 + 1, addrspace 3)
+ ; SI: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; SI: [[PTR_ADD1:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C1]](s32)
; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p3) :: (load 1 + 2, addrspace 3)
+ ; SI: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; SI: [[PTR_ADD2:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C2]](s32)
; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p3) :: (load 1 + 3, addrspace 3)
+ ; SI: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; SI: [[PTR_ADD3:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C3]](s32)
; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p3) :: (load 1 + 4, addrspace 3)
- ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
- ; SI: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
+ ; SI: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
+ ; SI: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p3) :: (load 1 + 5, addrspace 3)
- ; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
- ; SI: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C5]](s32)
+ ; SI: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
+ ; SI: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p3) :: (load 1 + 6, addrspace 3)
- ; SI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
- ; SI: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C6]](s32)
+ ; SI: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
+ ; SI: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p3) :: (load 1 + 7, addrspace 3)
- ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
- ; SI: [[TRUNC:%[0-9]+]]:_(<8 x s8>) = G_TRUNC [[BUILD_VECTOR]](<8 x s32>)
- ; SI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
- ; SI: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C7]](s32)
+ ; SI: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32)
+ ; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p3) :: (load 1 + 8, addrspace 3)
+ ; SI: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
; SI: [[PTR_ADD8:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p3) :: (load 1 + 9, addrspace 3)
+ ; SI: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
; SI: [[PTR_ADD9:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; SI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p3) :: (load 1 + 10, addrspace 3)
+ ; SI: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
; SI: [[PTR_ADD10:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; SI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p3) :: (load 1 + 11, addrspace 3)
+ ; SI: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
; SI: [[PTR_ADD11:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C3]](s32)
; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p3) :: (load 1 + 12, addrspace 3)
- ; SI: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C4]](s32)
+ ; SI: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
+ ; SI: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p3) :: (load 1 + 13, addrspace 3)
- ; SI: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C5]](s32)
+ ; SI: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
+ ; SI: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; SI: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p3) :: (load 1 + 14, addrspace 3)
- ; SI: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C6]](s32)
+ ; SI: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
+ ; SI: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; SI: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p3) :: (load 1 + 15, addrspace 3)
- ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<8 x s32>) = G_BUILD_VECTOR [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32), [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32)
- ; SI: [[TRUNC1:%[0-9]+]]:_(<8 x s8>) = G_TRUNC [[BUILD_VECTOR1]](<8 x s32>)
- ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<16 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<8 x s8>), [[TRUNC1]](<8 x s8>)
- ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<16 x s8>)
+ ; SI: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32)
+ ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>)
+ ; SI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[CONCAT_VECTORS]](<4 x s32>)
+ ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; CI-LABEL: name: test_load_local_v16s8_align16
; CI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3)
+ ; CI: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; CI: [[PTR_ADD:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C]](s32)
; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p3) :: (load 1 + 1, addrspace 3)
+ ; CI: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; CI: [[PTR_ADD1:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C1]](s32)
; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p3) :: (load 1 + 2, addrspace 3)
+ ; CI: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; CI: [[PTR_ADD2:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C2]](s32)
; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p3) :: (load 1 + 3, addrspace 3)
+ ; CI: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; CI: [[PTR_ADD3:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C3]](s32)
; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p3) :: (load 1 + 4, addrspace 3)
- ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
- ; CI: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
+ ; CI: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
+ ; CI: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p3) :: (load 1 + 5, addrspace 3)
- ; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
- ; CI: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C5]](s32)
+ ; CI: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
+ ; CI: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p3) :: (load 1 + 6, addrspace 3)
- ; CI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
- ; CI: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C6]](s32)
+ ; CI: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
+ ; CI: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p3) :: (load 1 + 7, addrspace 3)
- ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<8 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
- ; CI: [[TRUNC:%[0-9]+]]:_(<8 x s8>) = G_TRUNC [[BUILD_VECTOR]](<8 x s32>)
- ; CI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
- ; CI: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C7]](s32)
+ ; CI: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32)
+ ; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p3) :: (load 1 + 8, addrspace 3)
+ ; CI: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
; CI: [[PTR_ADD8:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p3) :: (load 1 + 9, addrspace 3)
+ ; CI: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
; CI: [[PTR_ADD9:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; CI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p3) :: (load 1 + 10, addrspace 3)
+ ; CI: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
; CI: [[PTR_ADD10:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; CI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p3) :: (load 1 + 11, addrspace 3)
+ ; CI: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
; CI: [[PTR_ADD11:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C3]](s32)
; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p3) :: (load 1 + 12, addrspace 3)
- ; CI: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C4]](s32)
+ ; CI: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
+ ; CI: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p3) :: (load 1 + 13, addrspace 3)
- ; CI: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C5]](s32)
+ ; CI: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
+ ; CI: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; CI: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p3) :: (load 1 + 14, addrspace 3)
- ; CI: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C6]](s32)
+ ; CI: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
+ ; CI: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; CI: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p3) :: (load 1 + 15, addrspace 3)
- ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<8 x s32>) = G_BUILD_VECTOR [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32), [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32)
- ; CI: [[TRUNC1:%[0-9]+]]:_(<8 x s8>) = G_TRUNC [[BUILD_VECTOR1]](<8 x s32>)
- ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<16 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<8 x s8>), [[TRUNC1]](<8 x s8>)
- ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<16 x s8>)
+ ; CI: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[MV2]](s32), [[MV3]](s32)
+ ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s32>) = G_CONCAT_VECTORS [[BUILD_VECTOR]](<2 x s32>), [[BUILD_VECTOR1]](<2 x s32>)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[CONCAT_VECTORS]](<4 x s32>)
+ ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; CI-DS128-LABEL: name: test_load_local_v16s8_align16
; CI-DS128: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; CI-DS128: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3)
+ ; CI-DS128: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; CI-DS128: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; CI-DS128: [[PTR_ADD:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C]](s32)
; CI-DS128: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p3) :: (load 1 + 1, addrspace 3)
+ ; CI-DS128: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; CI-DS128: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; CI-DS128: [[PTR_ADD1:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C1]](s32)
; CI-DS128: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p3) :: (load 1 + 2, addrspace 3)
+ ; CI-DS128: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; CI-DS128: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; CI-DS128: [[PTR_ADD2:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C2]](s32)
; CI-DS128: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p3) :: (load 1 + 3, addrspace 3)
+ ; CI-DS128: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; CI-DS128: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; CI-DS128: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; CI-DS128: [[PTR_ADD3:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C3]](s32)
; CI-DS128: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p3) :: (load 1 + 4, addrspace 3)
- ; CI-DS128: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
- ; CI-DS128: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
+ ; CI-DS128: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
+ ; CI-DS128: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; CI-DS128: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p3) :: (load 1 + 5, addrspace 3)
- ; CI-DS128: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
- ; CI-DS128: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C5]](s32)
+ ; CI-DS128: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
+ ; CI-DS128: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; CI-DS128: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p3) :: (load 1 + 6, addrspace 3)
- ; CI-DS128: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
- ; CI-DS128: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C6]](s32)
+ ; CI-DS128: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
+ ; CI-DS128: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; CI-DS128: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p3) :: (load 1 + 7, addrspace 3)
- ; CI-DS128: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
- ; CI-DS128: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C7]](s32)
+ ; CI-DS128: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; CI-DS128: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
+ ; CI-DS128: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI-DS128: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
; CI-DS128: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p3) :: (load 1 + 8, addrspace 3)
- ; CI-DS128: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
- ; CI-DS128: [[PTR_ADD8:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C8]](s32)
+ ; CI-DS128: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
+ ; CI-DS128: [[PTR_ADD8:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; CI-DS128: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p3) :: (load 1 + 9, addrspace 3)
- ; CI-DS128: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
- ; CI-DS128: [[PTR_ADD9:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C9]](s32)
+ ; CI-DS128: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
+ ; CI-DS128: [[PTR_ADD9:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; CI-DS128: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p3) :: (load 1 + 10, addrspace 3)
- ; CI-DS128: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
- ; CI-DS128: [[PTR_ADD10:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C10]](s32)
+ ; CI-DS128: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
+ ; CI-DS128: [[PTR_ADD10:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; CI-DS128: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p3) :: (load 1 + 11, addrspace 3)
- ; CI-DS128: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
- ; CI-DS128: [[PTR_ADD11:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C11]](s32)
+ ; CI-DS128: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; CI-DS128: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
+ ; CI-DS128: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; CI-DS128: [[PTR_ADD11:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C5]](s32)
; CI-DS128: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p3) :: (load 1 + 12, addrspace 3)
- ; CI-DS128: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
- ; CI-DS128: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C12]](s32)
+ ; CI-DS128: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
+ ; CI-DS128: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; CI-DS128: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p3) :: (load 1 + 13, addrspace 3)
- ; CI-DS128: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
- ; CI-DS128: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C13]](s32)
+ ; CI-DS128: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
+ ; CI-DS128: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; CI-DS128: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p3) :: (load 1 + 14, addrspace 3)
- ; CI-DS128: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
- ; CI-DS128: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C14]](s32)
+ ; CI-DS128: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
+ ; CI-DS128: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; CI-DS128: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p3) :: (load 1 + 15, addrspace 3)
- ; CI-DS128: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; CI-DS128: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; CI-DS128: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; CI-DS128: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; CI-DS128: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; CI-DS128: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; CI-DS128: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; CI-DS128: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; CI-DS128: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; CI-DS128: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; CI-DS128: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; CI-DS128: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; CI-DS128: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; CI-DS128: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; CI-DS128: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; CI-DS128: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<16 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32), [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32)
- ; CI-DS128: [[TRUNC:%[0-9]+]]:_(<16 x s8>) = G_TRUNC [[BUILD_VECTOR]](<16 x s32>)
- ; CI-DS128: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[TRUNC]](<16 x s8>)
+ ; CI-DS128: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; CI-DS128: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; CI-DS128: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32)
+ ; CI-DS128: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[BUILD_VECTOR]](<4 x s32>)
+ ; CI-DS128: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; VI-LABEL: name: test_load_local_v16s8_align16
; VI: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3)
+ ; VI: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; VI: [[PTR_ADD:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C]](s32)
; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p3) :: (load 1 + 1, addrspace 3)
+ ; VI: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; VI: [[PTR_ADD1:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C1]](s32)
; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p3) :: (load 1 + 2, addrspace 3)
+ ; VI: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; VI: [[PTR_ADD2:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C2]](s32)
; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p3) :: (load 1 + 3, addrspace 3)
+ ; VI: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; VI: [[PTR_ADD3:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C3]](s32)
; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p3) :: (load 1 + 4, addrspace 3)
- ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
- ; VI: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
+ ; VI: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
+ ; VI: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p3) :: (load 1 + 5, addrspace 3)
- ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
- ; VI: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C5]](s32)
+ ; VI: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
+ ; VI: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p3) :: (load 1 + 6, addrspace 3)
- ; VI: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
- ; VI: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C6]](s32)
+ ; VI: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
+ ; VI: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p3) :: (load 1 + 7, addrspace 3)
- ; VI: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
- ; VI: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C7]](s32)
+ ; VI: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
+ ; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p3) :: (load 1 + 8, addrspace 3)
- ; VI: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
- ; VI: [[PTR_ADD8:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C8]](s32)
+ ; VI: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
+ ; VI: [[PTR_ADD8:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p3) :: (load 1 + 9, addrspace 3)
- ; VI: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
- ; VI: [[PTR_ADD9:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C9]](s32)
+ ; VI: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
+ ; VI: [[PTR_ADD9:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; VI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p3) :: (load 1 + 10, addrspace 3)
- ; VI: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
- ; VI: [[PTR_ADD10:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C10]](s32)
+ ; VI: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
+ ; VI: [[PTR_ADD10:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; VI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p3) :: (load 1 + 11, addrspace 3)
- ; VI: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
- ; VI: [[PTR_ADD11:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C11]](s32)
+ ; VI: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
+ ; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; VI: [[PTR_ADD11:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C5]](s32)
; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p3) :: (load 1 + 12, addrspace 3)
- ; VI: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
- ; VI: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C12]](s32)
+ ; VI: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
+ ; VI: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p3) :: (load 1 + 13, addrspace 3)
- ; VI: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
- ; VI: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C13]](s32)
+ ; VI: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
+ ; VI: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; VI: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p3) :: (load 1 + 14, addrspace 3)
- ; VI: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
- ; VI: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C14]](s32)
+ ; VI: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
+ ; VI: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; VI: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p3) :: (load 1 + 15, addrspace 3)
- ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; VI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; VI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; VI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; VI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; VI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; VI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; VI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; VI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<16 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32), [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32), [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32), [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32)
- ; VI: [[TRUNC:%[0-9]+]]:_(<16 x s8>) = G_TRUNC [[BUILD_VECTOR]](<16 x s32>)
- ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[TRUNC]](<16 x s8>)
+ ; VI: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; GFX9-LABEL: name: test_load_local_v16s8_align16
; GFX9: [[COPY:%[0-9]+]]:_(p3) = COPY $vgpr0
; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p3) :: (load 1, addrspace 3)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; GFX9: [[PTR_ADD:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C]](s32)
; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p3) :: (load 1 + 1, addrspace 3)
+ ; GFX9: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; GFX9: [[PTR_ADD1:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C1]](s32)
; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p3) :: (load 1 + 2, addrspace 3)
+ ; GFX9: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; GFX9: [[PTR_ADD2:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C2]](s32)
; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p3) :: (load 1 + 3, addrspace 3)
+ ; GFX9: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; GFX9: [[PTR_ADD3:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C3]](s32)
; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p3) :: (load 1 + 4, addrspace 3)
- ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
- ; GFX9: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
+ ; GFX9: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
+ ; GFX9: [[PTR_ADD4:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p3) :: (load 1 + 5, addrspace 3)
- ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 6
- ; GFX9: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C5]](s32)
+ ; GFX9: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
+ ; GFX9: [[PTR_ADD5:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p3) :: (load 1 + 6, addrspace 3)
- ; GFX9: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 7
- ; GFX9: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C6]](s32)
+ ; GFX9: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
+ ; GFX9: [[PTR_ADD6:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p3) :: (load 1 + 7, addrspace 3)
- ; GFX9: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
- ; GFX9: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C7]](s32)
+ ; GFX9: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
+ ; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[PTR_ADD7:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C4]](s32)
; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p3) :: (load 1 + 8, addrspace 3)
- ; GFX9: [[C8:%[0-9]+]]:_(s32) = G_CONSTANT i32 9
- ; GFX9: [[PTR_ADD8:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C8]](s32)
+ ; GFX9: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
+ ; GFX9: [[PTR_ADD8:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p3) :: (load 1 + 9, addrspace 3)
- ; GFX9: [[C9:%[0-9]+]]:_(s32) = G_CONSTANT i32 10
- ; GFX9: [[PTR_ADD9:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C9]](s32)
+ ; GFX9: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
+ ; GFX9: [[PTR_ADD9:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; GFX9: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p3) :: (load 1 + 10, addrspace 3)
- ; GFX9: [[C10:%[0-9]+]]:_(s32) = G_CONSTANT i32 11
- ; GFX9: [[PTR_ADD10:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C10]](s32)
+ ; GFX9: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
+ ; GFX9: [[PTR_ADD10:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; GFX9: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p3) :: (load 1 + 11, addrspace 3)
- ; GFX9: [[C11:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
- ; GFX9: [[PTR_ADD11:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C11]](s32)
+ ; GFX9: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
+ ; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
+ ; GFX9: [[PTR_ADD11:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C5]](s32)
; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p3) :: (load 1 + 12, addrspace 3)
- ; GFX9: [[C12:%[0-9]+]]:_(s32) = G_CONSTANT i32 13
- ; GFX9: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C12]](s32)
+ ; GFX9: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
+ ; GFX9: [[PTR_ADD12:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p3) :: (load 1 + 13, addrspace 3)
- ; GFX9: [[C13:%[0-9]+]]:_(s32) = G_CONSTANT i32 14
- ; GFX9: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C13]](s32)
+ ; GFX9: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
+ ; GFX9: [[PTR_ADD13:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; GFX9: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p3) :: (load 1 + 14, addrspace 3)
- ; GFX9: [[C14:%[0-9]+]]:_(s32) = G_CONSTANT i32 15
- ; GFX9: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[COPY]], [[C14]](s32)
+ ; GFX9: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
+ ; GFX9: [[PTR_ADD14:%[0-9]+]]:_(p3) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; GFX9: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p3) :: (load 1 + 15, addrspace 3)
- ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
- ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
- ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY5]](s32), [[COPY6]](s32)
- ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY7]](s32), [[COPY8]](s32)
- ; GFX9: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; GFX9: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC4:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY9]](s32), [[COPY10]](s32)
- ; GFX9: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; GFX9: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC5:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY11]](s32), [[COPY12]](s32)
- ; GFX9: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; GFX9: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC6:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY13]](s32), [[COPY14]](s32)
- ; GFX9: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; GFX9: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC7:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY15]](s32), [[COPY16]](s32)
- ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<16 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>), [[BUILD_VECTOR_TRUNC2]](<2 x s16>), [[BUILD_VECTOR_TRUNC3]](<2 x s16>), [[BUILD_VECTOR_TRUNC4]](<2 x s16>), [[BUILD_VECTOR_TRUNC5]](<2 x s16>), [[BUILD_VECTOR_TRUNC6]](<2 x s16>), [[BUILD_VECTOR_TRUNC7]](<2 x s16>)
- ; GFX9: [[TRUNC:%[0-9]+]]:_(<16 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<16 x s16>)
- ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[TRUNC]](<16 x s8>)
+ ; GFX9: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[BUILD_VECTOR]](<4 x s32>)
+ ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
%0:_(p3) = COPY $vgpr0
%1:_(<16 x s8>) = G_LOAD %0 :: (load 16, align 1, addrspace 3)
$vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
index 58240b5d1060..678d3bcf6d32 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-load-private.mir
@@ -4089,35 +4089,103 @@ body: |
; SI-LABEL: name: test_load_private_v2s8_align2
; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; SI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
- ; SI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; SI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; SI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; SI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; SI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; SI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; SI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; SI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; SI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; SI: $vgpr0 = COPY [[ANYEXT]](s32)
; CI-LABEL: name: test_load_private_v2s8_align2
; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
- ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; CI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; CI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; CI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[C]](s32)
+ ; CI: [[ZEXT1:%[0-9]+]]:_(s32) = G_ZEXT [[UV1]](s8)
+ ; CI: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[ZEXT1]], [[COPY5]](s32)
+ ; CI: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; CI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[TRUNC1]]
+ ; CI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; CI: $vgpr0 = COPY [[ANYEXT]](s32)
; VI-LABEL: name: test_load_private_v2s8_align2
; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
- ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; VI: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; VI: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; VI: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; VI: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; VI: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; VI: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; VI: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; VI: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; VI: $vgpr0 = COPY [[ANYEXT]](s32)
; GFX9-LABEL: name: test_load_private_v2s8_align2
; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
- ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[LOAD]](<4 x s8>), 0
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 2, addrspace 5)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: [[EXTRACT:%[0-9]+]]:_(<2 x s8>) = G_EXTRACT [[TRUNC]](<4 x s8>), 0
; GFX9: [[UV:%[0-9]+]]:_(s8), [[UV1:%[0-9]+]]:_(s8) = G_UNMERGE_VALUES [[EXTRACT]](<2 x s8>)
- ; GFX9: [[MV:%[0-9]+]]:_(s16) = G_MERGE_VALUES [[UV]](s8), [[UV1]](s8)
- ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[MV]](s16)
+ ; GFX9: [[ZEXT:%[0-9]+]]:_(s16) = G_ZEXT [[UV]](s8)
+ ; GFX9: [[ZEXT1:%[0-9]+]]:_(s16) = G_ZEXT [[UV1]](s8)
+ ; GFX9: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX9: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[ZEXT1]], [[C3]](s16)
+ ; GFX9: [[OR:%[0-9]+]]:_(s16) = G_OR [[ZEXT]], [[SHL]]
+ ; GFX9: [[ANYEXT:%[0-9]+]]:_(s32) = G_ANYEXT [[OR]](s16)
; GFX9: $vgpr0 = COPY [[ANYEXT]](s32)
%0:_(p5) = COPY $vgpr0
%1:_(<2 x s8>) = G_LOAD %0 :: (load 2, align 2, addrspace 5)
@@ -4282,20 +4350,70 @@ body: |
; SI-LABEL: name: test_load_private_v4s8_align4
; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; SI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 4, addrspace 5)
- ; SI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 4, addrspace 5)
+ ; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; SI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; SI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; SI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; CI-LABEL: name: test_load_private_v4s8_align4
; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 4, addrspace 5)
- ; CI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 4, addrspace 5)
+ ; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; CI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; CI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; CI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; VI-LABEL: name: test_load_private_v4s8_align4
; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 4, addrspace 5)
- ; VI: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 4, addrspace 5)
+ ; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; VI: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; VI: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; VI: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
+ ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
; GFX9-LABEL: name: test_load_private_v4s8_align4
; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 4, addrspace 5)
- ; GFX9: $vgpr0 = COPY [[LOAD]](<4 x s8>)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 4, addrspace 5)
+ ; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX9: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C]](s32)
+ ; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 16
+ ; GFX9: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C1]](s32)
+ ; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
+ ; GFX9: [[LSHR2:%[0-9]+]]:_(s32) = G_LSHR [[LOAD]], [[C2]](s32)
+ ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
+ ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LSHR]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
+ ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LSHR1]](s32)
+ ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LSHR2]](s32)
+ ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
+ ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: $vgpr0 = COPY [[TRUNC]](<4 x s8>)
%0:_(p5) = COPY $vgpr0
%1:_(<4 x s8>) = G_LOAD %0 :: (load 4, align 4, addrspace 5)
$vgpr0 = COPY %1
@@ -4309,36 +4427,40 @@ body: |
; SI-LABEL: name: test_load_private_v8s8_align8
; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; SI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 4, align 8, addrspace 5)
+ ; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 4, align 8, addrspace 5)
; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; SI: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
- ; SI: [[LOAD1:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[PTR_ADD]](p5) :: (load 4 + 4, addrspace 5)
- ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s8>) = G_CONCAT_VECTORS [[LOAD]](<4 x s8>), [[LOAD1]](<4 x s8>)
- ; SI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<8 x s8>)
+ ; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p5) :: (load 4 + 4, addrspace 5)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[LOAD]](s32), [[LOAD1]](s32)
+ ; SI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[BUILD_VECTOR]](<2 x s32>)
+ ; SI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; CI-LABEL: name: test_load_private_v8s8_align8
; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; CI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 4, align 8, addrspace 5)
+ ; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 4, align 8, addrspace 5)
; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; CI: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
- ; CI: [[LOAD1:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[PTR_ADD]](p5) :: (load 4 + 4, addrspace 5)
- ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s8>) = G_CONCAT_VECTORS [[LOAD]](<4 x s8>), [[LOAD1]](<4 x s8>)
- ; CI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<8 x s8>)
+ ; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p5) :: (load 4 + 4, addrspace 5)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[LOAD]](s32), [[LOAD1]](s32)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[BUILD_VECTOR]](<2 x s32>)
+ ; CI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; VI-LABEL: name: test_load_private_v8s8_align8
; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; VI: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 4, align 8, addrspace 5)
+ ; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 4, align 8, addrspace 5)
; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; VI: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
- ; VI: [[LOAD1:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[PTR_ADD]](p5) :: (load 4 + 4, addrspace 5)
- ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s8>) = G_CONCAT_VECTORS [[LOAD]](<4 x s8>), [[LOAD1]](<4 x s8>)
- ; VI: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<8 x s8>)
+ ; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p5) :: (load 4 + 4, addrspace 5)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[LOAD]](s32), [[LOAD1]](s32)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[BUILD_VECTOR]](<2 x s32>)
+ ; VI: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
; GFX9-LABEL: name: test_load_private_v8s8_align8
; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
- ; GFX9: [[LOAD:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[COPY]](p5) :: (load 4, align 8, addrspace 5)
+ ; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 4, align 8, addrspace 5)
; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; GFX9: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
- ; GFX9: [[LOAD1:%[0-9]+]]:_(<4 x s8>) = G_LOAD [[PTR_ADD]](p5) :: (load 4 + 4, addrspace 5)
- ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<8 x s8>) = G_CONCAT_VECTORS [[LOAD]](<4 x s8>), [[LOAD1]](<4 x s8>)
- ; GFX9: $vgpr0_vgpr1 = COPY [[CONCAT_VECTORS]](<8 x s8>)
+ ; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p5) :: (load 4 + 4, addrspace 5)
+ ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s32>) = G_BUILD_VECTOR [[LOAD]](s32), [[LOAD1]](s32)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<8 x s8>) = G_BITCAST [[BUILD_VECTOR]](<2 x s32>)
+ ; GFX9: $vgpr0_vgpr1 = COPY [[BITCAST]](<8 x s8>)
%0:_(p5) = COPY $vgpr0
%1:_(<8 x s8>) = G_LOAD %0 :: (load 8, align 8, addrspace 5)
$vgpr0_vgpr1 = COPY %1
@@ -4353,271 +4475,251 @@ body: |
; SI-LABEL: name: test_load_private_v16s8_align16
; SI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
; SI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56)
+ ; SI: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; SI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; SI: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
; SI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p5) :: (load 1 + 1, addrspace 56)
+ ; SI: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; SI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; SI: [[PTR_ADD1:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C1]](s32)
; SI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p5) :: (load 1 + 2, addrspace 56)
+ ; SI: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; SI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; SI: [[PTR_ADD2:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C2]](s32)
; SI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p5) :: (load 1 + 3, addrspace 56)
- ; SI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; SI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; SI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; SI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
- ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; SI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; SI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; SI: [[PTR_ADD3:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C3]](s32)
; SI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p5) :: (load 1 + 4, addrspace 56)
+ ; SI: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
; SI: [[PTR_ADD4:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; SI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p5) :: (load 1 + 5, addrspace 56)
+ ; SI: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
; SI: [[PTR_ADD5:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; SI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p5) :: (load 1 + 6, addrspace 56)
+ ; SI: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
; SI: [[PTR_ADD6:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; SI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p5) :: (load 1 + 7, addrspace 56)
- ; SI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; SI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; SI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; SI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; SI: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
- ; SI: [[TRUNC1:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR1]](<4 x s32>)
+ ; SI: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; SI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
; SI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
; SI: [[PTR_ADD7:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C4]](s32)
; SI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p5) :: (load 1 + 8, addrspace 56)
+ ; SI: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
; SI: [[PTR_ADD8:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; SI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p5) :: (load 1 + 9, addrspace 56)
+ ; SI: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
; SI: [[PTR_ADD9:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; SI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p5) :: (load 1 + 10, addrspace 56)
+ ; SI: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
; SI: [[PTR_ADD10:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; SI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p5) :: (load 1 + 11, addrspace 56)
- ; SI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; SI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; SI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; SI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; SI: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32)
- ; SI: [[TRUNC2:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR2]](<4 x s32>)
+ ; SI: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; SI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
; SI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
; SI: [[PTR_ADD11:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C5]](s32)
; SI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p5) :: (load 1 + 12, addrspace 56)
+ ; SI: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
; SI: [[PTR_ADD12:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; SI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p5) :: (load 1 + 13, addrspace 56)
+ ; SI: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
; SI: [[PTR_ADD13:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; SI: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p5) :: (load 1 + 14, addrspace 56)
+ ; SI: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
; SI: [[PTR_ADD14:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; SI: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p5) :: (load 1 + 15, addrspace 56)
- ; SI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; SI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; SI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; SI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; SI: [[BUILD_VECTOR3:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32)
- ; SI: [[TRUNC3:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR3]](<4 x s32>)
- ; SI: [[CONCAT_VECTORS:%[0-9]+]]:_(<16 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<4 x s8>), [[TRUNC1]](<4 x s8>), [[TRUNC2]](<4 x s8>), [[TRUNC3]](<4 x s8>)
- ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<16 x s8>)
+ ; SI: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; SI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; SI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32)
+ ; SI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[BUILD_VECTOR]](<4 x s32>)
+ ; SI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; CI-LABEL: name: test_load_private_v16s8_align16
; CI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
; CI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56)
+ ; CI: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; CI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; CI: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
; CI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p5) :: (load 1 + 1, addrspace 56)
+ ; CI: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; CI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; CI: [[PTR_ADD1:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C1]](s32)
; CI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p5) :: (load 1 + 2, addrspace 56)
+ ; CI: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; CI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; CI: [[PTR_ADD2:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C2]](s32)
; CI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p5) :: (load 1 + 3, addrspace 56)
- ; CI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; CI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; CI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; CI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
- ; CI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; CI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; CI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; CI: [[PTR_ADD3:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C3]](s32)
; CI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p5) :: (load 1 + 4, addrspace 56)
+ ; CI: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
; CI: [[PTR_ADD4:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; CI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p5) :: (load 1 + 5, addrspace 56)
+ ; CI: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
; CI: [[PTR_ADD5:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; CI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p5) :: (load 1 + 6, addrspace 56)
+ ; CI: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
; CI: [[PTR_ADD6:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; CI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p5) :: (load 1 + 7, addrspace 56)
- ; CI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; CI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; CI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; CI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; CI: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
- ; CI: [[TRUNC1:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR1]](<4 x s32>)
+ ; CI: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; CI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
; CI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
; CI: [[PTR_ADD7:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C4]](s32)
; CI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p5) :: (load 1 + 8, addrspace 56)
+ ; CI: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
; CI: [[PTR_ADD8:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; CI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p5) :: (load 1 + 9, addrspace 56)
+ ; CI: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
; CI: [[PTR_ADD9:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; CI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p5) :: (load 1 + 10, addrspace 56)
+ ; CI: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
; CI: [[PTR_ADD10:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; CI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p5) :: (load 1 + 11, addrspace 56)
- ; CI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; CI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; CI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; CI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; CI: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32)
- ; CI: [[TRUNC2:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR2]](<4 x s32>)
+ ; CI: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; CI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
; CI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
; CI: [[PTR_ADD11:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C5]](s32)
; CI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p5) :: (load 1 + 12, addrspace 56)
+ ; CI: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
; CI: [[PTR_ADD12:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; CI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p5) :: (load 1 + 13, addrspace 56)
+ ; CI: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
; CI: [[PTR_ADD13:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; CI: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p5) :: (load 1 + 14, addrspace 56)
+ ; CI: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
; CI: [[PTR_ADD14:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; CI: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p5) :: (load 1 + 15, addrspace 56)
- ; CI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; CI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; CI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; CI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; CI: [[BUILD_VECTOR3:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32)
- ; CI: [[TRUNC3:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR3]](<4 x s32>)
- ; CI: [[CONCAT_VECTORS:%[0-9]+]]:_(<16 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<4 x s8>), [[TRUNC1]](<4 x s8>), [[TRUNC2]](<4 x s8>), [[TRUNC3]](<4 x s8>)
- ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<16 x s8>)
+ ; CI: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; CI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; CI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32)
+ ; CI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[BUILD_VECTOR]](<4 x s32>)
+ ; CI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; VI-LABEL: name: test_load_private_v16s8_align16
; VI: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
; VI: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56)
+ ; VI: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; VI: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; VI: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
; VI: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p5) :: (load 1 + 1, addrspace 56)
+ ; VI: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; VI: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; VI: [[PTR_ADD1:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C1]](s32)
; VI: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p5) :: (load 1 + 2, addrspace 56)
+ ; VI: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; VI: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; VI: [[PTR_ADD2:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C2]](s32)
; VI: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p5) :: (load 1 + 3, addrspace 56)
- ; VI: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; VI: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; VI: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; VI: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY1]](s32), [[COPY2]](s32), [[COPY3]](s32), [[COPY4]](s32)
- ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; VI: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; VI: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; VI: [[PTR_ADD3:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C3]](s32)
; VI: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p5) :: (load 1 + 4, addrspace 56)
+ ; VI: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
; VI: [[PTR_ADD4:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; VI: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p5) :: (load 1 + 5, addrspace 56)
+ ; VI: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
; VI: [[PTR_ADD5:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; VI: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p5) :: (load 1 + 6, addrspace 56)
+ ; VI: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
; VI: [[PTR_ADD6:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; VI: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p5) :: (load 1 + 7, addrspace 56)
- ; VI: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; VI: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; VI: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; VI: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; VI: [[BUILD_VECTOR1:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY5]](s32), [[COPY6]](s32), [[COPY7]](s32), [[COPY8]](s32)
- ; VI: [[TRUNC1:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR1]](<4 x s32>)
+ ; VI: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; VI: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
; VI: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
; VI: [[PTR_ADD7:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C4]](s32)
; VI: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p5) :: (load 1 + 8, addrspace 56)
+ ; VI: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
; VI: [[PTR_ADD8:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; VI: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p5) :: (load 1 + 9, addrspace 56)
+ ; VI: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
; VI: [[PTR_ADD9:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; VI: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p5) :: (load 1 + 10, addrspace 56)
+ ; VI: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
; VI: [[PTR_ADD10:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; VI: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p5) :: (load 1 + 11, addrspace 56)
- ; VI: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; VI: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; VI: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; VI: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; VI: [[BUILD_VECTOR2:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY9]](s32), [[COPY10]](s32), [[COPY11]](s32), [[COPY12]](s32)
- ; VI: [[TRUNC2:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR2]](<4 x s32>)
+ ; VI: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; VI: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
; VI: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
; VI: [[PTR_ADD11:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C5]](s32)
; VI: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p5) :: (load 1 + 12, addrspace 56)
+ ; VI: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
; VI: [[PTR_ADD12:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; VI: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p5) :: (load 1 + 13, addrspace 56)
+ ; VI: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
; VI: [[PTR_ADD13:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; VI: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p5) :: (load 1 + 14, addrspace 56)
+ ; VI: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
; VI: [[PTR_ADD14:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; VI: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p5) :: (load 1 + 15, addrspace 56)
- ; VI: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; VI: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; VI: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; VI: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; VI: [[BUILD_VECTOR3:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[COPY13]](s32), [[COPY14]](s32), [[COPY15]](s32), [[COPY16]](s32)
- ; VI: [[TRUNC3:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[BUILD_VECTOR3]](<4 x s32>)
- ; VI: [[CONCAT_VECTORS:%[0-9]+]]:_(<16 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<4 x s8>), [[TRUNC1]](<4 x s8>), [[TRUNC2]](<4 x s8>), [[TRUNC3]](<4 x s8>)
- ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS]](<16 x s8>)
+ ; VI: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; VI: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; VI: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32)
+ ; VI: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[BUILD_VECTOR]](<4 x s32>)
+ ; VI: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
; GFX9-LABEL: name: test_load_private_v16s8_align16
; GFX9: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
; GFX9: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p5) :: (load 1, addrspace 56)
+ ; GFX9: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD]](s32)
; GFX9: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 1
; GFX9: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
; GFX9: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p5) :: (load 1 + 1, addrspace 56)
+ ; GFX9: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD1]](s32)
; GFX9: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 2
; GFX9: [[PTR_ADD1:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C1]](s32)
; GFX9: [[LOAD2:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD1]](p5) :: (load 1 + 2, addrspace 56)
+ ; GFX9: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD2]](s32)
; GFX9: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 3
; GFX9: [[PTR_ADD2:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C2]](s32)
; GFX9: [[LOAD3:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD2]](p5) :: (load 1 + 3, addrspace 56)
- ; GFX9: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD]](s32)
- ; GFX9: [[COPY2:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY1]](s32), [[COPY2]](s32)
- ; GFX9: [[COPY3:%[0-9]+]]:_(s32) = COPY [[LOAD2]](s32)
- ; GFX9: [[COPY4:%[0-9]+]]:_(s32) = COPY [[LOAD3]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY3]](s32), [[COPY4]](s32)
- ; GFX9: [[CONCAT_VECTORS:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC]](<2 x s16>), [[BUILD_VECTOR_TRUNC1]](<2 x s16>)
- ; GFX9: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS]](<4 x s16>)
+ ; GFX9: [[TRUNC3:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD3]](s32)
+ ; GFX9: [[MV:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC]](s8), [[TRUNC1]](s8), [[TRUNC2]](s8), [[TRUNC3]](s8)
; GFX9: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; GFX9: [[PTR_ADD3:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C3]](s32)
; GFX9: [[LOAD4:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD3]](p5) :: (load 1 + 4, addrspace 56)
+ ; GFX9: [[TRUNC4:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD4]](s32)
; GFX9: [[PTR_ADD4:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C]](s32)
; GFX9: [[LOAD5:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD4]](p5) :: (load 1 + 5, addrspace 56)
+ ; GFX9: [[TRUNC5:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD5]](s32)
; GFX9: [[PTR_ADD5:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C1]](s32)
; GFX9: [[LOAD6:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD5]](p5) :: (load 1 + 6, addrspace 56)
+ ; GFX9: [[TRUNC6:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD6]](s32)
; GFX9: [[PTR_ADD6:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD3]], [[C2]](s32)
; GFX9: [[LOAD7:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD6]](p5) :: (load 1 + 7, addrspace 56)
- ; GFX9: [[COPY5:%[0-9]+]]:_(s32) = COPY [[LOAD4]](s32)
- ; GFX9: [[COPY6:%[0-9]+]]:_(s32) = COPY [[LOAD5]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC2:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY5]](s32), [[COPY6]](s32)
- ; GFX9: [[COPY7:%[0-9]+]]:_(s32) = COPY [[LOAD6]](s32)
- ; GFX9: [[COPY8:%[0-9]+]]:_(s32) = COPY [[LOAD7]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC3:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY7]](s32), [[COPY8]](s32)
- ; GFX9: [[CONCAT_VECTORS1:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC2]](<2 x s16>), [[BUILD_VECTOR_TRUNC3]](<2 x s16>)
- ; GFX9: [[TRUNC1:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS1]](<4 x s16>)
+ ; GFX9: [[TRUNC7:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD7]](s32)
+ ; GFX9: [[MV1:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC4]](s8), [[TRUNC5]](s8), [[TRUNC6]](s8), [[TRUNC7]](s8)
; GFX9: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
; GFX9: [[PTR_ADD7:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C4]](s32)
; GFX9: [[LOAD8:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD7]](p5) :: (load 1 + 8, addrspace 56)
+ ; GFX9: [[TRUNC8:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD8]](s32)
; GFX9: [[PTR_ADD8:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C]](s32)
; GFX9: [[LOAD9:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD8]](p5) :: (load 1 + 9, addrspace 56)
+ ; GFX9: [[TRUNC9:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD9]](s32)
; GFX9: [[PTR_ADD9:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C1]](s32)
; GFX9: [[LOAD10:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD9]](p5) :: (load 1 + 10, addrspace 56)
+ ; GFX9: [[TRUNC10:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD10]](s32)
; GFX9: [[PTR_ADD10:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD7]], [[C2]](s32)
; GFX9: [[LOAD11:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD10]](p5) :: (load 1 + 11, addrspace 56)
- ; GFX9: [[COPY9:%[0-9]+]]:_(s32) = COPY [[LOAD8]](s32)
- ; GFX9: [[COPY10:%[0-9]+]]:_(s32) = COPY [[LOAD9]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC4:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY9]](s32), [[COPY10]](s32)
- ; GFX9: [[COPY11:%[0-9]+]]:_(s32) = COPY [[LOAD10]](s32)
- ; GFX9: [[COPY12:%[0-9]+]]:_(s32) = COPY [[LOAD11]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC5:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY11]](s32), [[COPY12]](s32)
- ; GFX9: [[CONCAT_VECTORS2:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC4]](<2 x s16>), [[BUILD_VECTOR_TRUNC5]](<2 x s16>)
- ; GFX9: [[TRUNC2:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS2]](<4 x s16>)
+ ; GFX9: [[TRUNC11:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD11]](s32)
+ ; GFX9: [[MV2:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC8]](s8), [[TRUNC9]](s8), [[TRUNC10]](s8), [[TRUNC11]](s8)
; GFX9: [[C5:%[0-9]+]]:_(s32) = G_CONSTANT i32 12
; GFX9: [[PTR_ADD11:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C5]](s32)
; GFX9: [[LOAD12:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD11]](p5) :: (load 1 + 12, addrspace 56)
+ ; GFX9: [[TRUNC12:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD12]](s32)
; GFX9: [[PTR_ADD12:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C]](s32)
; GFX9: [[LOAD13:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD12]](p5) :: (load 1 + 13, addrspace 56)
+ ; GFX9: [[TRUNC13:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD13]](s32)
; GFX9: [[PTR_ADD13:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C1]](s32)
; GFX9: [[LOAD14:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD13]](p5) :: (load 1 + 14, addrspace 56)
+ ; GFX9: [[TRUNC14:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD14]](s32)
; GFX9: [[PTR_ADD14:%[0-9]+]]:_(p5) = G_PTR_ADD [[PTR_ADD11]], [[C2]](s32)
; GFX9: [[LOAD15:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD14]](p5) :: (load 1 + 15, addrspace 56)
- ; GFX9: [[COPY13:%[0-9]+]]:_(s32) = COPY [[LOAD12]](s32)
- ; GFX9: [[COPY14:%[0-9]+]]:_(s32) = COPY [[LOAD13]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC6:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY13]](s32), [[COPY14]](s32)
- ; GFX9: [[COPY15:%[0-9]+]]:_(s32) = COPY [[LOAD14]](s32)
- ; GFX9: [[COPY16:%[0-9]+]]:_(s32) = COPY [[LOAD15]](s32)
- ; GFX9: [[BUILD_VECTOR_TRUNC7:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR_TRUNC [[COPY15]](s32), [[COPY16]](s32)
- ; GFX9: [[CONCAT_VECTORS3:%[0-9]+]]:_(<4 x s16>) = G_CONCAT_VECTORS [[BUILD_VECTOR_TRUNC6]](<2 x s16>), [[BUILD_VECTOR_TRUNC7]](<2 x s16>)
- ; GFX9: [[TRUNC3:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[CONCAT_VECTORS3]](<4 x s16>)
- ; GFX9: [[CONCAT_VECTORS4:%[0-9]+]]:_(<16 x s8>) = G_CONCAT_VECTORS [[TRUNC]](<4 x s8>), [[TRUNC1]](<4 x s8>), [[TRUNC2]](<4 x s8>), [[TRUNC3]](<4 x s8>)
- ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[CONCAT_VECTORS4]](<16 x s8>)
+ ; GFX9: [[TRUNC15:%[0-9]+]]:_(s8) = G_TRUNC [[LOAD15]](s32)
+ ; GFX9: [[MV3:%[0-9]+]]:_(s32) = G_MERGE_VALUES [[TRUNC12]](s8), [[TRUNC13]](s8), [[TRUNC14]](s8), [[TRUNC15]](s8)
+ ; GFX9: [[BUILD_VECTOR:%[0-9]+]]:_(<4 x s32>) = G_BUILD_VECTOR [[MV]](s32), [[MV1]](s32), [[MV2]](s32), [[MV3]](s32)
+ ; GFX9: [[BITCAST:%[0-9]+]]:_(<16 x s8>) = G_BITCAST [[BUILD_VECTOR]](<4 x s32>)
+ ; GFX9: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[BITCAST]](<16 x s8>)
%0:_(p5) = COPY $vgpr0
%1:_(<16 x s8>) = G_LOAD %0 :: (load 16, align 1, addrspace 56)
$vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir
index 97232c2a5217..81408b79b11f 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-phi.mir
@@ -1,5 +1,5 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=legalizer %s -o - | FileCheck %s
+# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=legalizer -global-isel-abort=2 %s -o - | FileCheck %s
---
name: test_phi_s32
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sextload-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sextload-global.mir
index 40105bdb150b..b6633adbd79c 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sextload-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-sextload-global.mir
@@ -1,6 +1,17 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -run-pass=legalizer -o - %s | FileCheck %s
-# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=legalizer -o - %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -run-pass=legalizer -global-isel-abort=2 -pass-remarks-missed='gisel*' -o - 2> %t %s | FileCheck -check-prefix=GFX8 %s
+# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=legalizer -global-isel-abort=2 -o - %s | FileCheck -check-prefix=GFX6 %s
+# RUN: FileCheck -check-prefixes=ERR %s < %t
+
+# FIXME: Run with and without unaligned access turned on
+
+# ERR-NOT: remark
+# ERR: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s16>) = G_SEXTLOAD %0:_(p1) :: (load 2, addrspace 1) (in function: test_sextload_global_v2i16_from_2)
+# ERR-NEXT: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s32>) = G_SEXTLOAD %0:_(p1) :: (load 2, addrspace 1) (in function: test_sextload_global_v2i32_from_2)
+# ERR-NEXT: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s32>) = G_SEXTLOAD %0:_(p1) :: (load 4, addrspace 1) (in function: test_sextload_global_v2i32_from_4)
+# ERR-NEXT: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s64>) = G_SEXTLOAD %0:_(p1) :: (load 4, addrspace 1) (in function: test_sextload_global_v2i64_from_4)
+# ERR-NEXT: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s64>) = G_SEXTLOAD %0:_(p1) :: (load 8, addrspace 1) (in function: test_sextload_global_v2i64_from_8)
+# ERR-NOT: remark
---
name: test_sextload_global_i32_i8
@@ -8,10 +19,14 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_sextload_global_i32_i8
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
- ; CHECK: $vgpr0 = COPY [[SEXTLOAD]](s32)
+ ; GFX8-LABEL: name: test_sextload_global_i32_i8
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: $vgpr0 = COPY [[SEXTLOAD]](s32)
+ ; GFX6-LABEL: name: test_sextload_global_i32_i8
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: $vgpr0 = COPY [[SEXTLOAD]](s32)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s32) = G_SEXTLOAD %0 :: (load 1, addrspace 1)
$vgpr0 = COPY %1
@@ -22,10 +37,14 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_sextload_global_i32_i16
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; CHECK: $vgpr0 = COPY [[SEXTLOAD]](s32)
+ ; GFX8-LABEL: name: test_sextload_global_i32_i16
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX8: $vgpr0 = COPY [[SEXTLOAD]](s32)
+ ; GFX6-LABEL: name: test_sextload_global_i32_i16
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX6: $vgpr0 = COPY [[SEXTLOAD]](s32)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s32) = G_SEXTLOAD %0 :: (load 2, addrspace 1)
$vgpr0 = COPY %1
@@ -36,11 +55,16 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_sextload_global_i31_i8
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
- ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY [[SEXTLOAD]](s32)
- ; CHECK: $vgpr0 = COPY [[COPY1]](s32)
+ ; GFX8-LABEL: name: test_sextload_global_i31_i8
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[SEXTLOAD]](s32)
+ ; GFX8: $vgpr0 = COPY [[COPY1]](s32)
+ ; GFX6-LABEL: name: test_sextload_global_i31_i8
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[SEXTLOAD]](s32)
+ ; GFX6: $vgpr0 = COPY [[COPY1]](s32)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s31) = G_SEXTLOAD %0 :: (load 1, addrspace 1)
%2:_(s32) = G_ANYEXT %1
@@ -52,11 +76,16 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_sextload_global_i64_i8
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
- ; CHECK: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[SEXTLOAD]](s32)
- ; CHECK: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
+ ; GFX8-LABEL: name: test_sextload_global_i64_i8
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[SEXTLOAD]](s32)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
+ ; GFX6-LABEL: name: test_sextload_global_i64_i8
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[SEXTLOAD]](s32)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s64) = G_SEXTLOAD %0 :: (load 1, addrspace 1)
$vgpr0_vgpr1 = COPY %1
@@ -67,11 +96,16 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_sextload_global_i64_i16
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; CHECK: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[SEXTLOAD]](s32)
- ; CHECK: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
+ ; GFX8-LABEL: name: test_sextload_global_i64_i16
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX8: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[SEXTLOAD]](s32)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
+ ; GFX6-LABEL: name: test_sextload_global_i64_i16
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(s32) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX6: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[SEXTLOAD]](s32)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s64) = G_SEXTLOAD %0 :: (load 2, addrspace 1)
$vgpr0_vgpr1 = COPY %1
@@ -82,12 +116,202 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_sextload_global_i64_i32
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
- ; CHECK: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[LOAD]](s32)
- ; CHECK: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
+ ; GFX8-LABEL: name: test_sextload_global_i64_i32
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX8: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[LOAD]](s32)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
+ ; GFX6-LABEL: name: test_sextload_global_i64_i32
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX6: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[LOAD]](s32)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s64) = G_SEXTLOAD %0 :: (load 4, addrspace 1)
$vgpr0_vgpr1 = COPY %1
...
+
+---
+name: test_sextload_global_s32_from_2_align1
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_sextload_global_s32_from_2_align1
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX8: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
+ ; GFX8: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
+ ; GFX8: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 255
+ ; GFX8: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
+ ; GFX8: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
+ ; GFX8: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32)
+ ; GFX8: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C1]]
+ ; GFX8: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX8: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[AND1]], [[C2]](s16)
+ ; GFX8: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[SHL]]
+ ; GFX8: [[SEXT:%[0-9]+]]:_(s32) = G_SEXT [[OR]](s16)
+ ; GFX8: $vgpr0 = COPY [[SEXT]](s32)
+ ; GFX6-LABEL: name: test_sextload_global_s32_from_2_align1
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX6: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
+ ; GFX6: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
+ ; GFX6: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 255
+ ; GFX6: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
+ ; GFX6: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
+ ; GFX6: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX6: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 255
+ ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX6: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]]
+ ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32)
+ ; GFX6: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; GFX6: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[TRUNC1]]
+ ; GFX6: [[SEXT:%[0-9]+]]:_(s32) = G_SEXT [[OR]](s16)
+ ; GFX6: $vgpr0 = COPY [[SEXT]](s32)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(s32) = G_SEXTLOAD %0 :: (load 2, align 1, addrspace 1)
+ $vgpr0 = COPY %1
+...
+
+---
+name: test_sextload_global_s64_from_2_align1
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_sextload_global_s64_from_2_align1
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX8: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
+ ; GFX8: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
+ ; GFX8: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 255
+ ; GFX8: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
+ ; GFX8: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
+ ; GFX8: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32)
+ ; GFX8: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C1]]
+ ; GFX8: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX8: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[AND1]], [[C2]](s16)
+ ; GFX8: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[SHL]]
+ ; GFX8: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[OR]](s16)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
+ ; GFX6-LABEL: name: test_sextload_global_s64_from_2_align1
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX6: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
+ ; GFX6: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
+ ; GFX6: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 255
+ ; GFX6: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
+ ; GFX6: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
+ ; GFX6: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX6: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 255
+ ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX6: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]]
+ ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32)
+ ; GFX6: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; GFX6: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[TRUNC1]]
+ ; GFX6: [[SEXT:%[0-9]+]]:_(s64) = G_SEXT [[OR]](s16)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[SEXT]](s64)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(s64) = G_SEXTLOAD %0 :: (load 2, align 1, addrspace 1)
+ $vgpr0_vgpr1 = COPY %1
+...
+
+---
+name: test_sextload_global_v2i16_from_2
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_sextload_global_v2i16_from_2
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(<2 x s16>) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX8: $vgpr0 = COPY [[SEXTLOAD]](<2 x s16>)
+ ; GFX6-LABEL: name: test_sextload_global_v2i16_from_2
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(<2 x s16>) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX6: $vgpr0 = COPY [[SEXTLOAD]](<2 x s16>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s16>) = G_SEXTLOAD %0 :: (load 2, addrspace 1)
+ $vgpr0 = COPY %1
+...
+
+---
+name: test_sextload_global_v2i32_from_2
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_sextload_global_v2i32_from_2
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(<2 x s32>) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[SEXTLOAD]](<2 x s32>)
+ ; GFX6-LABEL: name: test_sextload_global_v2i32_from_2
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(<2 x s32>) = G_SEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[SEXTLOAD]](<2 x s32>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s32>) = G_SEXTLOAD %0 :: (load 2, addrspace 1)
+ $vgpr0_vgpr1 = COPY %1
+...
+
+---
+name: test_sextload_global_v2i32_from_4
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_sextload_global_v2i32_from_4
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(<2 x s32>) = G_SEXTLOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[SEXTLOAD]](<2 x s32>)
+ ; GFX6-LABEL: name: test_sextload_global_v2i32_from_4
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(<2 x s32>) = G_SEXTLOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[SEXTLOAD]](<2 x s32>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s32>) = G_SEXTLOAD %0 :: (load 4, addrspace 1)
+ $vgpr0_vgpr1 = COPY %1
+...
+
+---
+name: test_sextload_global_v2i64_from_4
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_sextload_global_v2i64_from_4
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(<2 x s64>) = G_SEXTLOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX8: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[SEXTLOAD]](<2 x s64>)
+ ; GFX6-LABEL: name: test_sextload_global_v2i64_from_4
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(<2 x s64>) = G_SEXTLOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX6: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[SEXTLOAD]](<2 x s64>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s64>) = G_SEXTLOAD %0 :: (load 4, addrspace 1)
+ $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
+...
+
+---
+name: test_sextload_global_v2i64_from_8
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_sextload_global_v2i64_from_8
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[SEXTLOAD:%[0-9]+]]:_(<2 x s64>) = G_SEXTLOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; GFX8: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[SEXTLOAD]](<2 x s64>)
+ ; GFX6-LABEL: name: test_sextload_global_v2i64_from_8
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[SEXTLOAD:%[0-9]+]]:_(<2 x s64>) = G_SEXTLOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; GFX6: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[SEXTLOAD]](<2 x s64>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s64>) = G_SEXTLOAD %0 :: (load 8, addrspace 1)
+ $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
+...
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store.mir
index 3e48a6113f61..6c2dcfcd36ae 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-store.mir
@@ -392,18 +392,16 @@ body: |
; SI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; SI: [[DEF:%[0-9]+]]:_(<3 x s8>) = G_IMPLICIT_DEF
; SI: [[DEF1:%[0-9]+]]:_(<4 x s8>) = G_IMPLICIT_DEF
- ; SI: [[ANYEXT:%[0-9]+]]:_(<4 x s16>) = G_ANYEXT [[DEF1]](<4 x s8>)
- ; SI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[ANYEXT]], [[DEF]](<3 x s8>), 0
- ; SI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[INSERT]](<4 x s16>)
- ; SI: G_STORE [[TRUNC]](<4 x s8>), [[COPY]](p1) :: (store 3, align 4, addrspace 1)
+ ; SI: [[INSERT:%[0-9]+]]:_(<4 x s8>) = G_INSERT [[DEF1]], [[DEF]](<3 x s8>), 0
+ ; SI: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[INSERT]](<4 x s8>)
+ ; SI: G_STORE [[BITCAST]](s32), [[COPY]](p1) :: (store 3, align 4, addrspace 1)
; VI-LABEL: name: test_store_global_v3s8_align4
; VI: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
; VI: [[DEF:%[0-9]+]]:_(<3 x s8>) = G_IMPLICIT_DEF
; VI: [[DEF1:%[0-9]+]]:_(<4 x s8>) = G_IMPLICIT_DEF
- ; VI: [[ANYEXT:%[0-9]+]]:_(<4 x s16>) = G_ANYEXT [[DEF1]](<4 x s8>)
- ; VI: [[INSERT:%[0-9]+]]:_(<4 x s16>) = G_INSERT [[ANYEXT]], [[DEF]](<3 x s8>), 0
- ; VI: [[TRUNC:%[0-9]+]]:_(<4 x s8>) = G_TRUNC [[INSERT]](<4 x s16>)
- ; VI: G_STORE [[TRUNC]](<4 x s8>), [[COPY]](p1) :: (store 3, align 4, addrspace 1)
+ ; VI: [[INSERT:%[0-9]+]]:_(<4 x s8>) = G_INSERT [[DEF1]], [[DEF]](<3 x s8>), 0
+ ; VI: [[BITCAST:%[0-9]+]]:_(s32) = G_BITCAST [[INSERT]](<4 x s8>)
+ ; VI: G_STORE [[BITCAST]](s32), [[COPY]](p1) :: (store 3, align 4, addrspace 1)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(<3 x s8>) = G_IMPLICIT_DEF
G_STORE %1, %0 :: (store 3, addrspace 1, align 4)
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-zextload-global.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-zextload-global.mir
index 3552d0e62cb5..80ea67233550 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-zextload-global.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-zextload-global.mir
@@ -1,6 +1,17 @@
# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
-# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -run-pass=legalizer -o - %s | FileCheck %s
-# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=legalizer -o - %s | FileCheck %s
+# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=fiji -run-pass=legalizer -global-isel-abort=2 -pass-remarks-missed='gisel*' -o - 2> %t %s | FileCheck -check-prefix=GFX8 %s
+# RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=tahiti -run-pass=legalizer -global-isel-abort=2 -o - %s | FileCheck -check-prefix=GFX6 %s
+# RUN: FileCheck -check-prefixes=ERR %s < %t
+
+# FIXME: Run with and without unaligned access turned on
+
+# ERR-NOT: remark
+# ERR: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s16>) = G_ZEXTLOAD %0:_(p1) :: (load 2, addrspace 1) (in function: test_zextload_global_v2i16_from_2)
+# ERR-NEXT: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s32>) = G_ZEXTLOAD %0:_(p1) :: (load 2, addrspace 1) (in function: test_zextload_global_v2i32_from_2)
+# ERR-NEXT: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s32>) = G_ZEXTLOAD %0:_(p1) :: (load 4, addrspace 1) (in function: test_zextload_global_v2i32_from_4)
+# ERR-NEXT: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s64>) = G_ZEXTLOAD %0:_(p1) :: (load 4, addrspace 1) (in function: test_zextload_global_v2i64_from_4)
+# ERR-NEXT: remark: <unknown>:0:0: unable to legalize instruction: %1:_(<2 x s64>) = G_ZEXTLOAD %0:_(p1) :: (load 8, addrspace 1) (in function: test_zextload_global_v2i64_from_8)
+# ERR-NOT: remark
---
name: test_zextload_global_i32_i8
@@ -8,10 +19,14 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_zextload_global_i32_i8
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
- ; CHECK: $vgpr0 = COPY [[ZEXTLOAD]](s32)
+ ; GFX8-LABEL: name: test_zextload_global_i32_i8
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: $vgpr0 = COPY [[ZEXTLOAD]](s32)
+ ; GFX6-LABEL: name: test_zextload_global_i32_i8
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: $vgpr0 = COPY [[ZEXTLOAD]](s32)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s32) = G_ZEXTLOAD %0 :: (load 1, addrspace 1)
$vgpr0 = COPY %1
@@ -22,10 +37,14 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_zextload_global_i32_i16
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; CHECK: $vgpr0 = COPY [[ZEXTLOAD]](s32)
+ ; GFX8-LABEL: name: test_zextload_global_i32_i16
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX8: $vgpr0 = COPY [[ZEXTLOAD]](s32)
+ ; GFX6-LABEL: name: test_zextload_global_i32_i16
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX6: $vgpr0 = COPY [[ZEXTLOAD]](s32)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s32) = G_ZEXTLOAD %0 :: (load 2, addrspace 1)
$vgpr0 = COPY %1
@@ -36,11 +55,16 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_zextload_global_i31_i8
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
- ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY [[ZEXTLOAD]](s32)
- ; CHECK: $vgpr0 = COPY [[COPY1]](s32)
+ ; GFX8-LABEL: name: test_zextload_global_i31_i8
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: [[COPY1:%[0-9]+]]:_(s32) = COPY [[ZEXTLOAD]](s32)
+ ; GFX8: $vgpr0 = COPY [[COPY1]](s32)
+ ; GFX6-LABEL: name: test_zextload_global_i31_i8
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[ZEXTLOAD]](s32)
+ ; GFX6: $vgpr0 = COPY [[COPY1]](s32)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s31) = G_ZEXTLOAD %0 :: (load 1, addrspace 1)
%2:_(s32) = G_ANYEXT %1
@@ -52,11 +76,16 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_zextload_global_i64_i8
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
- ; CHECK: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[ZEXTLOAD]](s32)
- ; CHECK: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
+ ; GFX8-LABEL: name: test_zextload_global_i64_i8
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[ZEXTLOAD]](s32)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
+ ; GFX6-LABEL: name: test_zextload_global_i64_i8
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[ZEXTLOAD]](s32)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s64) = G_ZEXTLOAD %0 :: (load 1, addrspace 1)
$vgpr0_vgpr1 = COPY %1
@@ -67,11 +96,16 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_zextload_global_i64_i16
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
- ; CHECK: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[ZEXTLOAD]](s32)
- ; CHECK: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
+ ; GFX8-LABEL: name: test_zextload_global_i64_i16
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX8: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[ZEXTLOAD]](s32)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
+ ; GFX6-LABEL: name: test_zextload_global_i64_i16
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(s32) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX6: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[ZEXTLOAD]](s32)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s64) = G_ZEXTLOAD %0 :: (load 2, addrspace 1)
$vgpr0_vgpr1 = COPY %1
@@ -82,12 +116,202 @@ body: |
bb.0:
liveins: $vgpr0_vgpr1
- ; CHECK-LABEL: name: test_zextload_global_i64_i32
- ; CHECK: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
- ; CHECK: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
- ; CHECK: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[LOAD]](s32)
- ; CHECK: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
+ ; GFX8-LABEL: name: test_zextload_global_i64_i32
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX8: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[LOAD]](s32)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
+ ; GFX6-LABEL: name: test_zextload_global_i64_i32
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX6: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[LOAD]](s32)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
%0:_(p1) = COPY $vgpr0_vgpr1
%1:_(s64) = G_ZEXTLOAD %0 :: (load 4, addrspace 1)
$vgpr0_vgpr1 = COPY %1
...
+
+---
+name: test_zextload_global_s32_from_2_align1
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_zextload_global_s32_from_2_align1
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX8: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
+ ; GFX8: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
+ ; GFX8: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 255
+ ; GFX8: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
+ ; GFX8: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
+ ; GFX8: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32)
+ ; GFX8: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C1]]
+ ; GFX8: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX8: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[AND1]], [[C2]](s16)
+ ; GFX8: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[SHL]]
+ ; GFX8: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16)
+ ; GFX8: $vgpr0 = COPY [[ZEXT]](s32)
+ ; GFX6-LABEL: name: test_zextload_global_s32_from_2_align1
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX6: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
+ ; GFX6: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
+ ; GFX6: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 255
+ ; GFX6: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
+ ; GFX6: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
+ ; GFX6: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX6: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 255
+ ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX6: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]]
+ ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32)
+ ; GFX6: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; GFX6: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[TRUNC1]]
+ ; GFX6: [[ZEXT:%[0-9]+]]:_(s32) = G_ZEXT [[OR]](s16)
+ ; GFX6: $vgpr0 = COPY [[ZEXT]](s32)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(s32) = G_ZEXTLOAD %0 :: (load 2, align 1, addrspace 1)
+ $vgpr0 = COPY %1
+...
+
+---
+name: test_zextload_global_s64_from_2_align1
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_zextload_global_s64_from_2_align1
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX8: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX8: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
+ ; GFX8: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
+ ; GFX8: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 255
+ ; GFX8: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
+ ; GFX8: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
+ ; GFX8: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD1]](s32)
+ ; GFX8: [[AND1:%[0-9]+]]:_(s16) = G_AND [[TRUNC1]], [[C1]]
+ ; GFX8: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
+ ; GFX8: [[SHL:%[0-9]+]]:_(s16) = G_SHL [[AND1]], [[C2]](s16)
+ ; GFX8: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[SHL]]
+ ; GFX8: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[OR]](s16)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
+ ; GFX6-LABEL: name: test_zextload_global_s64_from_2_align1
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY]](p1) :: (load 1, addrspace 1)
+ ; GFX6: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
+ ; GFX6: [[PTR_ADD:%[0-9]+]]:_(p1) = G_PTR_ADD [[COPY]], [[C]](s64)
+ ; GFX6: [[LOAD1:%[0-9]+]]:_(s32) = G_LOAD [[PTR_ADD]](p1) :: (load 1 + 1, addrspace 1)
+ ; GFX6: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 255
+ ; GFX6: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[LOAD]](s32)
+ ; GFX6: [[AND:%[0-9]+]]:_(s16) = G_AND [[TRUNC]], [[C1]]
+ ; GFX6: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 8
+ ; GFX6: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 255
+ ; GFX6: [[COPY1:%[0-9]+]]:_(s32) = COPY [[LOAD1]](s32)
+ ; GFX6: [[AND1:%[0-9]+]]:_(s32) = G_AND [[COPY1]], [[C3]]
+ ; GFX6: [[SHL:%[0-9]+]]:_(s32) = G_SHL [[AND1]], [[C2]](s32)
+ ; GFX6: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[SHL]](s32)
+ ; GFX6: [[OR:%[0-9]+]]:_(s16) = G_OR [[AND]], [[TRUNC1]]
+ ; GFX6: [[ZEXT:%[0-9]+]]:_(s64) = G_ZEXT [[OR]](s16)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[ZEXT]](s64)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(s64) = G_ZEXTLOAD %0 :: (load 2, align 1, addrspace 1)
+ $vgpr0_vgpr1 = COPY %1
+...
+
+---
+name: test_zextload_global_v2i16_from_2
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_zextload_global_v2i16_from_2
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s16>) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX8: $vgpr0 = COPY [[ZEXTLOAD]](<2 x s16>)
+ ; GFX6-LABEL: name: test_zextload_global_v2i16_from_2
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s16>) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX6: $vgpr0 = COPY [[ZEXTLOAD]](<2 x s16>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s16>) = G_ZEXTLOAD %0 :: (load 2, addrspace 1)
+ $vgpr0 = COPY %1
+...
+
+---
+name: test_zextload_global_v2i32_from_2
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_zextload_global_v2i32_from_2
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s32>) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[ZEXTLOAD]](<2 x s32>)
+ ; GFX6-LABEL: name: test_zextload_global_v2i32_from_2
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s32>) = G_ZEXTLOAD [[COPY]](p1) :: (load 2, addrspace 1)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[ZEXTLOAD]](<2 x s32>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s32>) = G_ZEXTLOAD %0 :: (load 2, addrspace 1)
+ $vgpr0_vgpr1 = COPY %1
+...
+
+---
+name: test_zextload_global_v2i32_from_4
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_zextload_global_v2i32_from_4
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s32>) = G_ZEXTLOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX8: $vgpr0_vgpr1 = COPY [[ZEXTLOAD]](<2 x s32>)
+ ; GFX6-LABEL: name: test_zextload_global_v2i32_from_4
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s32>) = G_ZEXTLOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX6: $vgpr0_vgpr1 = COPY [[ZEXTLOAD]](<2 x s32>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s32>) = G_ZEXTLOAD %0 :: (load 4, addrspace 1)
+ $vgpr0_vgpr1 = COPY %1
+...
+
+---
+name: test_zextload_global_v2i64_from_4
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_zextload_global_v2i64_from_4
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s64>) = G_ZEXTLOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX8: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[ZEXTLOAD]](<2 x s64>)
+ ; GFX6-LABEL: name: test_zextload_global_v2i64_from_4
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s64>) = G_ZEXTLOAD [[COPY]](p1) :: (load 4, addrspace 1)
+ ; GFX6: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[ZEXTLOAD]](<2 x s64>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s64>) = G_ZEXTLOAD %0 :: (load 4, addrspace 1)
+ $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
+...
+
+---
+name: test_zextload_global_v2i64_from_8
+body: |
+ bb.0:
+ liveins: $vgpr0_vgpr1
+
+ ; GFX8-LABEL: name: test_zextload_global_v2i64_from_8
+ ; GFX8: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX8: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s64>) = G_ZEXTLOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; GFX8: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[ZEXTLOAD]](<2 x s64>)
+ ; GFX6-LABEL: name: test_zextload_global_v2i64_from_8
+ ; GFX6: [[COPY:%[0-9]+]]:_(p1) = COPY $vgpr0_vgpr1
+ ; GFX6: [[ZEXTLOAD:%[0-9]+]]:_(<2 x s64>) = G_ZEXTLOAD [[COPY]](p1) :: (load 8, addrspace 1)
+ ; GFX6: $vgpr0_vgpr1_vgpr2_vgpr3 = COPY [[ZEXTLOAD]](<2 x s64>)
+ %0:_(p1) = COPY $vgpr0_vgpr1
+ %1:_(<2 x s64>) = G_ZEXTLOAD %0 :: (load 8, addrspace 1)
+ $vgpr0_vgpr1_vgpr2_vgpr3 = COPY %1
+...
diff --git a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect.mir b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect.mir
index eb0d73c9143b..5e8b492277c7 100644
--- a/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect.mir
+++ b/llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect.mir
@@ -44,7 +44,7 @@
define void @non_power_of_2() { ret void }
- define amdgpu_kernel void @load_constant_v4i16_from_6_align8(<3 x i16> addrspace(4)* %ptr0) {
+ define amdgpu_kernel void @load_constant_v4i16_from_8_align8(<3 x i16> addrspace(4)* %ptr0) {
ret void
}
@@ -188,15 +188,15 @@ body: |
...
---
-name: load_constant_v4i16_from_6_align8
+name: load_constant_v4i16_from_8_align8
legalized: true
body: |
bb.0:
- ; CHECK-LABEL: name: load_constant_v4i16_from_6_align8
+ ; CHECK-LABEL: name: load_constant_v4i16_from_8_align8
; CHECK: [[COPY:%[0-9]+]]:sgpr(p4) = COPY $sgpr0_sgpr1
- ; CHECK: [[LOAD:%[0-9]+]]:sgpr(<4 x s16>) = G_LOAD [[COPY]](p4) :: (load 6 from %ir.ptr0, align 8, addrspace 4)
+ ; CHECK: [[LOAD:%[0-9]+]]:sgpr(<4 x s16>) = G_LOAD [[COPY]](p4) :: (load 8 from %ir.ptr0, addrspace 4)
%0:_(p4) = COPY $sgpr0_sgpr1
- %1:_(<4 x s16>) = G_LOAD %0 :: (load 6 from %ir.ptr0, align 8, addrspace 4)
+ %1:_(<4 x s16>) = G_LOAD %0 :: (load 8 from %ir.ptr0, align 8, addrspace 4)
...
More information about the llvm-commits
mailing list