[llvm] 7153010 - [PowerPC] Fix invalid cast for vector shuffles when lowering to the xxsplti32dx instruction.
Amy Kwan via llvm-commits
llvm-commits at lists.llvm.org
Mon Oct 24 07:57:08 PDT 2022
Author: Amy Kwan
Date: 2022-10-24T09:56:54-05:00
New Revision: 715301056ee0b12d01463ea32ff0f006392f2d12
URL: https://github.com/llvm/llvm-project/commit/715301056ee0b12d01463ea32ff0f006392f2d12
DIFF: https://github.com/llvm/llvm-project/commit/715301056ee0b12d01463ea32ff0f006392f2d12.diff
LOG: [PowerPC] Fix invalid cast for vector shuffles when lowering to the xxsplti32dx instruction.
When lowering vector shuffles to the xxsplti32dx instruction on Power10, we
canonicalize the right operand to be a BUILD_VECTOR. If the operands have to
be swapped to achieve this, we also obtain the commuted vector shuffle node.
However, commuting a vector shuffle does not always return a vector shuffle
node. To handle that case, this patch replaces the original cast<> of the
commuted shuffle with a dyn_cast<> and checks that the result is a valid vector
shuffle node before obtaining the commuted shuffle mask.
This patch also adds a new test case that demonstrates this scenario (primarily
seen on 32-bit), which crashed prior to this fix.
Differential Revision: https://reviews.llvm.org/D135024
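The difference between an unchecked cast and a checked dyn_cast<>-style cast
can be illustrated outside of LLVM. The following is only a minimal sketch:
Node, ShuffleNode, Opcode, dynCast and getCommutedMask are hypothetical
stand-ins invented for illustration, not LLVM's SDNode classes or its real
casting API. It shows the pattern the patch adopts: test the node kind, and
bail out instead of assuming that the commuted node is still a shuffle.

```cpp
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for SelectionDAG node kinds; not LLVM's real classes.
enum class Opcode { VectorShuffle, BuildVector };

struct Node {
  Opcode Op;
  explicit Node(Opcode Op) : Op(Op) {}
  virtual ~Node() = default;
};

struct ShuffleNode : Node {
  std::vector<int> Mask;
  explicit ShuffleNode(std::vector<int> M)
      : Node(Opcode::VectorShuffle), Mask(std::move(M)) {}
  // Kind test used by the checked cast below (LLVM-style RTTI convention).
  static bool classof(const Node *N) { return N->Op == Opcode::VectorShuffle; }
};

// Checked downcast in the spirit of dyn_cast<>: returns null on a kind
// mismatch instead of asserting or invoking undefined behaviour.
template <typename To> const To *dynCast(const Node *N) {
  return (N && To::classof(N)) ? static_cast<const To *>(N) : nullptr;
}

// Mirrors the patched control flow: give up when the node handed back after
// commuting is not a shuffle, otherwise read its mask.
std::optional<std::vector<int>> getCommutedMask(const Node *Commuted) {
  const auto *SV = dynCast<ShuffleNode>(Commuted);
  if (!SV)
    return std::nullopt; // analogous to `return SDValue();` in the patch
  return SV->Mask;
}
```

In this sketch, passing a ShuffleNode yields its mask, while passing any other
kind of node yields an empty optional. An unchecked cast in the second case
would read a mask from an object that does not have one.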
Added:
Modified:
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll
Removed:
################################################################################
diff --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 9c52e6cd3a3e8..298004772d700 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -9772,8 +9772,11 @@ SDValue PPCTargetLowering::lowerToXXSPLTI32DX(ShuffleVectorSDNode *SVN,
// Canonicalize the RHS being a BUILD_VECTOR when lowering to xxsplti32dx.
if (RHS->getOpcode() != ISD::BUILD_VECTOR) {
std::swap(LHS, RHS);
- VecShuffle = DAG.getCommutedVectorShuffle(*SVN);
- ShuffleMask = cast<ShuffleVectorSDNode>(VecShuffle)->getMask();
+ VecShuffle = peekThroughBitcasts(DAG.getCommutedVectorShuffle(*SVN));
+ ShuffleVectorSDNode *CommutedSV = dyn_cast<ShuffleVectorSDNode>(VecShuffle);
+ if (!CommutedSV)
+ return SDValue();
+ ShuffleMask = CommutedSV->getMask();
}
// Ensure that the RHS is a vector of constants.
diff --git a/llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll b/llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll
index 2292c67f7d864..ad6a576fbf50e 100644
--- a/llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll
+++ b/llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll
@@ -8,6 +8,12 @@
; RUN: llc -verify-machineinstrs -mtriple=powerpc64-ibm-aix-xcoff \
; RUN: -ppc-asm-full-reg-names -mcpu=pwr10 < %s | \
; RUN: FileCheck %s --check-prefix=CHECK-AIX
+; RUN: llc -verify-machineinstrs -mtriple=powerpc-unknown-linux-gnu \
+; RUN: -ppc-asm-full-reg-names -mcpu=pwr10 < %s | \
+; RUN: FileCheck %s --check-prefix=CHECK-LINUX-32
+; RUN: llc -verify-machineinstrs -mtriple=powerpc-ibm-aix-xcoff \
+; RUN: -ppc-asm-full-reg-names -mcpu=pwr10 < %s | \
+; RUN: FileCheck %s --check-prefix=CHECK-AIX-32
declare hidden i32 @call1()
define hidden void @function1() {
@@ -49,6 +55,37 @@ define hidden void @function1() {
; CHECK-AIX-NEXT: ld r0, 16(r1)
; CHECK-AIX-NEXT: mtlr r0
; CHECK-AIX-NEXT: blr
+;
+; CHECK-LINUX-32-LABEL: function1:
+; CHECK-LINUX-32: # %bb.0: # %entry
+; CHECK-LINUX-32-NEXT: mflr r0
+; CHECK-LINUX-32-NEXT: stw r0, 4(r1)
+; CHECK-LINUX-32-NEXT: stwu r1, -48(r1)
+; CHECK-LINUX-32-NEXT: .cfi_def_cfa_offset 48
+; CHECK-LINUX-32-NEXT: .cfi_offset lr, 4
+; CHECK-LINUX-32-NEXT: bl call1
+; CHECK-LINUX-32-NEXT: li r4, 0
+; CHECK-LINUX-32-NEXT: stw r3, 16(r1)
+; CHECK-LINUX-32-NEXT: stw r4, 32(r1)
+; CHECK-LINUX-32-NEXT: lwz r0, 52(r1)
+; CHECK-LINUX-32-NEXT: addi r1, r1, 48
+; CHECK-LINUX-32-NEXT: mtlr r0
+; CHECK-LINUX-32-NEXT: blr
+;
+; CHECK-AIX-32-LABEL: function1:
+; CHECK-AIX-32: # %bb.0: # %entry
+; CHECK-AIX-32-NEXT: mflr r0
+; CHECK-AIX-32-NEXT: stw r0, 8(r1)
+; CHECK-AIX-32-NEXT: stwu r1, -96(r1)
+; CHECK-AIX-32-NEXT: bl .call1[PR]
+; CHECK-AIX-32-NEXT: nop
+; CHECK-AIX-32-NEXT: li r4, 0
+; CHECK-AIX-32-NEXT: stw r3, 64(r1)
+; CHECK-AIX-32-NEXT: stw r4, 80(r1)
+; CHECK-AIX-32-NEXT: addi r1, r1, 96
+; CHECK-AIX-32-NEXT: lwz r0, 8(r1)
+; CHECK-AIX-32-NEXT: mtlr r0
+; CHECK-AIX-32-NEXT: blr
entry:
%tailcall1 = tail call i32 @call1()
%0 = insertelement <4 x i32> poison, i32 %tailcall1, i64 1