[llvm] 7153010 - [PowerPC] Fix invalid cast for vector shuffles when lowering to the xxsplti32dx instruction.

Amy Kwan via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 24 07:57:08 PDT 2022


Author: Amy Kwan
Date: 2022-10-24T09:56:54-05:00
New Revision: 715301056ee0b12d01463ea32ff0f006392f2d12

URL: https://github.com/llvm/llvm-project/commit/715301056ee0b12d01463ea32ff0f006392f2d12
DIFF: https://github.com/llvm/llvm-project/commit/715301056ee0b12d01463ea32ff0f006392f2d12.diff

LOG: [PowerPC] Fix invalid cast for vector shuffles when lowering to the xxsplti32dx instruction.

When lowering vector shuffles into the xxsplti32dx instruction on Power10, we
canonicalize the right operand to be a BUILD_VECTOR and as a result, get the
commuted vector shuffle node.
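
For readers unfamiliar with commuting a shuffle: swapping the two inputs
means every mask index has to be moved to the other input's range. The block
below is a minimal standalone sketch of that remapping, assuming two inputs of
equal width and negative values for undef lanes. The free function and its
std::vector interface are illustrative assumptions, not LLVM's API.

```cpp
#include <cassert>
#include <vector>

// Illustrative helper (hypothetical, not the LLVM implementation).
// For a two-input shuffle of N-element vectors, indices 0..N-1 select from
// the first input and N..2N-1 from the second. After swapping the inputs,
// each index must move to the other range. Negative indices (undef lanes)
// are left untouched.
std::vector<int> commuteMask(std::vector<int> Mask) {
  const int NumElts = static_cast<int>(Mask.size());
  for (int &Idx : Mask) {
    if (Idx < 0)
      continue; // undef lane stays undef
    Idx = (Idx < NumElts) ? Idx + NumElts : Idx - NumElts;
  }
  return Mask;
}
```

For example, commuting the mask {0, 5, -1, 7} for 4-element inputs gives
{4, 1, -1, 3}.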

However, commuting a vector shuffle does not always return a vector shuffle
node. To handle this, the patch replaces the original cast<> of the result
with a dyn_cast<> and checks that the result is a valid vector shuffle node
before obtaining the commuted shuffle mask.
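
The difference between an unchecked and a checked downcast can be sketched
outside of LLVM as follows. This is only an illustration under stated
assumptions: the Node, ShuffleNode and BuildVectorNode types and the
getMaskIfShuffle helper are hypothetical stand-ins, and standard C++
dynamic_cast is used in place of LLVM's dyn_cast<>, which relies on LLVM's own
RTTI scheme rather than compiler RTTI.

```cpp
#include <cassert>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for DAG node kinds; not LLVM's real classes.
struct Node {
  virtual ~Node() = default;
};

struct ShuffleNode : Node {
  std::vector<int> Mask;
  explicit ShuffleNode(std::vector<int> M) : Mask(std::move(M)) {}
};

struct BuildVectorNode : Node {};

// Checked downcast: return the mask only when N really is a shuffle.
// Otherwise return std::nullopt so the caller can give up on this lowering
// instead of reading through a pointer of the wrong type.
std::optional<std::vector<int>> getMaskIfShuffle(const Node *N) {
  const auto *SV = dynamic_cast<const ShuffleNode *>(N);
  if (!SV)
    return std::nullopt;
  return SV->Mask;
}
```

A caller would test the returned optional and bail out when it is empty,
which mirrors the early return added in the diff below.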

This patch also adds a new test case that demonstrates this scenario (primarily
seen on 32-bit targets), which crashed prior to this fix.

Differential Revision: https://reviews.llvm.org/D135024

Added: 
    

Modified: 
    llvm/lib/Target/PowerPC/PPCISelLowering.cpp
    llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
index 9c52e6cd3a3e8..298004772d700 100644
--- a/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+++ b/llvm/lib/Target/PowerPC/PPCISelLowering.cpp
@@ -9772,8 +9772,11 @@ SDValue PPCTargetLowering::lowerToXXSPLTI32DX(ShuffleVectorSDNode *SVN,
   // Canonicalize the RHS being a BUILD_VECTOR when lowering to xxsplti32dx.
   if (RHS->getOpcode() != ISD::BUILD_VECTOR) {
     std::swap(LHS, RHS);
-    VecShuffle = DAG.getCommutedVectorShuffle(*SVN);
-    ShuffleMask = cast<ShuffleVectorSDNode>(VecShuffle)->getMask();
+    VecShuffle = peekThroughBitcasts(DAG.getCommutedVectorShuffle(*SVN));
+    ShuffleVectorSDNode *CommutedSV = dyn_cast<ShuffleVectorSDNode>(VecShuffle);
+    if (!CommutedSV)
+      return SDValue();
+    ShuffleMask = CommutedSV->getMask();
   }
 
   // Ensure that the RHS is a vector of constants.

diff  --git a/llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll b/llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll
index 2292c67f7d864..ad6a576fbf50e 100644
--- a/llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll
+++ b/llvm/test/CodeGen/PowerPC/p10-splatImm32-undef.ll
@@ -8,6 +8,12 @@
 ; RUN: llc -verify-machineinstrs -mtriple=powerpc64-ibm-aix-xcoff \
 ; RUN:     -ppc-asm-full-reg-names -mcpu=pwr10 < %s | \
 ; RUN:     FileCheck %s --check-prefix=CHECK-AIX
+; RUN: llc -verify-machineinstrs -mtriple=powerpc-unknown-linux-gnu \
+; RUN:     -ppc-asm-full-reg-names -mcpu=pwr10 < %s | \
+; RUN:     FileCheck %s --check-prefix=CHECK-LINUX-32
+; RUN: llc -verify-machineinstrs -mtriple=powerpc-ibm-aix-xcoff \
+; RUN:     -ppc-asm-full-reg-names -mcpu=pwr10 < %s | \
+; RUN:     FileCheck %s --check-prefix=CHECK-AIX-32
 
 declare hidden i32 @call1()
 define hidden void @function1() {
@@ -49,6 +55,37 @@ define hidden void @function1() {
 ; CHECK-AIX-NEXT:    ld r0, 16(r1)
 ; CHECK-AIX-NEXT:    mtlr r0
 ; CHECK-AIX-NEXT:    blr
+;
+; CHECK-LINUX-32-LABEL: function1:
+; CHECK-LINUX-32:       # %bb.0: # %entry
+; CHECK-LINUX-32-NEXT:    mflr r0
+; CHECK-LINUX-32-NEXT:    stw r0, 4(r1)
+; CHECK-LINUX-32-NEXT:    stwu r1, -48(r1)
+; CHECK-LINUX-32-NEXT:    .cfi_def_cfa_offset 48
+; CHECK-LINUX-32-NEXT:    .cfi_offset lr, 4
+; CHECK-LINUX-32-NEXT:    bl call1
+; CHECK-LINUX-32-NEXT:    li r4, 0
+; CHECK-LINUX-32-NEXT:    stw r3, 16(r1)
+; CHECK-LINUX-32-NEXT:    stw r4, 32(r1)
+; CHECK-LINUX-32-NEXT:    lwz r0, 52(r1)
+; CHECK-LINUX-32-NEXT:    addi r1, r1, 48
+; CHECK-LINUX-32-NEXT:    mtlr r0
+; CHECK-LINUX-32-NEXT:    blr
+;
+; CHECK-AIX-32-LABEL: function1:
+; CHECK-AIX-32:       # %bb.0: # %entry
+; CHECK-AIX-32-NEXT:    mflr r0
+; CHECK-AIX-32-NEXT:    stw r0, 8(r1)
+; CHECK-AIX-32-NEXT:    stwu r1, -96(r1)
+; CHECK-AIX-32-NEXT:    bl .call1[PR]
+; CHECK-AIX-32-NEXT:    nop
+; CHECK-AIX-32-NEXT:    li r4, 0
+; CHECK-AIX-32-NEXT:    stw r3, 64(r1)
+; CHECK-AIX-32-NEXT:    stw r4, 80(r1)
+; CHECK-AIX-32-NEXT:    addi r1, r1, 96
+; CHECK-AIX-32-NEXT:    lwz r0, 8(r1)
+; CHECK-AIX-32-NEXT:    mtlr r0
+; CHECK-AIX-32-NEXT:    blr
 entry:
   %tailcall1 = tail call i32 @call1()
   %0 = insertelement <4 x i32> poison, i32 %tailcall1, i64 1


        

