[llvm] r283961 - [PPCMIPeephole] Fix splat elimination
Tim Shen via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 11 17:48:25 PDT 2016
Author: timshen
Date: Tue Oct 11 19:48:25 2016
New Revision: 283961
URL: http://llvm.org/viewvc/llvm-project?rev=283961&view=rev
Log:
[PPCMIPeephole] Fix splat elimination
Summary:
In PPCMIPeephole, when we see two splat instructions, we can't simply do the following transformation:
B = Splat A
C = Splat B
=>
C = Splat A
because B may still be used between these two instructions. Instead, we should make the second Splat a PPC::COPY and let later passes decide whether to remove it or not:
B = Splat A
C = Splat B
=>
B = Splat A
C = COPY B
Fixes PR30663.
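The hazard described above can be reproduced in a toy model. The sketch below is illustrative only and does not use the LLVM API: `Inst`, `Block`, `allUsesDefined`, `retargetFirstSplat`, `replaceSecondWithCopy` and `exampleBlock` are hypothetical names, and "use" stands for any instruction that reads B between the two splats. It suggests why renaming the first splat's output leaves a read of B with no definition, while rewriting the second splat as a copy keeps every read defined.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a machine instruction (hypothetical, not llvm::MachineInstr).
struct Inst {
  std::string Opcode; // "SPLAT", "COPY", or "USE"
  std::string Def;    // register written; empty for a pure use
  std::string Src;    // register read
};

using Block = std::vector<Inst>;

// True if every register read in the block was written earlier or is live-in.
bool allUsesDefined(const Block &BB, std::set<std::string> Defined) {
  for (const Inst &I : BB) {
    if (!Defined.count(I.Src))
      return false;
    if (!I.Def.empty())
      Defined.insert(I.Def);
  }
  return true;
}

// Old behaviour: rename the first splat's output to the second splat's
// output, then erase the second splat.
Block retargetFirstSplat(Block BB, std::size_t First, std::size_t Second) {
  assert(First < Second && Second < BB.size());
  BB[First].Def = BB[Second].Def;
  BB.erase(BB.begin() + static_cast<std::ptrdiff_t>(Second));
  return BB;
}

// New behaviour: leave the first splat alone and turn the second into a copy
// of the first splat's output.
Block replaceSecondWithCopy(Block BB, std::size_t First, std::size_t Second) {
  assert(First < Second && Second < BB.size());
  Inst Copy{"COPY", BB[Second].Def, BB[First].Def};
  BB[Second] = Copy;
  return BB;
}

// B = SPLAT A; use B; C = SPLAT B; use C
Block exampleBlock() {
  return {{"SPLAT", "B", "A"},
          {"USE", "", "B"},
          {"SPLAT", "C", "B"},
          {"USE", "", "C"}};
}
```

With A as the only live-in register, `retargetFirstSplat(exampleBlock(), 0, 2)` produces a block in which the read of B has no earlier definition, whereas `replaceSecondWithCopy(exampleBlock(), 0, 2)` keeps all reads defined and leaves the copy for a later pass to coalesce or remove.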
Reviewers: echristo, iteratee, kbarton, nemanjai
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D25493
Added:
llvm/trunk/test/CodeGen/PowerPC/pr30663.ll
Modified:
llvm/trunk/lib/Target/PowerPC/PPCMIPeephole.cpp
Modified: llvm/trunk/lib/Target/PowerPC/PPCMIPeephole.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCMIPeephole.cpp?rev=283961&r1=283960&r2=283961&view=diff
==============================================================================
--- llvm/trunk/lib/Target/PowerPC/PPCMIPeephole.cpp (original)
+++ llvm/trunk/lib/Target/PowerPC/PPCMIPeephole.cpp Tue Oct 11 19:48:25 2016
@@ -201,11 +201,13 @@ bool PPCMIPeephole::simplifyCode(void) {
// Splat fed by another splat - switch the output of the first
// and remove the second.
if (SameOpcode) {
- DefMI->getOperand(0).setReg(MI.getOperand(0).getReg());
+ DEBUG(dbgs() << "Changing redundant splat to a copy: ");
+ DEBUG(MI.dump());
+ BuildMI(MBB, &MI, MI.getDebugLoc(), TII->get(PPC::COPY),
+ MI.getOperand(0).getReg())
+ .addOperand(MI.getOperand(OpNo));
ToErase = &MI;
Simplified = true;
- DEBUG(dbgs() << "Removing redundant splat: ");
- DEBUG(MI.dump());
}
// Splat fed by a shift. Usually when we align value to splat into
// vector element zero.
Added: llvm/trunk/test/CodeGen/PowerPC/pr30663.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/pr30663.ll?rev=283961&view=auto
==============================================================================
--- llvm/trunk/test/CodeGen/PowerPC/pr30663.ll (added)
+++ llvm/trunk/test/CodeGen/PowerPC/pr30663.ll Tue Oct 11 19:48:25 2016
@@ -0,0 +1,24 @@
+; RUN: llc -O1 < %s | FileCheck %s
+target triple = "powerpc64le-linux-gnu"
+
+; The second xxspltw should be eliminated.
+; CHECK: xxspltw
+; CHECK-NOT: xxspltw
+define void @Test() {
+bb4:
+ %tmp = load <4 x i8>, <4 x i8>* undef
+ %tmp8 = bitcast <4 x i8> %tmp to float
+ %tmp18 = fmul float %tmp8, undef
+ %tmp19 = fsub float 0.000000e+00, %tmp18
+ store float %tmp19, float* undef
+ %tmp22 = shufflevector <4 x i8> %tmp, <4 x i8> undef, <16 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3>
+ %tmp23 = bitcast <16 x i8> %tmp22 to <4 x float>
+ %tmp25 = tail call <4 x float> @llvm.fma.v4f32(<4 x float> undef, <4 x float> %tmp23, <4 x float> undef)
+ %tmp26 = fsub <4 x float> zeroinitializer, %tmp25
+ %tmp27 = bitcast <4 x float> %tmp26 to <4 x i32>
+ tail call void @llvm.ppc.altivec.stvx(<4 x i32> %tmp27, i8* undef)
+ ret void
+}
+
+declare void @llvm.ppc.altivec.stvx(<4 x i32>, i8*)
+declare <4 x float> @llvm.fma.v4f32(<4 x float>, <4 x float>, <4 x float>)