[PATCH] D24924: [PPC] Better codegen for AND, ANY_EXT, SRL sequence

Mon Sep 26 09:35:13 PDT 2016

amehsan created this revision.
amehsan added reviewers: hfinkel, kbarton, nemanjai.
amehsan added subscribers: llvm-commits, Carrot, echristo.
Herald added a subscriber: nemanjai.

This fixes the first issue exposed by PR30483. (See comment 1 in the IR). 

First I tried to fix a more general problem in DAG combine: handling all sequences of shifts and ANDs that have an extension in the middle. My idea was to move the extension instruction to the beginning of the sequence so that the shifts and ANDs could be merged during isel. That approach did not work because this particular sequence is itself created by the target-independent DAG combine in DAGCombiner::visitSRL.

Instead, I decided to handle this particular sequence in isel. If we encounter similar issues, we can think about a more general solution.
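
To illustrate the fold, here is a small standalone C++ model of the arithmetic (it is not LLVM code). The helper names rotl64, rldicl, shiftExtAnd, and folded are hypothetical and exist only for this sketch. The garbageHigh parameter stands in for the unspecified upper 32 bits that ANY_EXTEND (or INSERT_SUBREG into an IMPLICIT_DEF) leaves in the 64-bit register. The sketch also assumes the AND mask keeps only the k lowest bits with k <= 32 - n, so that no unspecified bit is ever selected; that restriction is an assumption of the model, not a statement about the exact condition checked in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit value left by sh bits (sh is taken modulo 64).
static uint64_t rotl64(uint64_t x, unsigned sh) {
  sh &= 63;
  return sh == 0 ? x : (x << sh) | (x >> (64 - sh));
}

// Model of rldicl: rotate left by sh, then clear the mb leftmost bits
// (equivalently, keep the 64 - mb rightmost bits). Requires mb <= 63.
static uint64_t rldicl(uint64_t x, unsigned sh, unsigned mb) {
  assert(mb <= 63);
  return rotl64(x, sh) & (~0ULL >> mb);
}

// Original sequence: and(any_extend(srl(x, n)), low-k-bits mask).
// garbageHigh models the unspecified upper half produced by any_extend.
static uint64_t shiftExtAnd(uint32_t x, unsigned n, unsigned k,
                            uint32_t garbageHigh) {
  assert(n < 32 && k >= 1 && k <= 32 - n); // assumption of this sketch
  uint64_t ext = (uint64_t(garbageHigh) << 32) | uint64_t(x >> n);
  return ext & ((1ULL << k) - 1);
}

// Folded form: place x in the low half of a register whose upper half is
// unspecified, then do a single rldicl with SH = 64 - n and MB = 64 - k.
static uint64_t folded(uint32_t x, unsigned n, unsigned k,
                       uint32_t garbageHigh) {
  assert(n < 32 && k >= 1 && k <= 32 - n); // assumption of this sketch
  uint64_t reg = (uint64_t(garbageHigh) << 32) | uint64_t(x);
  return rldicl(reg, 64 - n, 64 - k);
}
```

With n = 3 and k = 1 this gives SH = 61 and MB = 63, which corresponds to the "rldicl ..., 61, 63" that the new test checks for the "and i32 %x, 8" followed by a compare against zero.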


https://reviews.llvm.org/D24924

Files:
  lib/Target/PowerPC/PPCISelDAGToDAG.cpp
  test/CodeGen/PowerPC/anyext_srl.ll

Index: test/CodeGen/PowerPC/anyext_srl.ll
===================================================================

--- /dev/null
+++ test/CodeGen/PowerPC/anyext_srl.ll
@@ -0,0 +1,29 @@
+; RUN: llc -verify-machineinstrs -mcpu=pwr8 < %s | FileCheck %s
+
+%class.PB2 = type { [1 x i32], %class.PB1* }
+%class.PB1 = type { [1 x i32], i64, i64, i32 }
+
+; Function Attrs: norecurse nounwind readonly
+define zeroext i1 @foo(%class.PB2* nocapture readonly dereferenceable(16) %s_a, %class.PB2* nocapture readonly dereferenceable(16) %s_b) local_unnamed_addr #0 {
+entry:
+  %arrayidx.i6 = bitcast %class.PB2* %s_a to i32*
+  %0 = load i32, i32* %arrayidx.i6, align 8, !tbaa !1
+  %and.i = and i32 %0, 8
+  %cmp.i = icmp ne i32 %and.i, 0
+  %arrayidx.i37 = bitcast %class.PB2* %s_b to i32*
+  %1 = load i32, i32* %arrayidx.i37, align 8, !tbaa !1
+  %and.i4 = and i32 %1, 8
+  %cmp.i5 = icmp ne i32 %and.i4, 0
+  %cmp = xor i1 %cmp.i, %cmp.i5
+  ret i1 %cmp
+; CHECK-LABEL: @foo
+; CHECK: rldicl  {{[0-9]+}}, {{[0-9]+}}, 61, 63
+
+}
+
+!0 = !{!"clang version 4.0.0 (http://llvm.org/git/clang.git 7981b20f318488a10e7c0c8e0f0ca502e02e74cd) (http://llvm.org/git/llvm.git 3b621275428532a32a2806585282fa025af2d241)"}
+!1 = !{!2, !2, i64 0}
+!2 = !{!"int", !3, i64 0}
+!3 = !{!"omnipotent char", !4, i64 0}
+!4 = !{!"Simple C++ TBAA"}
+
Index: lib/Target/PowerPC/PPCISelDAGToDAG.cpp
===================================================================
--- lib/Target/PowerPC/PPCISelDAGToDAG.cpp
+++ lib/Target/PowerPC/PPCISelDAGToDAG.cpp
@@ -2636,6 +2636,19 @@
       MB = 64 - countTrailingOnes(Imm64);
       SH = 0;
 
+      auto Op0 = Val.getOperand(0);
+      if (Val.getOpcode() == ISD::ANY_EXTEND && Op0.getOpcode() == ISD::SRL &&
+          isInt32Immediate(Op0.getOperand(1).getNode(), Imm) && Imm <= MB) {
+
+        auto ResultType = Val.getNode()->getValueType(0);
+        auto ImDef = CurDAG->getMachineNode(PPC::IMPLICIT_DEF, dl, ResultType);
+        SDValue IDVal (ImDef, 0);
+
+        Val = SDValue(CurDAG->getMachineNode(PPC::INSERT_SUBREG, dl, ResultType,
+                      IDVal, Op0.getOperand(0), getI32Imm(1, dl)), 0);
+        SH = 64 - Imm;
+      }
+
       // If the operand is a logical right shift, we can fold it into this
       // instruction: rldicl(rldicl(x, 64-n, n), 0, mb) -> rldicl(x, 64-n, mb)
       // for n <= mb. The right shift is really a left rotate followed by a

