[llvm] r340016 - [PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction

Nemanja Ivanovic via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 27 04:22:15 PDT 2018


Re-committed in r340734.
Thanks again for the investigation.

Nemanja
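
For anyone reading this thread later, here is a minimal standalone sketch of
the two points discussed below: what extswsli computes (sign-extend the low
32-bit word to 64 bits, then shift left by an immediate), and the kind of
result-width check whose absence led to the crash. This is a plain C++ model,
not LLVM code. The names extswsli_model, can_form_extswsli and byte_offset are
hypothetical, and the exact condition used in r340734 is not quoted in this
thread, so the result-width check is an assumption based on the explanation
in the quoted messages.

```cpp
#include <cassert>
#include <cstdint>

// Model of extswsli: sign-extend the low 32-bit word to 64 bits, then shift
// left by the immediate. The shift is done on an unsigned value so that a
// negative input never triggers a signed left shift.
constexpr std::uint64_t extswsli_model(std::uint32_t word, unsigned sh) {
  return static_cast<std::uint64_t>(
             static_cast<std::int64_t>(static_cast<std::int32_t>(word)))
         << (sh & 63u);
}

// Hypothetical stand-in for the combine's precondition. The original patch
// checked the subtarget, the 32-bit source and the constant shift amount;
// the resultBits == 64 test is the assumed missing piece, since a sign
// extension from i32 may also produce a wider type such as i128.
constexpr bool can_form_extswsli(bool isISA3_0, unsigned srcBits,
                                 unsigned resultBits, bool shiftIsConstant) {
  return isISA3_0 && shiftIsConstant && srcBits == 32 && resultBits == 64;
}

// Byte offset of element `index` in an array whose element size is
// 2^log2ElemSize bytes, which is the pattern the extswsli.ll test exercises
// (an i32 index into an i32 array, so a shift by 2).
inline std::uint64_t byte_offset(std::int32_t index, unsigned log2ElemSize) {
  assert(log2ElemSize < 64 && "extswsli takes a 6-bit shift immediate");
  return extswsli_model(static_cast<std::uint32_t>(index), log2ElemSize);
}

static_assert(extswsli_model(5u, 2) == 20u, "positive word, shift by 2");
static_assert(extswsli_model(0xFFFFFFFFu, 2) == 0xFFFFFFFFFFFFFFFCull,
              "-1 stays negative after the extend and shift");
static_assert(can_form_extswsli(true, 32, 64, true), "i32 -> i64 is eligible");
static_assert(!can_form_extswsli(true, 32, 128, true),
              "i32 -> i128 must be left alone");
```

In this model the reduced test case corresponds to the i32 -> i128 row: the
combine has to bail out there instead of building a node that is hard-coded
to produce an i64.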

On Fri, Aug 24, 2018 at 11:43 AM Nemanja Ivanovic <nemanja.i.ibm at gmail.com>
wrote:

> OK. I'll re-commit the updated patch later today.
> Thanks so much for investigating, Ali.
>
> Nemanja
>
> On Thu, Aug 23, 2018 at 7:46 PM Ali Tamur <tamur at google.com> wrote:
>
>> Hi Nemanja,
>> Yes, it seems to work: with your patch applied, I no longer get a
>> segfault.
>>
>> Thanks,
>> Ali
>>
>>
>> On Thu, Aug 23, 2018 at 3:41 PM Nemanja Ivanovic <nemanja.i.ibm at gmail.com>
>> wrote:
>>
>>> This is excellent, thank you for providing the test case. I see now what
>>> the omission is in the original patch. There was an implicit assumption
>>> that a sign-extend from a 32-bit value would be to a 64-bit value, but that
>>> of course need not be the case. Would you be able to try your original test
>>> with the attached patch to make sure the problem goes away before I
>>> re-commit the patch?
>>>
>>> Thank you so much for providing a reduced test case.
>>>
>>> Nemanja
>>>
>>> On Thu, Aug 23, 2018 at 4:12 PM Ali Tamur <tamur at google.com> wrote:
>>>
>>>> Hi,
>>>> Here is a small IR test case that causes a segfault only when llc is
>>>> built with the "[PowerPC] Generate Power9 extswsli extend sign and shift
>>>> immediate instruction" change.
>>>>
>>>> To reproduce:
>>>> $  llvm-as test_case.ll -o test_case.bc
>>>> $  opt test_case.bc -ee-instrument -tbaa -scoped-noalias -simplifycfg
>>>> -sroa -early-cse -lower-expect -forceattrs -tbaa -scoped-noalias
>>>> -inferattrs -ipsccp -called-value-propagation -globalopt -mem2reg
>>>> -deadargelim -instcombine -simplifycfg -globals-aa -prune-eh -inline
>>>> -functionattrs -sroa -early-cse-memssa -speculative-execution
>>>> -jump-threading -correlated-propagation -simplifycfg -instcombine
>>>> -libcalls-shrinkwrap -pgo-memop-opt -tailcallelim -simplifycfg -reassociate
>>>> -loop-rotate -licm -loop-unswitch -simplifycfg -instcombine -indvars
>>>> -loop-idiom -loop-deletion -loop-unroll -mldst-motion -gvn -memcpyopt -sccp
>>>> -bdce -instcombine -jump-threading -correlated-propagation -dse -licm -adce
>>>> -simplifycfg -instcombine -barrier -elim-avail-extern -rpo-functionattrs
>>>> -globalopt -globaldce -globals-aa -float2int -loop-rotate -loop-distribute
>>>> -loop-vectorize -loop-load-elim -instcombine -simplifycfg -instcombine
>>>> -loop-unroll -instcombine -licm -alignment-from-assumptions
>>>> -strip-dead-prototypes -globaldce -constmerge -loop-sink -instsimplify
>>>> -div-rem-pairs -simplifycfg
>>>>
>>>> >>>
>>>> ; ModuleID = 'bugpoint-reduced-simplified.bc'
>>>> source_filename = "experimental/users/tamur/llvm/example.cc"
>>>> target datalayout = "e-m:e-i64:64-n32:64"
>>>> target triple = "powerpc64le-grtev4-linux-gnu"
>>>>
>>>> ; Function Attrs: nounwind
>>>> define void @_Test(i32 signext) local_unnamed_addr #0 align 2 {
>>>>   %2 = ashr i32 %0, 31
>>>>   %3 = sext i32 %2 to i64
>>>>   %4 = zext i64 %3 to i128
>>>>   %5 = shl nuw i128 %4, 64
>>>>   %6 = or i128 %5, 0
>>>>   %7 = or i128 undef, 0
>>>>   %8 = mul i128 %7, %6
>>>>   %9 = lshr i128 %8, 64
>>>>   %10 = trunc i128 %9 to i64
>>>>   %11 = load i64, i64* undef, align 8
>>>>   %12 = and i64 %11, %10
>>>>   %13 = call i64 @llvm.bswap.i64(i64 %12) #2
>>>>   store i64 %13, i64* undef, align 8
>>>>   br i1 undef, label %15, label %14
>>>>
>>>> ; <label>:14:                                     ; preds = %1
>>>>   br label %15
>>>>
>>>> ; <label>:15:                                     ; preds = %14, %1
>>>>   ret void
>>>> }
>>>>
>>>> ; Function Attrs: nounwind readnone speculatable
>>>> declare i64 @llvm.bswap.i64(i64) #1
>>>>
>>>> attributes #0 = { nounwind
>>>> "correctly-rounded-divide-sqrt-fp-math"="false"
>>>> "disable-tail-calls"="false" "less-precise-fpmad"="false"
>>>> "no-frame-pointer-elim"="false" "no-frame-pointer-elim-non-leaf"
>>>> "no-infs-fp-math"="false" "no-jump-tables"="false"
>>>> "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false"
>>>> "no-trapping-math"="false" "stack-protector-buffer-size"="8"
>>>> "strict-float-cast-overflow"="false" "target-cpu"="pwr9"
>>>> "target-features"="+altivec,+bpermd,+crypto,+direct-move,+extdiv,+htm,+power8-vector,+power9-vector,+vsx,-qpx"
>>>> "unsafe-fp-math"="false" "use-soft-float"="false" }
>>>> attributes #1 = { nounwind readnone speculatable }
>>>> attributes #2 = { nounwind }
>>>> >>>
>>>>
>>>> Please let me know if there is anything else I need to do.
>>>> Thanks,
>>>> Ali
>>>>
>>>> On Wed, Aug 22, 2018 at 9:33 AM Eric Christopher <echristo at gmail.com>
>>>> wrote:
>>>>
>>>>> Ali is working on it and we'll see if we can get it to you today.
>>>>>
>>>>> -eric
>>>>>
>>>>> On Wed, Aug 22, 2018 at 5:28 AM Nemanja Ivanovic <
>>>>> nemanja.i.ibm at gmail.com> wrote:
>>>>>
>>>>>> Thanks for reverting, Eric.
>>>>>>
>>>>>> I look forward to a test case so we can get this sorted out.
>>>>>> Nemanja
>>>>>>
>>>>>> On Tue, Aug 21, 2018 at 7:03 PM Eric Christopher <echristo at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I went ahead and reverted this here:
>>>>>>>
>>>>>>> Author: Eric Christopher <echristo at gmail.com>
>>>>>>> Date:   Tue Aug 21 18:35:08 2018 +0000
>>>>>>>
>>>>>>>     Temporarily Revert "[PowerPC] Generate Power9 extswsli extend
>>>>>>> sign and shift immediate instruction" due to it causing a compiler crash on
>>>>>>> valid.
>>>>>>>
>>>>>>>     This reverts commit r340016, testcase forthcoming.
>>>>>>>
>>>>>>>     git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340315
>>>>>>> 91177308-0d34-0410-b5e6-96231b3b80d8
>>>>>>>
>>>>>>> after talking with Nemanja.
>>>>>>>
>>>>>>> -eric
>>>>>>>
>>>>>>> On Tue, Aug 21, 2018 at 11:11 AM Eric Christopher <
>>>>>>> echristo at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Nemanja,
>>>>>>>>
>>>>>>>> We're seeing a compiler crash on valid with this patch - while
>>>>>>>> we're working up a testcase would you mind reverting?
>>>>>>>>
>>>>>>>> -eric
>>>>>>>>
>>>>>>>> On Fri, Aug 17, 2018 at 5:36 AM Nemanja Ivanovic via llvm-commits <
>>>>>>>> llvm-commits at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> Author: nemanjai
>>>>>>>>> Date: Fri Aug 17 05:35:44 2018
>>>>>>>>> New Revision: 340016
>>>>>>>>>
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=340016&view=rev
>>>>>>>>> Log:
>>>>>>>>> [PowerPC] Generate Power9 extswsli extend sign and shift immediate
>>>>>>>>> instruction
>>>>>>>>>
>>>>>>>>> Add a DAG combine for the PowerPC code generator to generate the
>>>>>>>>> Power9 extswsli
>>>>>>>>> extend sign and shift immediate instruction.
>>>>>>>>>
>>>>>>>>> Patch by RolandF.
>>>>>>>>>
>>>>>>>>> Differential revision: https://reviews.llvm.org/D49879
>>>>>>>>>
>>>>>>>>> Added:
>>>>>>>>>     llvm/trunk/llvm/
>>>>>>>>>     llvm/trunk/llvm/test/
>>>>>>>>>     llvm/trunk/llvm/test/CodeGen/
>>>>>>>>>     llvm/trunk/llvm/test/CodeGen/PowerPC/
>>>>>>>>>     llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll
>>>>>>>>>     llvm/trunk/test/CodeGen/PowerPC/extswsli.ll
>>>>>>>>> Modified:
>>>>>>>>>     llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>>>>>>>>>     llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h
>>>>>>>>>     llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
>>>>>>>>>     llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td
>>>>>>>>>
>>>>>>>>> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>>>>>>>>> URL:
>>>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=340016&r1=340015&r2=340016&view=diff
>>>>>>>>>
>>>>>>>>> ==============================================================================
>>>>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original)
>>>>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Fri Aug 17
>>>>>>>>> 05:35:44 2018
>>>>>>>>> @@ -1351,6 +1351,7 @@ const char *PPCTargetLowering::getTarget
>>>>>>>>>    case PPCISD::QBFLT:           return "PPCISD::QBFLT";
>>>>>>>>>    case PPCISD::QVLFSb:          return "PPCISD::QVLFSb";
>>>>>>>>>    case PPCISD::BUILD_FP128:     return "PPCISD::BUILD_FP128";
>>>>>>>>> +  case PPCISD::EXTSWSLI:        return "PPCISD::EXTSWSLI";
>>>>>>>>>    }
>>>>>>>>>    return nullptr;
>>>>>>>>>  }
>>>>>>>>> @@ -14102,7 +14103,30 @@ SDValue PPCTargetLowering::combineSHL(SD
>>>>>>>>>    if (auto Value = stripModuloOnShift(*this, N, DCI.DAG))
>>>>>>>>>      return Value;
>>>>>>>>>
>>>>>>>>> -  return SDValue();
>>>>>>>>> +  SDValue N0 = N->getOperand(0);
>>>>>>>>> +  ConstantSDNode *CN1 = dyn_cast<ConstantSDNode>(N->getOperand(1));
>>>>>>>>> +  if (!Subtarget.isISA3_0() ||
>>>>>>>>> +      N0.getOpcode() != ISD::SIGN_EXTEND ||
>>>>>>>>> +      N0.getOperand(0).getValueType() != MVT::i32 ||
>>>>>>>>> +      CN1 == nullptr)
>>>>>>>>> +    return SDValue();
>>>>>>>>> +
>>>>>>>>> +  // We can't save an operation here if the value is already extended, and
>>>>>>>>> +  // the existing shift is easier to combine.
>>>>>>>>> +  SDValue ExtsSrc = N0.getOperand(0);
>>>>>>>>> +  if (ExtsSrc.getOpcode() == ISD::TRUNCATE &&
>>>>>>>>> +      ExtsSrc.getOperand(0).getOpcode() == ISD::AssertSext)
>>>>>>>>> +    return SDValue();
>>>>>>>>> +
>>>>>>>>> +  SDLoc DL(N0);
>>>>>>>>> +  SDValue ShiftBy = SDValue(CN1, 0);
>>>>>>>>> +  // We want the shift amount to be i32 on the extswsli, but the shift could
>>>>>>>>> +  // have an i64.
>>>>>>>>> +  if (ShiftBy.getValueType() == MVT::i64)
>>>>>>>>> +    ShiftBy = DCI.DAG.getConstant(CN1->getZExtValue(), DL, MVT::i32);
>>>>>>>>> +
>>>>>>>>> +  return DCI.DAG.getNode(PPCISD::EXTSWSLI, DL, MVT::i64, N0->getOperand(0),
>>>>>>>>> +                         ShiftBy);
>>>>>>>>>  }
>>>>>>>>>
>>>>>>>>>  SDValue PPCTargetLowering::combineSRA(SDNode *N, DAGCombinerInfo &DCI) const {
>>>>>>>>>
>>>>>>>>> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h
>>>>>>>>> URL:
>>>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h?rev=340016&r1=340015&r2=340016&view=diff
>>>>>>>>>
>>>>>>>>> ==============================================================================
>>>>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h (original)
>>>>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h Fri Aug 17
>>>>>>>>> 05:35:44 2018
>>>>>>>>> @@ -149,6 +149,10 @@ namespace llvm {
>>>>>>>>>        /// For vector types, only the last n bits are used. See vsld.
>>>>>>>>>        SRL, SRA, SHL,
>>>>>>>>>
>>>>>>>>> +      /// EXTSWSLI = The PPC extswsli instruction, which does an extend-sign
>>>>>>>>> +      /// word and shift left immediate.
>>>>>>>>> +      EXTSWSLI,
>>>>>>>>> +
>>>>>>>>>        /// The combination of sra[wd]i and addze used to implemented signed
>>>>>>>>>        /// integer division by a power of 2. The first operand is the dividend,
>>>>>>>>>        /// and the second is the constant shift amount (representing the
>>>>>>>>>
>>>>>>>>> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
>>>>>>>>> URL:
>>>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td?rev=340016&r1=340015&r2=340016&view=diff
>>>>>>>>>
>>>>>>>>> ==============================================================================
>>>>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td (original)
>>>>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td Fri Aug 17
>>>>>>>>> 05:35:44 2018
>>>>>>>>> @@ -717,9 +717,10 @@ defm SRADI  : XSForm_1rc<31, 413, (outs
>>>>>>>>>                           "sradi", "$rA, $rS, $SH", IIC_IntRotateDI,
>>>>>>>>>                           [(set i64:$rA, (sra i64:$rS, (i32 imm:$SH)))]>, isPPC64;
>>>>>>>>>
>>>>>>>>> -defm EXTSWSLI : XSForm_1r<31, 445, (outs g8rc:$rA), (ins g8rc:$rS, u6imm:$SH),
>>>>>>>>> +defm EXTSWSLI : XSForm_1r<31, 445, (outs g8rc:$rA), (ins gprc:$rS, u6imm:$SH),
>>>>>>>>>                            "extswsli", "$rA, $rS, $SH", IIC_IntRotateDI,
>>>>>>>>> -                          []>, isPPC64;
>>>>>>>>> +                          [(set i64:$rA, (PPCextswsli i32:$rS, (i32 imm:$SH)))]>,
>>>>>>>>> +                          isPPC64, Requires<[IsISA3_0]>;
>>>>>>>>>
>>>>>>>>>  // For fast-isel:
>>>>>>>>>  let isCodeGenOnly = 1, Defs = [CARRY] in
>>>>>>>>>
>>>>>>>>> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td
>>>>>>>>> URL:
>>>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td?rev=340016&r1=340015&r2=340016&view=diff
>>>>>>>>>
>>>>>>>>> ==============================================================================
>>>>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td (original)
>>>>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td Fri Aug 17
>>>>>>>>> 05:35:44 2018
>>>>>>>>> @@ -114,6 +114,10 @@ def SDT_PPCqvlfsb : SDTypeProfile<1, 1,
>>>>>>>>>    SDTCisVec<0>, SDTCisPtrTy<1>
>>>>>>>>>  ]>;
>>>>>>>>>
>>>>>>>>> +def SDT_PPCextswsli : SDTypeProfile<1, 2, [  // extswsli
>>>>>>>>> +  SDTCisInt<0>, SDTCisInt<1>, SDTCisOpSmallerThanOp<1, 0>, SDTCisInt<2>
>>>>>>>>> +]>;
>>>>>>>>> +
>>>>>>>>>
>>>>>>>>>  //===----------------------------------------------------------------------===//
>>>>>>>>>  // PowerPC specific DAG Nodes.
>>>>>>>>>  //
>>>>>>>>> @@ -218,6 +222,8 @@ def PPCsrl        : SDNode<"PPCISD::SRL"
>>>>>>>>>  def PPCsra        : SDNode<"PPCISD::SRA"       , SDTIntShiftOp>;
>>>>>>>>>  def PPCshl        : SDNode<"PPCISD::SHL"       , SDTIntShiftOp>;
>>>>>>>>>
>>>>>>>>> +def PPCextswsli : SDNode<"PPCISD::EXTSWSLI" , SDT_PPCextswsli>;
>>>>>>>>> +
>>>>>>>>>  // Move 2 i64 values into a VSX register
>>>>>>>>>  def PPCbuild_fp128: SDNode<"PPCISD::BUILD_FP128",
>>>>>>>>>                             SDTypeProfile<1, 2,
>>>>>>>>>
>>>>>>>>> Added: llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll
>>>>>>>>> URL:
>>>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll?rev=340016&view=auto
>>>>>>>>>
>>>>>>>>> ==============================================================================
>>>>>>>>> --- llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll (added)
>>>>>>>>> +++ llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll Fri Aug 17
>>>>>>>>> 05:35:44 2018
>>>>>>>>> @@ -0,0 +1,17 @@
>>>>>>>>> +; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu \
>>>>>>>>> +; RUN:     -mcpu=pwr9 -ppc-asm-full-reg-names < %s | FileCheck %s
>>>>>>>>> +
>>>>>>>>> +@z = external local_unnamed_addr global i32*, align 8
>>>>>>>>> +
>>>>>>>>> +; Function Attrs: norecurse nounwind readonly
>>>>>>>>> +define signext i32 @_Z2tcii(i32 signext %x, i32 signext %y) local_unnamed_addr #0 {
>>>>>>>>> +entry:
>>>>>>>>> +  %0 = load i32*, i32** @z, align 8
>>>>>>>>> +  %add = add nsw i32 %y, %x
>>>>>>>>> +  %idxprom = sext i32 %add to i64
>>>>>>>>> +  %arrayidx = getelementptr inbounds i32, i32* %0, i64 %idxprom
>>>>>>>>> +  %1 = load i32, i32* %arrayidx, align 4
>>>>>>>>> +  ret i32 %1
>>>>>>>>> +; CHECK-LABEL: @_Z2tcii
>>>>>>>>> +; CHECK: extswsli {{r[0-9]+}}, {{r[0-9]+}}, 2
>>>>>>>>> +}
>>>>>>>>>
>>>>>>>>> Added: llvm/trunk/test/CodeGen/PowerPC/extswsli.ll
>>>>>>>>> URL:
>>>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/extswsli.ll?rev=340016&view=auto
>>>>>>>>>
>>>>>>>>> ==============================================================================
>>>>>>>>> --- llvm/trunk/test/CodeGen/PowerPC/extswsli.ll (added)
>>>>>>>>> +++ llvm/trunk/test/CodeGen/PowerPC/extswsli.ll Fri Aug 17
>>>>>>>>> 05:35:44 2018
>>>>>>>>> @@ -0,0 +1,17 @@
>>>>>>>>> +; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu \
>>>>>>>>> +; RUN:     -mcpu=pwr9 -ppc-asm-full-reg-names < %s | FileCheck %s
>>>>>>>>> +
>>>>>>>>> +@z = external local_unnamed_addr global i32*, align 8
>>>>>>>>> +
>>>>>>>>> +; Function Attrs: norecurse nounwind readonly
>>>>>>>>> +define signext i32 @_Z2tcii(i32 signext %x, i32 signext %y) local_unnamed_addr #0 {
>>>>>>>>> +entry:
>>>>>>>>> +  %0 = load i32*, i32** @z, align 8
>>>>>>>>> +  %add = add nsw i32 %y, %x
>>>>>>>>> +  %idxprom = sext i32 %add to i64
>>>>>>>>> +  %arrayidx = getelementptr inbounds i32, i32* %0, i64 %idxprom
>>>>>>>>> +  %1 = load i32, i32* %arrayidx, align 4
>>>>>>>>> +  ret i32 %1
>>>>>>>>> +; CHECK-LABEL: @_Z2tcii
>>>>>>>>> +; CHECK: extswsli {{r[0-9]+}}, {{r[0-9]+}}, 2
>>>>>>>>> +}
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> llvm-commits mailing list
>>>>>>>>> llvm-commits at lists.llvm.org
>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>>>>
>>>>>>>>