[llvm] r340016 - [PowerPC] Generate Power9 extswsli extend sign and shift immediate instruction
Ali Tamur via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 23 16:45:58 PDT 2018
Hi Nemanja,
Yes it seems to work, when I apply your patch, I no longer get a seg fault.
Thanks,
Ali
On Thu, Aug 23, 2018 at 3:41 PM Nemanja Ivanovic <nemanja.i.ibm at gmail.com>
wrote:
> This is excellent, thank you for providing the test case. I see now what
> the omission is in the original patch. There was an implicit assumption
> that a sign-extend from a 32-bit value would be to a 64-bit value, but that
> of course need not be the case. Would you be able to try your original test
> with the attached patch to make sure the problem goes away before I
> re-commit the patch?
>
> Thank you so much for providing a reduced test case.
>
> Nemanja
>
> On Thu, Aug 23, 2018 at 4:12 PM Ali Tamur <tamur at google.com> wrote:
>
>> Hi,
>> Here is a small ir that causes seg fault only when the llc is compiled
>> with the "[PowerPC] Generate Power9 extswsli extend sign and shift
>> immediate instruction" change.
>>
>> To reproduce:
>> $ llvm-as test_case.ll -o test_case.bc
>> $ opt test_case.bc -ee-instrument -tbaa -scoped-noalias -simplifycfg
>> -sroa -early-cse -lower-expect -forceattrs -tbaa -scoped-noalias
>> -inferattrs -ipsccp -called-value-propagation -globalopt -mem2reg
>> -deadargelim -instcombine -simplifycfg -globals-aa -prune-eh -inline
>> -functionattrs -sroa -early-cse-memssa -speculative-execution
>> -jump-threading -correlated-propagation -simplifycfg -instcombine
>> -libcalls-shrinkwrap -pgo-memop-opt -tailcallelim -simplifycfg -reassociate
>> -loop-rotate -licm -loop-unswitch -simplifycfg -instcombine -indvars
>> -loop-idiom -loop-deletion -loop-unroll -mldst-motion -gvn -memcpyopt -sccp
>> -bdce -instcombine -jump-threading -correlated-propagation -dse -licm -adce
>> -simplifycfg -instcombine -barrier -elim-avail-extern -rpo-functionattrs
>> -globalopt -globaldce -globals-aa -float2int -loop-rotate -loop-distribute
>> -loop-vectorize -loop-load-elim -instcombine -simplifycfg -instcombine
>> -loop-unroll -instcombine -licm -alignment-from-assumptions
>> -strip-dead-prototypes -globaldce -constmerge -loop-sink -instsimplify
>> -div-rem-pairs -simplifycfg
>>
>> >>>
>> ; ModuleID = 'bugpoint-reduced-simplified.bc'
>> source_filename = "experimental/users/tamur/llvm/example.cc"
>> target datalayout = "e-m:e-i64:64-n32:64"
>> target triple = "powerpc64le-grtev4-linux-gnu"
>>
>> ; Function Attrs: nounwind
>> define void @_Test(i32 signext) local_unnamed_addr #0 align 2 {
>> %2 = ashr i32 %0, 31
>> %3 = sext i32 %2 to i64
>> %4 = zext i64 %3 to i128
>> %5 = shl nuw i128 %4, 64
>> %6 = or i128 %5, 0
>> %7 = or i128 undef, 0
>> %8 = mul i128 %7, %6
>> %9 = lshr i128 %8, 64
>> %10 = trunc i128 %9 to i64
>> %11 = load i64, i64* undef, align 8
>> %12 = and i64 %11, %10
>> %13 = call i64 @llvm.bswap.i64(i64 %12) #2
>> store i64 %13, i64* undef, align 8
>> br i1 undef, label %15, label %14
>>
>> ; <label>:14: ; preds = %1
>> br label %15
>>
>> ; <label>:15: ; preds = %14, %1
>> ret void
>> }
>>
>> ; Function Attrs: nounwind readnone speculatable
>> declare i64 @llvm.bswap.i64(i64) #1
>>
>> attributes #0 = { nounwind
>> "correctly-rounded-divide-sqrt-fp-math"="false"
>> "disable-tail-calls"="false" "less-precise-fpmad"="false"
>> "no-frame-pointer-elim"="false" "no-frame-pointer-elim-non-leaf"
>> "no-infs-fp-math"="false" "no-jump-tables"="false"
>> "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false"
>> "no-trapping-math"="false" "stack-protector-buffer-size"="8"
>> "strict-float-cast-overflow"="false" "target-cpu"="pwr9"
>> "target-features"="+altivec,+bpermd,+crypto,+direct-move,+extdiv,+htm,+power8-vector,+power9-vector,+vsx,-qpx"
>> "unsafe-fp-math"="false" "use-soft-float"="false" }
>> attributes #1 = { nounwind readnone speculatable }
>> attributes #2 = { nounwind }
>> >>>
>>
>> Please let me know if there is anything else I need to do.
>> Thanks,
>> Ali
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Aug 22, 2018 at 9:33 AM Eric Christopher <echristo at gmail.com>
>> wrote:
>>
>>> Ali is working on it and we'll see if we can get it to you today.
>>>
>>> -eric
>>>
>>> On Wed, Aug 22, 2018 at 5:28 AM Nemanja Ivanovic <
>>> nemanja.i.ibm at gmail.com> wrote:
>>>
>>>> Thanks for reverting Eric,
>>>>
>>>> I look forward to a test case so we can get this sorted out.
>>>> Nemanja
>>>>
>>>> On Tue, Aug 21, 2018 at 7:03 PM Eric Christopher <echristo at gmail.com>
>>>> wrote:
>>>>
>>>>> I went ahead and reverted this here:
>>>>>
>>>>> Author: Eric Christopher <echristo at gmail.com>
>>>>> Date: Tue Aug 21 18:35:08 2018 +0000
>>>>>
>>>>> Temporarily Revert "[PowerPC] Generate Power9 extswsli extend sign
>>>>> and shift immediate instruction" due to it causing a compiler crash on
>>>>> valid.
>>>>>
>>>>> This reverts commit r340016, testcase forthcoming.
>>>>>
>>>>> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340315
>>>>> 91177308-0d34-0410-b5e6-96231b3b80d8
>>>>>
>>>>> after talking with Nemanja.
>>>>>
>>>>> -eric
>>>>>
>>>>> On Tue, Aug 21, 2018 at 11:11 AM Eric Christopher <echristo at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Nemanja,
>>>>>>
>>>>>> We're seeing a compiler crash on valid with this patch - while we're
>>>>>> working up a testcase would you mind reverting?
>>>>>>
>>>>>> -eric
>>>>>>
>>>>>> On Fri, Aug 17, 2018 at 5:36 AM Nemanja Ivanovic via llvm-commits <
>>>>>> llvm-commits at lists.llvm.org> wrote:
>>>>>>
>>>>>>> Author: nemanjai
>>>>>>> Date: Fri Aug 17 05:35:44 2018
>>>>>>> New Revision: 340016
>>>>>>>
>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=340016&view=rev
>>>>>>> Log:
>>>>>>> [PowerPC] Generate Power9 extswsli extend sign and shift immediate
>>>>>>> instruction
>>>>>>>
>>>>>>> Add a DAG combine for the PowerPC code generator to generate the
>>>>>>> Power9 extswsli
>>>>>>> extend sign and shift immediate instruction.
>>>>>>>
>>>>>>> Patch by RolandF.
>>>>>>>
>>>>>>> Differential revision: https://reviews.llvm.org/D49879
>>>>>>>
>>>>>>> Added:
>>>>>>> llvm/trunk/llvm/
>>>>>>> llvm/trunk/llvm/test/
>>>>>>> llvm/trunk/llvm/test/CodeGen/
>>>>>>> llvm/trunk/llvm/test/CodeGen/PowerPC/
>>>>>>> llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll
>>>>>>> llvm/trunk/test/CodeGen/PowerPC/extswsli.ll
>>>>>>> Modified:
>>>>>>> llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>>>>>>> llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h
>>>>>>> llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
>>>>>>> llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td
>>>>>>>
>>>>>>> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=340016&r1=340015&r2=340016&view=diff
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original)
>>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Fri Aug 17
>>>>>>> 05:35:44 2018
>>>>>>> @@ -1351,6 +1351,7 @@ const char *PPCTargetLowering::getTarget
>>>>>>> case PPCISD::QBFLT: return "PPCISD::QBFLT";
>>>>>>> case PPCISD::QVLFSb: return "PPCISD::QVLFSb";
>>>>>>> case PPCISD::BUILD_FP128: return "PPCISD::BUILD_FP128";
>>>>>>> + case PPCISD::EXTSWSLI: return "PPCISD::EXTSWSLI";
>>>>>>> }
>>>>>>> return nullptr;
>>>>>>> }
>>>>>>> @@ -14102,7 +14103,30 @@ SDValue PPCTargetLowering::combineSHL(SD
>>>>>>> if (auto Value = stripModuloOnShift(*this, N, DCI.DAG))
>>>>>>> return Value;
>>>>>>>
>>>>>>> - return SDValue();
>>>>>>> + SDValue N0 = N->getOperand(0);
>>>>>>> + ConstantSDNode *CN1 = dyn_cast<ConstantSDNode>(N->getOperand(1));
>>>>>>> + if (!Subtarget.isISA3_0() ||
>>>>>>> + N0.getOpcode() != ISD::SIGN_EXTEND ||
>>>>>>> + N0.getOperand(0).getValueType() != MVT::i32 ||
>>>>>>> + CN1 == nullptr)
>>>>>>> + return SDValue();
>>>>>>> +
>>>>>>> + // We can't save an operation here if the value is already
>>>>>>> extended, and
>>>>>>> + // the existing shift is easier to combine.
>>>>>>> + SDValue ExtsSrc = N0.getOperand(0);
>>>>>>> + if (ExtsSrc.getOpcode() == ISD::TRUNCATE &&
>>>>>>> + ExtsSrc.getOperand(0).getOpcode() == ISD::AssertSext)
>>>>>>> + return SDValue();
>>>>>>> +
>>>>>>> + SDLoc DL(N0);
>>>>>>> + SDValue ShiftBy = SDValue(CN1, 0);
>>>>>>> + // We want the shift amount to be i32 on the extswli, but the
>>>>>>> shift could
>>>>>>> + // have an i64.
>>>>>>> + if (ShiftBy.getValueType() == MVT::i64)
>>>>>>> + ShiftBy = DCI.DAG.getConstant(CN1->getZExtValue(), DL,
>>>>>>> MVT::i32);
>>>>>>> +
>>>>>>> + return DCI.DAG.getNode(PPCISD::EXTSWSLI, DL, MVT::i64,
>>>>>>> N0->getOperand(0),
>>>>>>> + ShiftBy);
>>>>>>> }
>>>>>>>
>>>>>>> SDValue PPCTargetLowering::combineSRA(SDNode *N, DAGCombinerInfo
>>>>>>> &DCI) const {
>>>>>>>
>>>>>>> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h?rev=340016&r1=340015&r2=340016&view=diff
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h (original)
>>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.h Fri Aug 17
>>>>>>> 05:35:44 2018
>>>>>>> @@ -149,6 +149,10 @@ namespace llvm {
>>>>>>> /// For vector types, only the last n bits are used. See vsld.
>>>>>>> SRL, SRA, SHL,
>>>>>>>
>>>>>>> + /// EXTSWSLI = The PPC extswsli instruction, which does an
>>>>>>> extend-sign
>>>>>>> + /// word and shift left immediate.
>>>>>>> + EXTSWSLI,
>>>>>>> +
>>>>>>> /// The combination of sra[wd]i and addze used to implemented
>>>>>>> signed
>>>>>>> /// integer division by a power of 2. The first operand is
>>>>>>> the dividend,
>>>>>>> /// and the second is the constant shift amount (representing
>>>>>>> the
>>>>>>>
>>>>>>> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td?rev=340016&r1=340015&r2=340016&view=diff
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td (original)
>>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCInstr64Bit.td Fri Aug 17
>>>>>>> 05:35:44 2018
>>>>>>> @@ -717,9 +717,10 @@ defm SRADI : XSForm_1rc<31, 413, (outs
>>>>>>> "sradi", "$rA, $rS, $SH", IIC_IntRotateDI,
>>>>>>> [(set i64:$rA, (sra i64:$rS, (i32
>>>>>>> imm:$SH)))]>, isPPC64;
>>>>>>>
>>>>>>> -defm EXTSWSLI : XSForm_1r<31, 445, (outs g8rc:$rA), (ins g8rc:$rS,
>>>>>>> u6imm:$SH),
>>>>>>> +defm EXTSWSLI : XSForm_1r<31, 445, (outs g8rc:$rA), (ins gprc:$rS,
>>>>>>> u6imm:$SH),
>>>>>>> "extswsli", "$rA, $rS, $SH",
>>>>>>> IIC_IntRotateDI,
>>>>>>> - []>, isPPC64;
>>>>>>> + [(set i64:$rA, (PPCextswsli i32:$rS, (i32
>>>>>>> imm:$SH)))]>,
>>>>>>> + isPPC64, Requires<[IsISA3_0]>;
>>>>>>>
>>>>>>> // For fast-isel:
>>>>>>> let isCodeGenOnly = 1, Defs = [CARRY] in
>>>>>>>
>>>>>>> Modified: llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td?rev=340016&r1=340015&r2=340016&view=diff
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td (original)
>>>>>>> +++ llvm/trunk/lib/Target/PowerPC/PPCInstrInfo.td Fri Aug 17
>>>>>>> 05:35:44 2018
>>>>>>> @@ -114,6 +114,10 @@ def SDT_PPCqvlfsb : SDTypeProfile<1, 1,
>>>>>>> SDTCisVec<0>, SDTCisPtrTy<1>
>>>>>>> ]>;
>>>>>>>
>>>>>>> +def SDT_PPCextswsli : SDTypeProfile<1, 2, [ // extswsli
>>>>>>> + SDTCisInt<0>, SDTCisInt<1>, SDTCisOpSmallerThanOp<1, 0>,
>>>>>>> SDTCisInt<2>
>>>>>>> +]>;
>>>>>>> +
>>>>>>>
>>>>>>> //===----------------------------------------------------------------------===//
>>>>>>> // PowerPC specific DAG Nodes.
>>>>>>> //
>>>>>>> @@ -218,6 +222,8 @@ def PPCsrl : SDNode<"PPCISD::SRL"
>>>>>>> def PPCsra : SDNode<"PPCISD::SRA" , SDTIntShiftOp>;
>>>>>>> def PPCshl : SDNode<"PPCISD::SHL" , SDTIntShiftOp>;
>>>>>>>
>>>>>>> +def PPCextswsli : SDNode<"PPCISD::EXTSWSLI" , SDT_PPCextswsli>;
>>>>>>> +
>>>>>>> // Move 2 i64 values into a VSX register
>>>>>>> def PPCbuild_fp128: SDNode<"PPCISD::BUILD_FP128",
>>>>>>> SDTypeProfile<1, 2,
>>>>>>>
>>>>>>> Added: llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll?rev=340016&view=auto
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll (added)
>>>>>>> +++ llvm/trunk/llvm/test/CodeGen/PowerPC/extswsli.ll Fri Aug 17
>>>>>>> 05:35:44 2018
>>>>>>> @@ -0,0 +1,17 @@
>>>>>>> +; RUN: llc -verify-machineinstrs
>>>>>>> -mtriple=powerpc64le-unknown-linux-gnu \
>>>>>>> +; RUN: -mcpu=pwr9 -ppc-asm-full-reg-names < %s | FileCheck %s
>>>>>>> +
>>>>>>> + at z = external local_unnamed_addr global i32*, align 8
>>>>>>> +
>>>>>>> +; Function Attrs: norecurse nounwind readonly
>>>>>>> +define signext i32 @_Z2tcii(i32 signext %x, i32 signext %y)
>>>>>>> local_unnamed_addr #0 {
>>>>>>> +entry:
>>>>>>> + %0 = load i32*, i32** @z, align 8
>>>>>>> + %add = add nsw i32 %y, %x
>>>>>>> + %idxprom = sext i32 %add to i64
>>>>>>> + %arrayidx = getelementptr inbounds i32, i32* %0, i64 %idxprom
>>>>>>> + %1 = load i32, i32* %arrayidx, align 4
>>>>>>> + ret i32 %1
>>>>>>> +; CHECK-LABEL: @_Z2tcii
>>>>>>> +; CHECK: extswsli {{r[0-9]+}}, {{r[0-9]+}}, 2
>>>>>>> +}
>>>>>>>
>>>>>>> Added: llvm/trunk/test/CodeGen/PowerPC/extswsli.ll
>>>>>>> URL:
>>>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/extswsli.ll?rev=340016&view=auto
>>>>>>>
>>>>>>> ==============================================================================
>>>>>>> --- llvm/trunk/test/CodeGen/PowerPC/extswsli.ll (added)
>>>>>>> +++ llvm/trunk/test/CodeGen/PowerPC/extswsli.ll Fri Aug 17 05:35:44
>>>>>>> 2018
>>>>>>> @@ -0,0 +1,17 @@
>>>>>>> +; RUN: llc -verify-machineinstrs
>>>>>>> -mtriple=powerpc64le-unknown-linux-gnu \
>>>>>>> +; RUN: -mcpu=pwr9 -ppc-asm-full-reg-names < %s | FileCheck %s
>>>>>>> +
>>>>>>> + at z = external local_unnamed_addr global i32*, align 8
>>>>>>> +
>>>>>>> +; Function Attrs: norecurse nounwind readonly
>>>>>>> +define signext i32 @_Z2tcii(i32 signext %x, i32 signext %y)
>>>>>>> local_unnamed_addr #0 {
>>>>>>> +entry:
>>>>>>> + %0 = load i32*, i32** @z, align 8
>>>>>>> + %add = add nsw i32 %y, %x
>>>>>>> + %idxprom = sext i32 %add to i64
>>>>>>> + %arrayidx = getelementptr inbounds i32, i32* %0, i64 %idxprom
>>>>>>> + %1 = load i32, i32* %arrayidx, align 4
>>>>>>> + ret i32 %1
>>>>>>> +; CHECK-LABEL: @_Z2tcii
>>>>>>> +; CHECK: extswsli {{r[0-9]+}}, {{r[0-9]+}}, 2
>>>>>>> +}
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> llvm-commits mailing list
>>>>>>> llvm-commits at lists.llvm.org
>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>>
>>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180823/2009127e/attachment.html>
More information about the llvm-commits
mailing list