[llvm] r296771 - [PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size

Daniel Jasper via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 7 00:20:12 PST 2017


Just FYI, Tim has reverted this patch and is looking at reducing a test
case. He should have one tomorrow morning.

So far, he found that the first parameter is live but clobbered immediately
in some case.
Assembly:
ld      r3,-31576(r2)

(I don't know what that means, but just wanted to provide as much
information as I can, in case that makes it easy for you to spot the bug)

Cheers,
Danie

On Tue, Mar 7, 2017 at 7:48 AM, Daniel Jasper <djasper at google.com> wrote:

> This seems to be causing some miscompiles. Working on a test case.
>
> On Thu, Mar 2, 2017 at 6:39 PM, Nemanja Ivanovic via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> Author: nemanjai
>> Date: Thu Mar  2 11:38:59 2017
>> New Revision: 296771
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=296771&view=rev
>> Log:
>> [PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack
>> frame size
>>
>> This patch reduces the stack frame size by not allocating the parameter
>> area if
>> it is not required. In the current implementation
>> LowerFormalArguments_64SVR4
>> already handles the parameter area, but LowerCall_64SVR4 does not
>> (when calculating the stack frame size). What this patch does is make
>> LowerCall_64SVR4 consistent with LowerFormalArguments_64SVR4.
>>
>> Committing on behalf of Hiroshi Inoue.
>>
>> Differential Revision: https://reviews.llvm.org/D29881
>>
>> Added:
>>     llvm/trunk/test/CodeGen/PowerPC/stacksize.ll
>> Modified:
>>     llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>>
>> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/Po
>> werPC/PPCISelLowering.cpp?rev=296771&r1=296770&r2=296771&view=diff
>> ============================================================
>> ==================
>> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original)
>> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Thu Mar  2
>> 11:38:59 2017
>> @@ -5147,10 +5147,30 @@ SDValue PPCTargetLowering::LowerCall_64S
>>    };
>>
>>    const unsigned NumGPRs = array_lengthof(GPR);
>> -  const unsigned NumFPRs = 13;
>> +  const unsigned NumFPRs = useSoftFloat() ? 0 : 13;
>>    const unsigned NumVRs  = array_lengthof(VR);
>>    const unsigned NumQFPRs = NumFPRs;
>>
>> +  // On ELFv2, we can avoid allocating the parameter area if all the
>> arguments
>> +  // can be passed to the callee in registers.
>> +  // For the fast calling convention, there is another check below.
>> +  // Note: keep consistent with LowerFormalArguments_64SVR4()
>> +  bool HasParameterArea = !isELFv2ABI || isVarArg || CallConv ==
>> CallingConv::Fast;
>> +  if (!HasParameterArea) {
>> +    unsigned ParamAreaSize = NumGPRs * PtrByteSize;
>> +    unsigned AvailableFPRs = NumFPRs;
>> +    unsigned AvailableVRs = NumVRs;
>> +    unsigned NumBytesTmp = NumBytes;
>> +    for (unsigned i = 0; i != NumOps; ++i) {
>> +      if (Outs[i].Flags.isNest()) continue;
>> +      if (CalculateStackSlotUsed(Outs[i].VT, Outs[i].ArgVT,
>> Outs[i].Flags,
>> +                                PtrByteSize, LinkageSize, ParamAreaSize,
>> +                                NumBytesTmp, AvailableFPRs, AvailableVRs,
>> +                                Subtarget.hasQPX()))
>> +        HasParameterArea = true;
>> +    }
>> +  }
>> +
>>    // When using the fast calling convention, we don't provide backing for
>>    // arguments that will be in registers.
>>    unsigned NumGPRsUsed = 0, NumFPRsUsed = 0, NumVRsUsed = 0;
>> @@ -5218,13 +5238,18 @@ SDValue PPCTargetLowering::LowerCall_64S
>>
>>    unsigned NumBytesActuallyUsed = NumBytes;
>>
>> -  // The prolog code of the callee may store up to 8 GPR argument
>> registers to
>> +  // In the old ELFv1 ABI,
>> +  // the prolog code of the callee may store up to 8 GPR argument
>> registers to
>>    // the stack, allowing va_start to index over them in memory if its
>> varargs.
>>    // Because we cannot tell if this is needed on the caller side, we
>> have to
>>    // conservatively assume that it is needed.  As such, make sure we
>> have at
>>    // least enough stack space for the caller to store the 8 GPRs.
>> -  // FIXME: On ELFv2, it may be unnecessary to allocate the parameter
>> area.
>> -  NumBytes = std::max(NumBytes, LinkageSize + 8 * PtrByteSize);
>> +  // In the ELFv2 ABI, we allocate the parameter area iff a callee
>> +  // really requires memory operands, e.g. a vararg function.
>> +  if (HasParameterArea)
>> +    NumBytes = std::max(NumBytes, LinkageSize + 8 * PtrByteSize);
>> +  else
>> +    NumBytes = LinkageSize;
>>
>>    // Tail call needs the stack to be aligned.
>>    if (getTargetMachine().Options.GuaranteedTailCallOpt &&
>> @@ -5443,6 +5468,8 @@ SDValue PPCTargetLowering::LowerCall_64S
>>          if (CallConv == CallingConv::Fast)
>>            ComputePtrOff();
>>
>> +        assert(HasParameterArea &&
>> +               "Parameter area must exist to pass an argument in
>> memory.");
>>          LowerMemOpCallTo(DAG, MF, Chain, Arg, PtrOff, SPDiff, ArgOffset,
>>                           true, isTailCall, false, MemOpChains,
>>                           TailCallArguments, dl);
>> @@ -5528,6 +5555,8 @@ SDValue PPCTargetLowering::LowerCall_64S
>>            PtrOff = DAG.getNode(ISD::ADD, dl, PtrVT, PtrOff, ConstFour);
>>          }
>>
>> +        assert(HasParameterArea &&
>> +               "Parameter area must exist to pass an argument in
>> memory.");
>>          LowerMemOpCallTo(DAG, MF, Chain, Arg, PtrOff, SPDiff, ArgOffset,
>>                           true, isTailCall, false, MemOpChains,
>>                           TailCallArguments, dl);
>> @@ -5562,6 +5591,8 @@ SDValue PPCTargetLowering::LowerCall_64S
>>        // GPRs when within range.  For now, we always put the value in
>> both
>>        // locations (or even all three).
>>        if (isVarArg) {
>> +        assert(HasParameterArea &&
>> +               "Parameter area must exist if we have a varargs call.");
>>          // We could elide this store in the case where the object fits
>>          // entirely in R registers.  Maybe later.
>>          SDValue Store =
>> @@ -5594,6 +5625,8 @@ SDValue PPCTargetLowering::LowerCall_64S
>>          if (CallConv == CallingConv::Fast)
>>            ComputePtrOff();
>>
>> +        assert(HasParameterArea &&
>> +               "Parameter area must exist to pass an argument in
>> memory.");
>>          LowerMemOpCallTo(DAG, MF, Chain, Arg, PtrOff, SPDiff, ArgOffset,
>>                           true, isTailCall, true, MemOpChains,
>>                           TailCallArguments, dl);
>> @@ -5614,6 +5647,8 @@ SDValue PPCTargetLowering::LowerCall_64S
>>      case MVT::v4i1: {
>>        bool IsF32 = Arg.getValueType().getSimpleVT().SimpleTy ==
>> MVT::v4f32;
>>        if (isVarArg) {
>> +        assert(HasParameterArea &&
>> +               "Parameter area must exist if we have a varargs call.");
>>          // We could elide this store in the case where the object fits
>>          // entirely in R registers.  Maybe later.
>>          SDValue Store =
>> @@ -5646,6 +5681,8 @@ SDValue PPCTargetLowering::LowerCall_64S
>>          if (CallConv == CallingConv::Fast)
>>            ComputePtrOff();
>>
>> +        assert(HasParameterArea &&
>> +               "Parameter area must exist to pass an argument in
>> memory.");
>>          LowerMemOpCallTo(DAG, MF, Chain, Arg, PtrOff, SPDiff, ArgOffset,
>>                           true, isTailCall, true, MemOpChains,
>>                           TailCallArguments, dl);
>> @@ -5660,7 +5697,8 @@ SDValue PPCTargetLowering::LowerCall_64S
>>      }
>>    }
>>
>> -  assert(NumBytesActuallyUsed == ArgOffset);
>> +  assert((!HasParameterArea || NumBytesActuallyUsed == ArgOffset) &&
>> +         "mismatch in size of parameter area");
>>    (void)NumBytesActuallyUsed;
>>
>>    if (!MemOpChains.empty())
>>
>> Added: llvm/trunk/test/CodeGen/PowerPC/stacksize.ll
>> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/
>> PowerPC/stacksize.ll?rev=296771&view=auto
>> ============================================================
>> ==================
>> --- llvm/trunk/test/CodeGen/PowerPC/stacksize.ll (added)
>> +++ llvm/trunk/test/CodeGen/PowerPC/stacksize.ll Thu Mar  2 11:38:59 2017
>> @@ -0,0 +1,86 @@
>> +; For ELFv2 ABI, we can avoid allocating the parameter area in the stack
>> frame of the caller function
>> +; if all the arguments can be passed to the callee in registers.
>> +; For ELFv1 ABI, we always need to allocate the parameter area.
>> +
>> +; Tests for ELFv2 ABI
>> +; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu
>> -target-abi elfv2 < %s | FileCheck %s -check-prefix=PPC64-ELFV2
>> +; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu
>> -target-abi elfv2 < %s | FileCheck %s -check-prefix=PPC64-ELFV2
>> +
>> +; Tests for ELFv1 ABI
>> +; RUN: llc -verify-machineinstrs -mtriple=powerpc64le-unknown-linux-gnu
>> -target-abi elfv1 < %s | FileCheck %s -check-prefix=PPC64-ELFV1
>> +; RUN: llc -verify-machineinstrs -mtriple=powerpc64-unknown-linux-gnu
>> -target-abi elfv1 < %s | FileCheck %s -check-prefix=PPC64-ELFV1
>> +
>> +; If the callee has at most eight integer args, parameter area can be
>> ommited for ELFv2 ABI.
>> +
>> +; PPC64-ELFV2-LABEL: WithoutParamArea1:
>> +; PPC64-ELFV2-NOT: stw {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV2: stdu 1, -32(1)
>> +; PPC64-ELFV2: addi 1, 1, 32
>> +; PPC64-ELFV2-NOT: lwz {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV1-LABEL: WithoutParamArea1:
>> +; PPC64-ELFV1-NOT: stw {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV1: stdu 1, -112(1)
>> +; PPC64-ELFV1: addi 1, 1, 112
>> +; PPC64-ELFV1-NOT: lwz {{[0-9]+}}, -{{[0-9]+}}(1)
>> +define signext i32 @WithoutParamArea1(i32 signext %a) local_unnamed_addr
>> #0 {
>> +entry:
>> +  %call = tail call signext i32 @onearg(i32 signext %a) #2
>> +  ret i32 %call
>> +}
>> +
>> +; PPC64-ELFV2-LABEL: WithoutParamArea2:
>> +; PPC64-ELFV2-NOT: stw {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV2: stdu 1, -32(1)
>> +; PPC64-ELFV2: addi 1, 1, 32
>> +; PPC64-ELFV2-NOT: lwz {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV1-LABEL: WithoutParamArea2:
>> +; PPC64-ELFV1-NOT: stw {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV1: stdu 1, -112(1)
>> +; PPC64-ELFV1: addi 1, 1, 112
>> +; PPC64-ELFV1-NOT: lwz {{[0-9]+}}, -{{[0-9]+}}(1)
>> +define signext i32 @WithoutParamArea2(i32 signext %a) local_unnamed_addr
>> #0 {
>> +entry:
>> +  %call = tail call signext i32 @eightargs(i32 signext %a, i32 signext
>> %a, i32 signext %a, i32 signext %a, i32 signext %a, i32 signext %a, i32
>> signext %a, i32 signext %a) #2
>> +  ret i32 %call
>> +}
>> +
>> +; If the callee has more than eight integer args or variable number of
>> args,
>> +; parameter area cannot be ommited even for ELFv2 ABI
>> +
>> +; PPC64-ELFV2-LABEL: WithParamArea1:
>> +; PPC64-ELFV2-NOT: stw {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV2: stdu 1, -96(1)
>> +; PPC64-ELFV2: addi 1, 1, 96
>> +; PPC64-ELFV2-NOT: lwz {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV1-LABEL: WithParamArea1:
>> +; PPC64-ELFV1-NOT: stw {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV1: stdu 1, -112(1)
>> +; PPC64-ELFV1: addi 1, 1, 112
>> +; PPC64-ELFV1-NOT: lwz {{[0-9]+}}, -{{[0-9]+}}(1)
>> +define signext i32 @WithParamArea1(i32 signext %a) local_unnamed_addr #0
>> {
>> +entry:
>> +  %call = tail call signext i32 (i32, ...) @varargs(i32 signext %a, i32
>> signext %a) #2
>> +  ret i32 %call
>> +}
>> +
>> +; PPC64-ELFV2-LABEL: WithParamArea2:
>> +; PPC64-ELFV2-NOT: stw {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV2: stdu 1, -112(1)
>> +; PPC64-ELFV2: addi 1, 1, 112
>> +; PPC64-ELFV2-NOT: lwz {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV1-LABEL: WithParamArea2:
>> +; PPC64-ELFV1-NOT: stw {{[0-9]+}}, -{{[0-9]+}}(1)
>> +; PPC64-ELFV1: stdu 1, -128(1)
>> +; PPC64-ELFV1: addi 1, 1, 128
>> +; PPC64-ELFV1-NOT: lwz {{[0-9]+}}, -{{[0-9]+}}(1)
>> +define signext i32 @WithParamArea2(i32 signext %a) local_unnamed_addr #0
>> {
>> +entry:
>> +  %call = tail call signext i32 @nineargs(i32 signext %a, i32 signext
>> %a, i32 signext %a, i32 signext %a, i32 signext %a, i32 signext %a, i32
>> signext %a, i32 signext %a, i32 signext %a) #2
>> +  ret i32 %call
>> +}
>> +
>> +declare signext i32 @onearg(i32 signext) local_unnamed_addr #1
>> +declare signext i32 @eightargs(i32 signext, i32 signext, i32 signext,
>> i32 signext, i32 signext, i32 signext, i32 signext, i32 signext)
>> local_unnamed_addr #1
>> +declare signext i32 @nineargs(i32 signext, i32 signext, i32 signext, i32
>> signext, i32 signext, i32 signext, i32 signext, i32 signext, i32 signext)
>> local_unnamed_addr #1
>> +declare signext i32 @varargs(i32 signext, ...) local_unnamed_addr #1
>> +
>>
>>
>> _______________________________________________
>> llvm-commits mailing list
>> llvm-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170307/8f54b093/attachment.html>


More information about the llvm-commits mailing list