[llvm] r296771 - [PowerPC][ELFv2ABI] Allocate parameter area on-demand to reduce stack frame size
Hal Finkel via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 8 03:15:14 PST 2017
On 03/07/2017 08:43 PM, Tim Shen via llvm-commits wrote:
> On Tue, Mar 7, 2017 at 6:21 PM Tim Shen <timshen at google.com
> <mailto:timshen at google.com>> wrote:
>
> I found the problem. In the patch, there is a check for isVarArg:
> bool HasParameterArea = !isELFv2ABI || isVarArg || CallConv ==
> CallingConv::Fast;
>
> In our internal case, the callee is indeed a var arg function, but
> the call is through a function pointer. I think this optimization
> should be off for indirect call as well.
>
>
> Oops, we had a C cast from a var arg function pointer into a
> non-var-arg function pointer in the source. We deserve the UB :/.
Do you think that we should make Clang warn on this case? A user might
not realize that on some targets this is an ABI-incompatible change.
-Hal
>
> For this failure it's not the patch's fault. I'll revert the revert
> and test it again.
>
> Sorry for the disturbance!
>
>
> On Tue, Mar 7, 2017 at 2:37 PM Tim Shen <timshen at google.com
> <mailto:timshen at google.com>> wrote:
>
> Update:
>
> The failure mode is that, malloc(10) returns a bogus address,
> like 0x3fff9cc60428. Normal addresses are like 0x1003a00b980.
> Here malloc uses TCMalloc.
>
> Still looking.
>
> On Tue, Mar 7, 2017 at 9:40 AM Tim Shen <timshen at google.com
> <mailto:timshen at google.com>> wrote:
>
> In an internal function, I found the a very early
> instruction modifies the first parameter (r3):
> ld r3,-31576(r2)
>
> which immediately clobbers r3. Succeeding operations on r3
> causes a crash.
>
> My guess would be this function assumes the parameters are
> on the stack, while the caller doesn't assume so.
>
> Still investigating.
>
> On Tue, Mar 7, 2017 at 1:59 AM Nemanja Ivanovic via
> llvm-commits <llvm-commits at lists.llvm.org
> <mailto:llvm-commits at lists.llvm.org>> wrote:
>
> Thanks Daniel and Tim,
>
> It's certainly a curious thing that this occurs. I am
> just curious, does this happen at noopt or with
> optimization? The only thing I can think of is that
> maybe this stack frame reduction isn't valid if
> mem2reg isn't run. But this is more or less a total
> shot in the dark.
>
> Nemanja.
>
> On Tue, Mar 7, 2017 at 9:20 AM, Daniel Jasper
> <djasper at google.com <mailto:djasper at google.com>> wrote:
>
> Just FYI, Tim has reverted this patch and is
> looking at reducing a test case. He should have
> one tomorrow morning.
>
> So far, he found that the first parameter is live
> but clobbered immediately in some case.
> Assembly:
> ld r3,-31576(r2)
>
> (I don't know what that means, but just wanted to
> provide as much information as I can, in case that
> makes it easy for you to spot the bug)
>
> Cheers,
> Danie
>
> On Tue, Mar 7, 2017 at 7:48 AM, Daniel Jasper
> <djasper at google.com <mailto:djasper at google.com>>
> wrote:
>
> This seems to be causing some miscompiles.
> Working on a test case.
>
> On Thu, Mar 2, 2017 at 6:39 PM, Nemanja
> Ivanovic via llvm-commits
> <llvm-commits at lists.llvm.org
> <mailto:llvm-commits at lists.llvm.org>> wrote:
>
> Author: nemanjai
> Date: Thu Mar 2 11:38:59 2017
> New Revision: 296771
>
> URL:
> http://llvm.org/viewvc/llvm-project?rev=296771&view=rev
> Log:
> [PowerPC][ELFv2ABI] Allocate parameter
> area on-demand to reduce stack frame size
>
> This patch reduces the stack frame size by
> not allocating the parameter area if
> it is not required. In the current
> implementation LowerFormalArguments_64SVR4
> already handles the parameter area, but
> LowerCall_64SVR4 does not
> (when calculating the stack frame size).
> What this patch does is make
> LowerCall_64SVR4 consistent with
> LowerFormalArguments_64SVR4.
>
> Committing on behalf of Hiroshi Inoue.
>
> Differential Revision:
> https://reviews.llvm.org/D29881
>
> Added:
> llvm/trunk/test/CodeGen/PowerPC/stacksize.ll
> Modified:
> llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>
> Modified:
> llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=296771&r1=296770&r2=296771&view=diff
> ==============================================================================
> ---
> llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> (original)
> +++
> llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> Thu Mar 2 11:38:59 2017
> @@ -5147,10 +5147,30 @@ SDValue
> PPCTargetLowering::LowerCall_64S
> };
>
> const unsigned NumGPRs =
> array_lengthof(GPR);
> - const unsigned NumFPRs = 13;
> + const unsigned NumFPRs = useSoftFloat()
> ? 0 : 13;
> const unsigned NumVRs =
> array_lengthof(VR);
> const unsigned NumQFPRs = NumFPRs;
>
> + // On ELFv2, we can avoid allocating
> the parameter area if all the arguments
> + // can be passed to the callee in
> registers.
> + // For the fast calling convention,
> there is another check below.
> + // Note: keep consistent with
> LowerFormalArguments_64SVR4()
> + bool HasParameterArea = !isELFv2ABI ||
> isVarArg || CallConv == CallingConv::Fast;
> + if (!HasParameterArea) {
> + unsigned ParamAreaSize = NumGPRs *
> PtrByteSize;
> + unsigned AvailableFPRs = NumFPRs;
> + unsigned AvailableVRs = NumVRs;
> + unsigned NumBytesTmp = NumBytes;
> + for (unsigned i = 0; i != NumOps; ++i) {
> + if (Outs[i].Flags.isNest()) continue;
> + if
> (CalculateStackSlotUsed(Outs[i].VT,
> Outs[i].ArgVT, Outs[i].Flags,
> + PtrByteSize, LinkageSize, ParamAreaSize,
> + NumBytesTmp, AvailableFPRs, AvailableVRs,
> + Subtarget.hasQPX()))
> + HasParameterArea = true;
> + }
> + }
> +
> // When using the fast calling
> convention, we don't provide backing for
> // arguments that will be in registers.
> unsigned NumGPRsUsed = 0, NumFPRsUsed =
> 0, NumVRsUsed = 0;
> @@ -5218,13 +5238,18 @@ SDValue
> PPCTargetLowering::LowerCall_64S
>
> unsigned NumBytesActuallyUsed = NumBytes;
>
> - // The prolog code of the callee may
> store up to 8 GPR argument registers to
> + // In the old ELFv1 ABI,
> + // the prolog code of the callee may
> store up to 8 GPR argument registers to
> // the stack, allowing va_start to
> index over them in memory if its varargs.
> // Because we cannot tell if this is
> needed on the caller side, we have to
> // conservatively assume that it is
> needed. As such, make sure we have at
> // least enough stack space for the
> caller to store the 8 GPRs.
> - // FIXME: On ELFv2, it may be
> unnecessary to allocate the parameter area.
> - NumBytes = std::max(NumBytes,
> LinkageSize + 8 * PtrByteSize);
> + // In the ELFv2 ABI, we allocate the
> parameter area iff a callee
> + // really requires memory operands,
> e.g. a vararg function.
> + if (HasParameterArea)
> + NumBytes = std::max(NumBytes,
> LinkageSize + 8 * PtrByteSize);
> + else
> + NumBytes = LinkageSize;
>
> // Tail call needs the stack to be aligned.
> if
> (getTargetMachine().Options.GuaranteedTailCallOpt
> &&
> @@ -5443,6 +5468,8 @@ SDValue
> PPCTargetLowering::LowerCall_64S
> if (CallConv == CallingConv::Fast)
> ComputePtrOff();
>
> + assert(HasParameterArea &&
> + "Parameter area must exist to pass an
> argument in memory.");
> LowerMemOpCallTo(DAG, MF, Chain, Arg,
> PtrOff, SPDiff, ArgOffset,
> true, isTailCall, false, MemOpChains,
> TailCallArguments, dl);
> @@ -5528,6 +5555,8 @@ SDValue
> PPCTargetLowering::LowerCall_64S
> PtrOff = DAG.getNode(ISD::ADD,
> dl, PtrVT, PtrOff, ConstFour);
> }
>
> + assert(HasParameterArea &&
> + "Parameter area must exist to pass an
> argument in memory.");
> LowerMemOpCallTo(DAG, MF, Chain, Arg,
> PtrOff, SPDiff, ArgOffset,
> true, isTailCall, false, MemOpChains,
> TailCallArguments, dl);
> @@ -5562,6 +5591,8 @@ SDValue
> PPCTargetLowering::LowerCall_64S
> // GPRs when within range. For
> now, we always put the value in both
> // locations (or even all three).
> if (isVarArg) {
> + assert(HasParameterArea &&
> + "Parameter area must exist if we have a
> varargs call.");
> // We could elide this store in
> the case where the object fits
> // entirely in R registers. Maybe
> later.
> SDValue Store =
> @@ -5594,6 +5625,8 @@ SDValue
> PPCTargetLowering::LowerCall_64S
> if (CallConv == CallingConv::Fast)
> ComputePtrOff();
>
> + assert(HasParameterArea &&
> + "Parameter area must exist to pass an
> argument in memory.");
> LowerMemOpCallTo(DAG, MF, Chain, Arg,
> PtrOff, SPDiff, ArgOffset,
> true, isTailCall, true, MemOpChains,
> TailCallArguments, dl);
> @@ -5614,6 +5647,8 @@ SDValue
> PPCTargetLowering::LowerCall_64S
> case MVT::v4i1: {
> bool IsF32 =
> Arg.getValueType().getSimpleVT().SimpleTy
> == MVT::v4f32;
> if (isVarArg) {
> + assert(HasParameterArea &&
> + "Parameter area must exist if we have a
> varargs call.");
> // We could elide this store in
> the case where the object fits
> // entirely in R registers. Maybe
> later.
> SDValue Store =
> @@ -5646,6 +5681,8 @@ SDValue
> PPCTargetLowering::LowerCall_64S
> if (CallConv == CallingConv::Fast)
> ComputePtrOff();
>
> + assert(HasParameterArea &&
> + "Parameter area must exist to pass an
> argument in memory.");
> LowerMemOpCallTo(DAG, MF, Chain, Arg,
> PtrOff, SPDiff, ArgOffset,
> true, isTailCall, true, MemOpChains,
> TailCallArguments, dl);
> @@ -5660,7 +5697,8 @@ SDValue
> PPCTargetLowering::LowerCall_64S
> }
> }
>
> - assert(NumBytesActuallyUsed == ArgOffset);
> + assert((!HasParameterArea ||
> NumBytesActuallyUsed == ArgOffset) &&
> + "mismatch in size of parameter
> area");
> (void)NumBytesActuallyUsed;
>
> if (!MemOpChains.empty())
>
> Added:
> llvm/trunk/test/CodeGen/PowerPC/stacksize.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/stacksize.ll?rev=296771&view=auto
> ==============================================================================
> ---
> llvm/trunk/test/CodeGen/PowerPC/stacksize.ll
> (added)
> +++
> llvm/trunk/test/CodeGen/PowerPC/stacksize.ll
> Thu Mar 2 11:38:59 2017
> @@ -0,0 +1,86 @@
> +; For ELFv2 ABI, we can avoid allocating
> the parameter area in the stack frame of
> the caller function
> +; if all the arguments can be passed to
> the callee in registers.
> +; For ELFv1 ABI, we always need to
> allocate the parameter area.
> +
> +; Tests for ELFv2 ABI
> +; RUN: llc -verify-machineinstrs
> -mtriple=powerpc64le-unknown-linux-gnu
> -target-abi elfv2 < %s | FileCheck %s
> -check-prefix=PPC64-ELFV2
> +; RUN: llc -verify-machineinstrs
> -mtriple=powerpc64-unknown-linux-gnu
> -target-abi elfv2 < %s | FileCheck %s
> -check-prefix=PPC64-ELFV2
> +
> +; Tests for ELFv1 ABI
> +; RUN: llc -verify-machineinstrs
> -mtriple=powerpc64le-unknown-linux-gnu
> -target-abi elfv1 < %s | FileCheck %s
> -check-prefix=PPC64-ELFV1
> +; RUN: llc -verify-machineinstrs
> -mtriple=powerpc64-unknown-linux-gnu
> -target-abi elfv1 < %s | FileCheck %s
> -check-prefix=PPC64-ELFV1
> +
> +; If the callee has at most eight integer
> args, parameter area can be ommited for
> ELFv2 ABI.
> +
> +; PPC64-ELFV2-LABEL: WithoutParamArea1:
> +; PPC64-ELFV2-NOT: stw {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV2: stdu 1, -32(1)
> +; PPC64-ELFV2: addi 1, 1, 32
> +; PPC64-ELFV2-NOT: lwz {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV1-LABEL: WithoutParamArea1:
> +; PPC64-ELFV1-NOT: stw {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV1: stdu 1, -112(1)
> +; PPC64-ELFV1: addi 1, 1, 112
> +; PPC64-ELFV1-NOT: lwz {{[0-9]+}},
> -{{[0-9]+}}(1)
> +define signext i32 @WithoutParamArea1(i32
> signext %a) local_unnamed_addr #0 {
> +entry:
> + %call = tail call signext i32
> @onearg(i32 signext %a) #2
> + ret i32 %call
> +}
> +
> +; PPC64-ELFV2-LABEL: WithoutParamArea2:
> +; PPC64-ELFV2-NOT: stw {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV2: stdu 1, -32(1)
> +; PPC64-ELFV2: addi 1, 1, 32
> +; PPC64-ELFV2-NOT: lwz {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV1-LABEL: WithoutParamArea2:
> +; PPC64-ELFV1-NOT: stw {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV1: stdu 1, -112(1)
> +; PPC64-ELFV1: addi 1, 1, 112
> +; PPC64-ELFV1-NOT: lwz {{[0-9]+}},
> -{{[0-9]+}}(1)
> +define signext i32 @WithoutParamArea2(i32
> signext %a) local_unnamed_addr #0 {
> +entry:
> + %call = tail call signext i32
> @eightargs(i32 signext %a, i32 signext %a,
> i32 signext %a, i32 signext %a, i32
> signext %a, i32 signext %a, i32 signext
> %a, i32 signext %a) #2
> + ret i32 %call
> +}
> +
> +; If the callee has more than eight
> integer args or variable number of args,
> +; parameter area cannot be ommited even
> for ELFv2 ABI
> +
> +; PPC64-ELFV2-LABEL: WithParamArea1:
> +; PPC64-ELFV2-NOT: stw {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV2: stdu 1, -96(1)
> +; PPC64-ELFV2: addi 1, 1, 96
> +; PPC64-ELFV2-NOT: lwz {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV1-LABEL: WithParamArea1:
> +; PPC64-ELFV1-NOT: stw {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV1: stdu 1, -112(1)
> +; PPC64-ELFV1: addi 1, 1, 112
> +; PPC64-ELFV1-NOT: lwz {{[0-9]+}},
> -{{[0-9]+}}(1)
> +define signext i32 @WithParamArea1(i32
> signext %a) local_unnamed_addr #0 {
> +entry:
> + %call = tail call signext i32 (i32,
> ...) @varargs(i32 signext %a, i32 signext
> %a) #2
> + ret i32 %call
> +}
> +
> +; PPC64-ELFV2-LABEL: WithParamArea2:
> +; PPC64-ELFV2-NOT: stw {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV2: stdu 1, -112(1)
> +; PPC64-ELFV2: addi 1, 1, 112
> +; PPC64-ELFV2-NOT: lwz {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV1-LABEL: WithParamArea2:
> +; PPC64-ELFV1-NOT: stw {{[0-9]+}},
> -{{[0-9]+}}(1)
> +; PPC64-ELFV1: stdu 1, -128(1)
> +; PPC64-ELFV1: addi 1, 1, 128
> +; PPC64-ELFV1-NOT: lwz {{[0-9]+}},
> -{{[0-9]+}}(1)
> +define signext i32 @WithParamArea2(i32
> signext %a) local_unnamed_addr #0 {
> +entry:
> + %call = tail call signext i32
> @nineargs(i32 signext %a, i32 signext %a,
> i32 signext %a, i32 signext %a, i32
> signext %a, i32 signext %a, i32 signext
> %a, i32 signext %a, i32 signext %a) #2
> + ret i32 %call
> +}
> +
> +declare signext i32 @onearg(i32 signext)
> local_unnamed_addr #1
> +declare signext i32 @eightargs(i32
> signext, i32 signext, i32 signext, i32
> signext, i32 signext, i32 signext, i32
> signext, i32 signext) local_unnamed_addr #1
> +declare signext i32 @nineargs(i32
> signext, i32 signext, i32 signext, i32
> signext, i32 signext, i32 signext, i32
> signext, i32 signext, i32 signext)
> local_unnamed_addr #1
> +declare signext i32 @varargs(i32 signext,
> ...) local_unnamed_addr #1
> +
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> <mailto:llvm-commits at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> <mailto:llvm-commits at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
--
Hal Finkel
Lead, Compiler Technology and Programming Languages
Leadership Computing Facility
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170308/4c33e7eb/attachment.html>
More information about the llvm-commits
mailing list