[llvm] r289638 - [PowerPC] Fix logic dealing with nop after calls (and tail-call eligibility)

Chandler Carruth via llvm-commits llvm-commits at lists.llvm.org
Thu Dec 15 23:32:58 PST 2016


Hey Hal, as discussed on IRC, this is causing a miscompile on PPC.

It appears that this miscompiles non-virtual thunk (trampoline in PPC
parlance I think?) in some cases with shared objects.

Here is the assembly of the thunk we think is leading to a segfault:

_ZThn8_<snip>:
.Lfunc_begin260:
        .loc    7 0 0
        .cfi_startproc
.Lfunc_gep260:
        addis 2, 12, .TOC.-.Lfunc_gep260 at ha
        addi 2, 2, .TOC.-.Lfunc_gep260 at l
.Lfunc_lep260:
        .localentry     _ZThn8_<snip>, .Lfunc_lep260-.Lfunc_gep260
# BB#0:
        mr 4, 6
        #DEBUG_VALUE: this <- [%X1+-16]
        #DEBUG_VALUE: delay <- [%X1+-24]
        #DEBUG_VALUE: c <- [%X1+-32]
        #DEBUG_VALUE: priority <- [%X1+-36]
        #DEBUG_VALUE: key <- [%X1+-48]
        std 3, -16(1)
        addi 3, 1, -24
        stxsdx 1, 0, 3
        std 5, -32(1)
        stw 4, -36(1)
        std 7, -48(1)
.Ltmp1549:
        ld 5, -16(1)
        addi 5, 5, -8
        lxsdx 1, 0, 3
        ld 3, -32(1)
        lwz 4, -36(1)
                                        # implicit-def: %X6
        mr 6, 4
        clrldi   6, 6, 32
        stggd 3, -56(1)                   # 8-byte Folded Spill
        mr 3, 5
        ld 5, -56(1)                    # 8-byte Folded Reload
        b _Z<snip>
        #TC_RETURNd8 _Z<snip> 0
.Ltmp1550:


And this is the IR for that trampoline:

define void @_ZThn8_<snip>(%"<snip>"*, double, %class.<snip>*, i32
zeroext, %"<snip>"*) {
  %6 = alloca %"<snip>"*, align 8
  %7 = alloca double, align 8
  %8 = alloca %class.<snip>*, align 8
  %9 = alloca i32, align 4
  %10 = alloca %"<snip>"*, align 8
  store %"<snip>"* %0, %"<snip>"** %6, align 8
  call void @llvm.dbg.declare(metadata %"<snip>"** %6, metadata
!23009, metadata !16176), !dbg !23010
  store double %1, double* %7, align 8
  call void @llvm.dbg.declare(metadata double* %7, metadata !23011,
metadata !16176), !dbg !23012
  store %class.<snip>* %2, %class.<snip>** %8, align 8
  call void @llvm.dbg.declare(metadata %class.<snip>** %8, metadata
!23013, metadata !16176), !dbg !23014
  store i32 %3, i32* %9, align 4
  call void @llvm.dbg.declare(metadata i32* %9, metadata !23015,
metadata !16176), !dbg !23016
  store %"<snip>"* %4, %"<snip>"** %10, align 8
  call void @llvm.dbg.declare(metadata %"<snip>"** %10, metadata
!23017, metadata !16176), !dbg !23018
  %11 = load %"<snip>"*, %"<snip>"** %6, align 8, !dbg !23010
  %12 = bitcast %"<snip>"* %11 to i8*, !dbg !23010
  %13 = getelementptr inbounds i8, i8* %12, i64 -8, !dbg !23010
  %14 = bitcast i8* %13 to %"<snip>"*, !dbg !23010
  %15 = load double, double* %7, align 8, !dbg !23010
  %16 = load %class.<snip>*, %class.<snip>** %8, align 8, !dbg !23010
  %17 = load i32, i32* %9, align 4, !dbg !23010
  %18 = load %"<snip>"*, %"<snip>"** %10, align 8, !dbg !23010
  tail call void @_<snip>(%"<snip>"* %14, double %15, %class.<snip>*
%16, i32 zeroext %17, %"<snip>"* %18), !dbg !23010
  ret void, !dbg !23010
}


>From the analysis Kyle did it appears to be corrupting the saved r2 in the
caller of the trampoline (which calls it indirectly through a vtable) thus
causing a corrupt r2 to be restored afterward.

If this isn't enough to understand the bug Kyle shoudl be able to help dig
into a more refined test case on Monday at the latest.

Based on our discussion on IRC, I'm reverting until we sort this out as it
is blocking us using LLVM on PPC right now.

On Tue, Dec 13, 2016 at 11:35 PM Hal Finkel via llvm-commits <
llvm-commits at lists.llvm.org> wrote:

> Author: hfinkel
> Date: Wed Dec 14 01:24:50 2016
> New Revision: 289638
>
> URL: http://llvm.org/viewvc/llvm-project?rev=289638&view=rev
> Log:
> [PowerPC] Fix logic dealing with nop after calls (and tail-call
> eligibility)
>
> This change aims to unify and correct our logic for when we need to allow
> for
> the possibility of the linker adding a TOC restoration instruction after a
> call. This comes up in two contexts:
>
>  1. When determining tail-call eligibility. If we make a tail call (i.e.
>     directly branch to a function) then there is no place for the linker
> to add
>     a TOC restoration.
>  2. When determining when we need to add a nop instruction after a call.
>     Likewise, if there is a possibility that the linker might need to add a
>     TOC restoration after a call, then we need to put a nop after the call
>     (the bl instruction).
>
> First problem: We were using similar, but different, logic to decide (1)
> and
> (2). This is just wrong. Both the resideInSameModule function (used when
> determining tail-call eligibility) and the isLocalCall function (used when
> deciding if the post-call nop is needed) were supposed to be determining
> the
> same underlying fact (i.e. might a TOC restoration be needed after the
> call).
> The same logic should be used in both places.
>
> Second problem: The logic in both places was wrong. We only know that two
> functions will share the same TOC when both functions come from the same
> section of the same object. Otherwise the linker might cause the functions
> to
> use different TOC base addresses (unless the multi-TOC linker option is
> disabled, in which case only shared-library boundaries are relevant).
> There are
> a number of factors that can cause functions to be placed in different
> sections
> or come from different objects (-ffunction-sections, explicitly-specified
> section names, COMDAT, weak linkage, etc.). All of these need to be
> checked.
> The existing logic only checked properties of the callee, but the
> properties of
> the caller must also be checked (for example, calling from a function in a
> COMDAT section means calling between sections).
>
> There was a conceptual error in the resideInSameModule function in that it
> allowed tail calls to functions with weak linkage and protected/hidden
> visibility. While protected/hidden visibility does prevent the function
> implementation from being replaced at runtime (via interposition), it does
> not
> prevent the linker from using an alternate implementation at link time
> (i.e.
> using some strong definition to replace the provided weak one during
> linking).
> If this happens, then we're still potentially looking at a required TOC
> restoration upon return.
>
> Otherwise, in general, the post-call nop is needed wherever ELF
> interposition
> needs to be supported. We don't currently support ELF interposition at the
> IR
> level (see
> http://lists.llvm.org/pipermail/llvm-dev/2016-November/107625.html
> for more information), and I don't think we should try to make it appear to
> work in the backend in spite of that fact. This will yield subtle bugs if
> interposition is attempted. As a result, regardless of whether we're in PIC
> mode, we don't assume that we need to add the nop to support the
> possibility of
> ELF interposition. However, the necessary check is in place (i.e. calling
> GV->isInterposable and TM.shouldAssumeDSOLocal) so when we have functions
> for
> which interposition is allowed at the IR level, we'll add the nop as
> necessary.
> In the mean time, we'll generate more tail calls and fewer nops when
> compiling
> position-independent code.
>
> Differential Revision: https://reviews.llvm.org/D27231
>
> Added:
>     llvm/trunk/test/CodeGen/PowerPC/ppc64-blnop.ll
> Modified:
>     llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
>     llvm/trunk/test/CodeGen/PowerPC/ppc64-sibcall.ll
>
> Modified: llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp?rev=289638&r1=289637&r2=289638&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp (original)
> +++ llvm/trunk/lib/Target/PowerPC/PPCISelLowering.cpp Wed Dec 14 01:24:50
> 2016
> @@ -3981,40 +3981,32 @@ static int CalculateTailCallSPDiff(Selec
>  static bool isFunctionGlobalAddress(SDValue Callee);
>
>  static bool
> -resideInSameModule(SDValue Callee, Reloc::Model RelMod) {
> +resideInSameSection(const Function *Caller, SDValue Callee,
> +                    const TargetMachine &TM) {
>    // If !G, Callee can be an external symbol.
>    GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee);
> -  if (!G) return false;
> +  if (!G)
> +    return false;
>
>    const GlobalValue *GV = G->getGlobal();
> -
> -  if (GV->isDeclaration()) return false;
> -
> -  switch(GV->getLinkage()) {
> -  default: llvm_unreachable("unknow linkage type");
> -  case GlobalValue::AvailableExternallyLinkage:
> -  case GlobalValue::ExternalWeakLinkage:
> +  if (!GV->isStrongDefinitionForLinker())
>      return false;
>
> -  // Callee with weak linkage is allowed if it has hidden or protected
> -  // visibility
> -  case GlobalValue::LinkOnceAnyLinkage:
> -  case GlobalValue::LinkOnceODRLinkage: // e.g. c++ inline functions
> -  case GlobalValue::WeakAnyLinkage:
> -  case GlobalValue::WeakODRLinkage:     // e.g. c++ template instantiation
> -    if (GV->hasDefaultVisibility())
> +  // Any explicitly-specified sections and section prefixes must also
> match.
> +  // Also, if we're using -ffunction-sections, then each function is
> always in
> +  // a different section (the same is true for COMDAT functions).
> +  if (TM.getFunctionSections() || GV->hasComdat() || Caller->hasComdat()
> ||
> +      GV->getSection() != Caller->getSection())
> +    return false;
> +  if (const auto *F = dyn_cast<Function>(GV)) {
> +    if (F->getSectionPrefix() != Caller->getSectionPrefix())
>        return false;
> -
> -  case GlobalValue::ExternalLinkage:
> -  case GlobalValue::InternalLinkage:
> -  case GlobalValue::PrivateLinkage:
> -    break;
>    }
>
> -  // With '-fPIC', calling default visiblity function need insert 'nop'
> after
> -  // function call, no matter that function resides in same module or
> not, so
> -  // we treat it as in different module.
> -  if (RelMod == Reloc::PIC_ && GV->hasDefaultVisibility())
> +  // If the callee might be interposed, then we can't assume the ultimate
> call
> +  // target will be in the same section.
> +  if (GV->isInterposable() &&
> +      !TM.shouldAssumeDSOLocal(*Caller->getParent(), GV))
>      return false;
>
>    return true;
> @@ -4130,11 +4122,11 @@ PPCTargetLowering::IsEligibleForTailCall
>        !isa<ExternalSymbolSDNode>(Callee))
>      return false;
>
> -  // Check if Callee resides in the same module, because for now, PPC64
> SVR4 ABI
> -  // (ELFv1/ELFv2) doesn't allow tail calls to a symbol resides in another
> -  // module.
> +  // Check if Callee resides in the same section, because for now, PPC64
> SVR4
> +  // ABI (ELFv1/ELFv2) doesn't allow tail calls to a symbol resides in
> another
> +  // section.
>    // ref: https://bugzilla.mozilla.org/show_bug.cgi?id=973977
> -  if (!resideInSameModule(Callee,
> getTargetMachine().getRelocationModel()))
> +  if (!resideInSameSection(MF.getFunction(), Callee, getTargetMachine()))
>      return false;
>
>    // TCO allows altering callee ABI, so we don't have to check further.
> @@ -4592,14 +4584,6 @@ PrepareCall(SelectionDAG &DAG, SDValue &
>    return CallOpc;
>  }
>
> -static
> -bool isLocalCall(const SDValue &Callee)
> -{
> -  if (GlobalAddressSDNode *G = dyn_cast<GlobalAddressSDNode>(Callee))
> -    return G->getGlobal()->isStrongDefinitionForLinker();
> -  return false;
> -}
> -
>  SDValue PPCTargetLowering::LowerCallResult(
>      SDValue Chain, SDValue InFlag, CallingConv::ID CallConv, bool
> isVarArg,
>      const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &dl,
> @@ -4701,6 +4685,7 @@ SDValue PPCTargetLowering::FinishCall(
>    // stack frame. If caller and callee belong to the same module (and
> have the
>    // same TOC), the NOP will remain unchanged.
>
> +  MachineFunction &MF = DAG.getMachineFunction();
>    if (!isTailCall && Subtarget.isSVR4ABI()&& Subtarget.isPPC64() &&
>        !isPatchPoint) {
>      if (CallOpc == PPCISD::BCTRL) {
> @@ -4724,11 +4709,11 @@ SDValue PPCTargetLowering::FinishCall(
>        // The address needs to go after the chain input but before the
> flag (or
>        // any other variadic arguments).
>        Ops.insert(std::next(Ops.begin()), AddTOC);
> -    } else if ((CallOpc == PPCISD::CALL) &&
> -               (!isLocalCall(Callee) ||
> -                DAG.getTarget().getRelocationModel() == Reloc::PIC_))
> +    } else if (CallOpc == PPCISD::CALL &&
> +      !resideInSameSection(MF.getFunction(), Callee, DAG.getTarget())) {
>        // Otherwise insert NOP for non-local calls.
>        CallOpc = PPCISD::CALL_NOP;
> +    }
>    }
>
>    Chain = DAG.getNode(CallOpc, dl, NodeTys, Ops);
>
> Added: llvm/trunk/test/CodeGen/PowerPC/ppc64-blnop.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/ppc64-blnop.ll?rev=289638&view=auto
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/PowerPC/ppc64-blnop.ll (added)
> +++ llvm/trunk/test/CodeGen/PowerPC/ppc64-blnop.ll Wed Dec 14 01:24:50 2016
> @@ -0,0 +1,129 @@
> +; RUN: llc < %s -verify-machineinstrs
> -mtriple=powerpc64-unknown-linux-gnu | FileCheck %s
> +; RUN: llc < %s -verify-machineinstrs
> -mtriple=powerpc64-unknown-linux-gnu -mcpu=pwr8 | FileCheck %s
> +; RUN: llc < %s -verify-machineinstrs
> -mtriple=powerpc64le-unknown-linux-gnu -mcpu=pwr8 | FileCheck %s
> +; RUN: llc < %s -relocation-model=pic -verify-machineinstrs
> -mtriple=powerpc64-unknown-linux-gnu | FileCheck %s
> +; RUN: llc < %s -function-sections -verify-machineinstrs
> -mtriple=powerpc64-unknown-linux-gnu | FileCheck %s -check-prefix=CHECK-FS
> +; RUN: llc < %s -relocation-model=pic -verify-machineinstrs
> -mtriple=powerpc64le-unknown-linux-gnu | FileCheck %s
> +; RUN: llc < %s -function-sections -verify-machineinstrs
> -mtriple=powerpc64le-unknown-linux-gnu | FileCheck %s -check-prefix=CHECK-FS
> +
> +%class.T = type { [2 x i8] }
> +
> +define void @e_callee(%class.T* %this, i8* %c) { ret void }
> +define void @e_caller(%class.T* %this, i8* %c) {
> +  call void @e_callee(%class.T* %this, i8* %c)
> +  ret void
> +
> +; CHECK-LABEL: e_caller:
> +; CHECK: bl e_callee
> +; CHECK-NOT: nop
> +
> +; CHECK-FS-LABEL: e_caller:
> +; CHECK-FS: bl e_callee
> +; CHECK-FS-NEXT: nop
> +}
> +
> +define void @e_scallee(%class.T* %this, i8* %c) section "different" { ret
> void }
> +define void @e_scaller(%class.T* %this, i8* %c) {
> +  call void @e_scallee(%class.T* %this, i8* %c)
> +  ret void
> +
> +; CHECK-LABEL: e_scaller:
> +; CHECK: bl e_scallee
> +; CHECK-NEXT: nop
> +}
> +
> +define void @e_s2callee(%class.T* %this, i8* %c) { ret void }
> +define void @e_s2caller(%class.T* %this, i8* %c) section "different" {
> +  call void @e_s2callee(%class.T* %this, i8* %c)
> +  ret void
> +
> +; CHECK-LABEL: e_s2caller:
> +; CHECK: bl e_s2callee
> +; CHECK-NEXT: nop
> +}
> +
> +$cd1 = comdat any
> +$cd2 = comdat any
> +
> +define void @e_ccallee(%class.T* %this, i8* %c) comdat($cd1) { ret void }
> +define void @e_ccaller(%class.T* %this, i8* %c) comdat($cd2) {
> +  call void @e_ccallee(%class.T* %this, i8* %c)
> +  ret void
> +
> +; CHECK-LABEL: e_ccaller:
> +; CHECK: bl e_ccallee
> +; CHECK-NEXT: nop
> +}
> +
> +$cd = comdat any
> +
> +define void @e_c1callee(%class.T* %this, i8* %c) comdat($cd) { ret void }
> +define void @e_c1caller(%class.T* %this, i8* %c) comdat($cd) {
> +  call void @e_c1callee(%class.T* %this, i8* %c)
> +  ret void
> +
> +; CHECK-LABEL: e_c1caller:
> +; CHECK: bl e_c1callee
> +; CHECK-NEXT: nop
> +}
> +
> +define weak_odr hidden void @wo_hcallee(%class.T* %this, i8* %c) { ret
> void }
> +define void @wo_hcaller(%class.T* %this, i8* %c) {
> +  call void @wo_hcallee(%class.T* %this, i8* %c)
> +  ret void
> +
> +; CHECK-LABEL: wo_hcaller:
> +; CHECK: bl wo_hcallee
> +; CHECK-NEXT: nop
> +}
> +
> +define weak_odr protected void @wo_pcallee(%class.T* %this, i8* %c) { ret
> void }
> +define void @wo_pcaller(%class.T* %this, i8* %c) {
> +  call void @wo_pcallee(%class.T* %this, i8* %c)
> +  ret void
> +
> +; CHECK-LABEL: wo_pcaller:
> +; CHECK: bl wo_pcallee
> +; CHECK-NEXT: nop
> +}
> +
> +define weak_odr void @wo_callee(%class.T* %this, i8* %c) { ret void }
> +define void @wo_caller(%class.T* %this, i8* %c) {
> +  call void @wo_callee(%class.T* %this, i8* %c)
> +  ret void
> +
> +; CHECK-LABEL: wo_caller:
> +; CHECK: bl wo_callee
> +; CHECK-NEXT: nop
> +}
> +
> +define weak protected void @w_pcallee(i8* %ptr) { ret void }
> +define void @w_pcaller(i8* %ptr) {
> +  call void @w_pcallee(i8* %ptr)
> +  ret void
> +
> +; CHECK-LABEL: w_pcaller:
> +; CHECK: bl w_pcallee
> +; CHECK-NEXT: nop
> +}
> +
> +define weak hidden void @w_hcallee(i8* %ptr) { ret void }
> +define void @w_hcaller(i8* %ptr) {
> +  call void @w_hcallee(i8* %ptr)
> +  ret void
> +
> +; CHECK-LABEL: w_hcaller:
> +; CHECK: bl w_hcallee
> +; CHECK-NEXT: nop
> +}
> +
> +define weak void @w_callee(i8* %ptr) { ret void }
> +define void @w_caller(i8* %ptr) {
> +  call void @w_callee(i8* %ptr)
> +  ret void
> +
> +; CHECK-LABEL: w_caller:
> +; CHECK: bl w_callee
> +; CHECK-NEXT: nop
> +}
> +
>
> Modified: llvm/trunk/test/CodeGen/PowerPC/ppc64-sibcall.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/PowerPC/ppc64-sibcall.ll?rev=289638&r1=289637&r2=289638&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/PowerPC/ppc64-sibcall.ll (original)
> +++ llvm/trunk/test/CodeGen/PowerPC/ppc64-sibcall.ll Wed Dec 14 01:24:50
> 2016
> @@ -142,7 +142,7 @@ define void @wo_hcaller(%class.T* %this,
>    ret void
>
>  ; CHECK-SCO-LABEL: wo_hcaller:
> -; CHECK-SCO: b wo_hcallee
> +; CHECK-SCO: bl wo_hcallee
>  }
>
>  define weak_odr protected void @wo_pcallee(%class.T* %this, i8* %c) { ret
> void }
> @@ -151,7 +151,7 @@ define void @wo_pcaller(%class.T* %this,
>    ret void
>
>  ; CHECK-SCO-LABEL: wo_pcaller:
> -; CHECK-SCO: b wo_pcallee
> +; CHECK-SCO: bl wo_pcallee
>  }
>
>  define weak_odr void @wo_callee(%class.T* %this, i8* %c) { ret void }
> @@ -169,7 +169,7 @@ define void @w_pcaller(i8* %ptr) {
>    ret void
>
>  ; CHECK-SCO-LABEL: w_pcaller:
> -; CHECK-SCO: b w_pcallee
> +; CHECK-SCO: bl w_pcallee
>  }
>
>  define weak hidden void @w_hcallee(i8* %ptr) { ret void }
> @@ -178,7 +178,7 @@ define void @w_hcaller(i8* %ptr) {
>    ret void
>
>  ; CHECK-SCO-LABEL: w_hcaller:
> -; CHECK-SCO: b w_hcallee
> +; CHECK-SCO: bl w_hcallee
>  }
>
>  define weak void @w_callee(i8* %ptr) { ret void }
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20161216/bb3cb867/attachment.html>


More information about the llvm-commits mailing list