[llvm-commits] [llvm] r160002 - in /llvm/trunk: include/llvm/CodeGen/MachineFrameInfo.h lib/Target/X86/X86FrameLowering.cpp lib/Target/X86/X86RegisterInfo.cpp lib/Target/X86/X86RegisterInfo.h test/CodeGen/X86/alloca-align-rounding-32.ll test/Code

Chad Rosier mcrosier at apple.com
Tue Jul 31 09:55:05 PDT 2012


Hi Alexey,
Yes, this change is definitely the problem.  I used EBX as the base pointer because it appeared to work with all the calling conventions, but it can be changed in X86RegisterInfo.cpp.  If there is callee-saved register that isn't used by any ABI, then that would be the ideal choice.  Otherwise, I think the solution would be to probe the subtarget and ask for the base pointer register.  If that sounds reasonable I'll start working on a patch.

 Chad

On Jul 31, 2012, at 7:41 AM, Alexey Samsonov wrote:

> Hi Chad,
> 
> I'm not quite sure in the source of problem, and currently don't have a small reproducer, but would share my concerns here for now.
> 
> I've encountered problems while testing 32-bit code, where Global Offset Table (GOT) was overwritten with garbage, which lead to
> segfaults when trying to run functions from PLT. I think this may be connected with this patch, as there is a comment in X86ISelLowering.cpp:
>     // ELF / PIC requires GOT in the EBX register before function calls via PLT
>     // GOT pointer.
> 
> Here are extracts from function that spoils global offset table contents (there is a stack realignment, and dynamic allocas there,
> so we would use %ebx to reference stack variables):
> 
>  0xde635aa0 <+0>:     push   %ebp
>   0xde635aa1 <+1>:     mov    %esp,%ebp
>   0xde635aa3 <+3>:     push   %ebx
> <...>
>   0xde635aa6 <+6>:     and    $0xffffffe0,%esp
>   0xde635aac <+12>:    sub    $0x2e0,%esp
>   0xde635ab2 <+18>:    mov    %esp,%ebx   <------ stack pointer is saved to %ebx
> 
> On the other hand, %ebx is still used to reference GOT for "function at plt" functions, and the address of GOT
> is explicitly copied to %ebx from %esi before "function at plt" calls:
> 
> <...>
>   0xde635e06 <+870>:   mov    %esi,%ebx    <--------------- %ebx is overwritten here
>   0xde635e08 <+872>:   call   0xde60f7d0 <_ZSt3minIiERKT_S2_S2_ at plt>
> 
> After that, if we use %ebx to reference local variables, we're in trouble, as we're overwriting GOT with garbage:
> <...>
>   0xde635e3a <+922>:   mov    %edi,%eax
>   0xde635e3c <+924>:   shr    $0x3,%eax
>   0xde635e3f <+927>:   mov    %eax,0x24(%ebx)   <--------- GOT is filled with garbage.
> 
> 
> On Tue, Jul 10, 2012 at 9:45 PM, Chad Rosier <mcrosier at apple.com> wrote:
> Author: mcrosier
> Date: Tue Jul 10 12:45:53 2012
> New Revision: 160002
> 
> URL: http://llvm.org/viewvc/llvm-project?rev=160002&view=rev
> Log:
> Add support for dynamic stack realignment in the presence of dynamic allocas on
> X86.  Basically, this is a reapplication of r158087 with a few fixes.
> 
> Specifically, (1) the stack pointer is restored from the base pointer before
> popping callee-saved registers and (2) in obscure cases (see comments in patch)
> we must cache the value of the original stack adjustment in the prologue and
> apply it in the epilogue.
> 
> rdar://11496434
> 
> Modified:
>     llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h
>     llvm/trunk/lib/Target/X86/X86FrameLowering.cpp
>     llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
>     llvm/trunk/lib/Target/X86/X86RegisterInfo.h
>     llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll
>     llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll
>     llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll
> 
> Modified: llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h?rev=160002&r1=160001&r2=160002&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h (original)
> +++ llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h Tue Jul 10 12:45:53 2012
> @@ -215,6 +215,10 @@
>    /// just allocate them normally.
>    bool UseLocalStackAllocationBlock;
> 
> +  /// After the stack pointer has been restore from the base pointer we
> +  /// use a cached adjusment.  Currently only used for x86.
> +  int64_t BPAdj;
> +
>  public:
>      explicit MachineFrameInfo(const TargetFrameLowering &tfi) : TFI(tfi) {
>      StackSize = NumFixedObjects = OffsetAdjustment = MaxAlignment = 0;
> @@ -230,6 +234,7 @@
>      LocalFrameSize = 0;
>      LocalFrameMaxAlign = 0;
>      UseLocalStackAllocationBlock = false;
> +    BPAdj = 0;
>    }
> 
>    /// hasStackObjects - Return true if there are any stack objects in this
> @@ -538,6 +543,16 @@
> 
>    void setCalleeSavedInfoValid(bool v) { CSIValid = v; }
> 
> +  /// setBasePtrStackAdjustment - If we're restoring the stack pointer from the
> +  /// base pointer, due to dynamic stack realignment + VLAs, we cache the
> +  /// number of bytes initially allocated for the stack frame.  In obscure
> +  /// cases (e.g., tail calls with byval argument and no stack protector), the
> +  /// stack gets adjusted outside of the prolog, but these shouldn't be
> +  /// considered when restoring from the base pointer.  Currently, this is only
> +  /// needed for x86.
> +  void setBasePtrStackAdjustment(int64_t adj) { BPAdj = adj; }
> +  int64_t getBasePtrStackAdjustment() const { return BPAdj; }
> +
>    /// getPristineRegs - Return a set of physical registers that are pristine on
>    /// entry to the MBB.
>    ///
> 
> Modified: llvm/trunk/lib/Target/X86/X86FrameLowering.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FrameLowering.cpp?rev=160002&r1=160001&r2=160002&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86FrameLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86FrameLowering.cpp Tue Jul 10 12:45:53 2012
> @@ -650,6 +650,7 @@
>    unsigned SlotSize = RegInfo->getSlotSize();
>    unsigned FramePtr = RegInfo->getFrameRegister(MF);
>    unsigned StackPtr = RegInfo->getStackRegister();
> +  unsigned BasePtr = RegInfo->getBaseRegister();
>    DebugLoc DL;
> 
>    // If we're forcing a stack realignment we can't rely on just the frame
> @@ -913,6 +914,20 @@
>      emitSPUpdate(MBB, MBBI, StackPtr, -(int64_t)NumBytes, Is64Bit,
>                   UseLEA, TII, *RegInfo);
> 
> +  // If we need a base pointer, set it up here. It's whatever the value
> +  // of the stack pointer is at this point. Any variable size objects
> +  // will be allocated after this, so we can still use the base pointer
> +  // to reference locals.
> +  if (RegInfo->hasBasePointer(MF)) {
> +    // Update the frame pointer with the current stack pointer.
> +    unsigned Opc = Is64Bit ? X86::MOV64rr : X86::MOV32rr;
> +    BuildMI(MBB, MBBI, DL, TII.get(Opc), BasePtr)
> +      .addReg(StackPtr)
> +      .setMIFlag(MachineInstr::FrameSetup);
> +
> +    MFI->setBasePtrStackAdjustment(NumBytes);
> +  }
> +
>    if (( (!HasFP && NumBytes) || PushedRegs) && needsFrameMoves) {
>      // Mark end of stack pointer adjustment.
>      MCSymbol *Label = MMI.getContext().CreateTempSymbol();
> @@ -960,6 +975,7 @@
>    unsigned SlotSize = RegInfo->getSlotSize();
>    unsigned FramePtr = RegInfo->getFrameRegister(MF);
>    unsigned StackPtr = RegInfo->getStackRegister();
> +  unsigned BasePtr = RegInfo->getBaseRegister();
> 
>    switch (RetOpcode) {
>    default:
> @@ -1029,6 +1045,15 @@
>    if (NumBytes || MFI->hasVarSizedObjects())
>      mergeSPUpdatesUp(MBB, MBBI, StackPtr, &NumBytes);
> 
> +  // Restore the SP from the BP, if necessary.
> +  if (RegInfo->hasBasePointer(MF)) {
> +    BuildMI(MBB, MBBI, DL, TII.get(Is64Bit ? X86::MOV64rr : X86::MOV32rr),
> +            StackPtr).addReg(BasePtr);
> +
> +    // When restoring from the BP we must use a cached SP adjustment.
> +    NumBytes = MFI->getBasePtrStackAdjustment();
> +  }
> +
>    // If dynamic alloca is used, then reset esp to point to the last callee-saved
>    // slot before popping them off! Same applies for the case, when stack was
>    // realigned.
> @@ -1147,7 +1172,16 @@
>    int Offset = MFI->getObjectOffset(FI) - getOffsetOfLocalArea();
>    uint64_t StackSize = MFI->getStackSize();
> 
> -  if (RegInfo->needsStackRealignment(MF)) {
> +  if (RegInfo->hasBasePointer(MF)) {
> +    assert (hasFP(MF) && "VLAs and dynamic stack realign, but no FP?!");
> +    if (FI < 0) {
> +      // Skip the saved EBP.
> +      return Offset + RegInfo->getSlotSize();
> +    } else {
> +      assert((-(Offset + StackSize)) % MFI->getObjectAlignment(FI) == 0);
> +      return Offset + StackSize;
> +    }
> +  } else if (RegInfo->needsStackRealignment(MF)) {
>      if (FI < 0) {
>        // Skip the saved EBP.
>        return Offset + RegInfo->getSlotSize();
> @@ -1178,9 +1212,14 @@
>    const X86RegisterInfo *RegInfo =
>        static_cast<const X86RegisterInfo*>(MF.getTarget().getRegisterInfo());
>    // We can't calculate offset from frame pointer if the stack is realigned,
> -  // so enforce usage of stack pointer.
> -  FrameReg = (RegInfo->needsStackRealignment(MF)) ?
> -    RegInfo->getStackRegister() : RegInfo->getFrameRegister(MF);
> +  // so enforce usage of stack/base pointer.  The base pointer is used when we
> +  // have dynamic allocas in addition to dynamic realignment.
> +  if (RegInfo->hasBasePointer(MF))
> +    FrameReg = RegInfo->getBaseRegister();
> +  else if (RegInfo->needsStackRealignment(MF))
> +    FrameReg = RegInfo->getStackRegister();
> +  else
> +    FrameReg = RegInfo->getFrameRegister(MF);
>    return getFrameIndexOffset(MF, FI);
>  }
> 
> @@ -1317,6 +1356,10 @@
>             "Slot for EBP register must be last in order to be found!");
>      (void)FrameIdx;
>    }
> +
> +  // Spill the BasePtr if it's used.
> +  if (RegInfo->hasBasePointer(MF))
> +    MF.getRegInfo().setPhysRegUsed(RegInfo->getBaseRegister());
>  }
> 
>  static bool
> 
> Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=160002&r1=160001&r2=160002&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Tue Jul 10 12:45:53 2012
> @@ -50,6 +50,10 @@
>                             " needed for the function."),
>                   cl::init(false), cl::Hidden);
> 
> +cl::opt<bool>
> +EnableBasePointer("x86-use-base-pointer", cl::Hidden, cl::init(true),
> +          cl::desc("Enable use of a base pointer for complex stack frames"));
> +
>  X86RegisterInfo::X86RegisterInfo(X86TargetMachine &tm,
>                                   const TargetInstrInfo &tii)
>    : X86GenRegisterInfo(tm.getSubtarget<X86Subtarget>().is64Bit()
> @@ -68,10 +72,12 @@
>      SlotSize = 8;
>      StackPtr = X86::RSP;
>      FramePtr = X86::RBP;
> +    BasePtr = X86::RBX;
>    } else {
>      SlotSize = 4;
>      StackPtr = X86::ESP;
>      FramePtr = X86::EBP;
> +    BasePtr = X86::EBX;
>    }
>  }
> 
> @@ -290,6 +296,20 @@
>        Reserved.set(*I);
>    }
> 
> +  // Set the base-pointer register and its aliases as reserved if needed.
> +  if (hasBasePointer(MF)) {
> +    CallingConv::ID CC = MF.getFunction()->getCallingConv();
> +    const uint32_t* RegMask = getCallPreservedMask(CC);
> +    if (MachineOperand::clobbersPhysReg(RegMask, getBaseRegister()))
> +      report_fatal_error(
> +        "Stack realignment in presence of dynamic allocas is not supported with"
> +        "this calling convention.");
> +
> +    Reserved.set(getBaseRegister());
> +    for (MCSubRegIterator I(getBaseRegister(), this); I.isValid(); ++I)
> +      Reserved.set(*I);
> +  }
> +
>    // Mark the segment registers as reserved.
>    Reserved.set(X86::CS);
>    Reserved.set(X86::SS);
> @@ -340,10 +360,36 @@
>  // Stack Frame Processing methods
>  //===----------------------------------------------------------------------===//
> 
> +bool X86RegisterInfo::hasBasePointer(const MachineFunction &MF) const {
> +   const MachineFrameInfo *MFI = MF.getFrameInfo();
> +
> +   if (!EnableBasePointer)
> +     return false;
> +
> +   // When we need stack realignment and there are dynamic allocas, we can't
> +   // reference off of the stack pointer, so we reserve a base pointer.
> +   if (needsStackRealignment(MF) && MFI->hasVarSizedObjects())
> +     return true;
> +
> +   return false;
> +}
> +
>  bool X86RegisterInfo::canRealignStack(const MachineFunction &MF) const {
>    const MachineFrameInfo *MFI = MF.getFrameInfo();
> -  return (MF.getTarget().Options.RealignStack &&
> -          !MFI->hasVarSizedObjects());
> +  const MachineRegisterInfo *MRI = &MF.getRegInfo();
> +  if (!MF.getTarget().Options.RealignStack)
> +    return false;
> +
> +  // Stack realignment requires a frame pointer.  If we already started
> +  // register allocation with frame pointer elimination, it is too late now.
> +  if (!MRI->canReserveReg(FramePtr))
> +    return false;
> +
> +  // If a base pointer is necessary.  Check that it isn't too late to reserve
> +  // it.
> +  if (MFI->hasVarSizedObjects())
> +    return MRI->canReserveReg(BasePtr);
> +  return true;
>  }
> 
>  bool X86RegisterInfo::needsStackRealignment(const MachineFunction &MF) const {
> @@ -353,13 +399,6 @@
>    bool requiresRealignment = ((MFI->getMaxAlignment() > StackAlign) ||
>                                 F->hasFnAttr(Attribute::StackAlignment));
> 
> -  // FIXME: Currently we don't support stack realignment for functions with
> -  //        variable-sized allocas.
> -  // FIXME: It's more complicated than this...
> -  if (0 && requiresRealignment && MFI->hasVarSizedObjects())
> -    report_fatal_error(
> -      "Stack realignment in presence of dynamic allocas is not supported");
> -
>    // If we've requested that we force align the stack do so now.
>    if (ForceStackAlign)
>      return canRealignStack(MF);
> @@ -499,7 +538,9 @@
> 
>    unsigned Opc = MI.getOpcode();
>    bool AfterFPPop = Opc == X86::TAILJMPm64 || Opc == X86::TAILJMPm;
> -  if (needsStackRealignment(MF))
> +  if (hasBasePointer(MF))
> +    BasePtr = (FrameIndex < 0 ? FramePtr : getBaseRegister());
> +  else if (needsStackRealignment(MF))
>      BasePtr = (FrameIndex < 0 ? FramePtr : StackPtr);
>    else if (AfterFPPop)
>      BasePtr = StackPtr;
> 
> Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.h?rev=160002&r1=160001&r2=160002&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86RegisterInfo.h (original)
> +++ llvm/trunk/lib/Target/X86/X86RegisterInfo.h Tue Jul 10 12:45:53 2012
> @@ -50,6 +50,11 @@
>    ///
>    unsigned FramePtr;
> 
> +  /// BasePtr - X86 physical register used as a base ptr in complex stack
> +  /// frames. I.e., when we need a 3rd base, not just SP and FP, due to
> +  /// variable size stack objects.
> +  unsigned BasePtr;
> +
>  public:
>    X86RegisterInfo(X86TargetMachine &tm, const TargetInstrInfo &tii);
> 
> @@ -106,6 +111,8 @@
>    /// register scavenger to determine what registers are free.
>    BitVector getReservedRegs(const MachineFunction &MF) const;
> 
> +  bool hasBasePointer(const MachineFunction &MF) const;
> +
>    bool canRealignStack(const MachineFunction &MF) const;
> 
>    bool needsStackRealignment(const MachineFunction &MF) const;
> @@ -123,6 +130,7 @@
>    // Debug information queries.
>    unsigned getFrameRegister(const MachineFunction &MF) const;
>    unsigned getStackRegister() const { return StackPtr; }
> +  unsigned getBaseRegister() const { return BasePtr; }
>    // FIXME: Move to FrameInfok
>    unsigned getSlotSize() const { return SlotSize; }
> 
> 
> Modified: llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll?rev=160002&r1=160001&r2=160002&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll Tue Jul 10 12:45:53 2012
> @@ -15,5 +15,6 @@
>    call void @bar(<2 x i64>* %p)
>    ret void
>  ; CHECK: foo2
> +; CHECK: andl $-32, %esp
>  ; CHECK: andl $-32, %eax
>  }
> 
> Modified: llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll?rev=160002&r1=160001&r2=160002&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll Tue Jul 10 12:45:53 2012
> @@ -15,5 +15,6 @@
>    call void @bar(<2 x i64>* %p)
>    ret void
>  ; CHECK: foo2
> +; CHECK: andq $-32, %rsp
>  ; CHECK: andq $-32, %rax
>  }
> 
> Modified: llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll?rev=160002&r1=160001&r2=160002&view=diff
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll Tue Jul 10 12:45:53 2012
> @@ -17,10 +17,15 @@
> 
>  define i64 @g(i32 %i) nounwind {
>  ; CHECK: g:
> -; CHECK:      pushl
> +; CHECK:      pushl  %ebp
>  ; CHECK-NEXT: movl   %esp, %ebp
> +; CHECK-NEXT: andl   $-32, %esp
>  ; CHECK-NEXT: pushl
> -; CHECK-NEXT: subl   $20, %esp
> +; CHECK-NEXT: pushl
> +; CHECK-NEXT: subl   $24, %esp
> +;
> +; Now setup the base pointer (%ebx).
> +; CHECK-NEXT: movl   %esp, %ebx
>  ; CHECK-NOT:         {{[^ ,]*}}, %esp
>  ;
>  ; The next adjustment of the stack is due to the alloca.
> @@ -41,12 +46,18 @@
>  ; CHECK-NEXT: addl   $32, %esp
>  ; CHECK-NOT:         {{[^ ,]*}}, %esp
>  ;
> -; Finally we nede to restore %esp from %ebp, the alloca prevents us from
> -; restoring it directly.
> +; Restore %esp from %ebx (base pointer) so we can pop the callee-saved
> +; registers.  This is the state prior to the allocation of VLAs.
>  ; CHECK-NOT:  popl
> -; CHECK:      leal   -4(%ebp), %esp
> +; CHECK:      movl   %ebx, %esp
> +; CHECK-NEXT: addl   $24, %esp
>  ; CHECK-NEXT: popl
>  ; CHECK-NEXT: popl
> +;
> +; Finally we need to restore %esp from %ebp due to dynamic stack
> +; realignment.
> +; CHECK-NEXT: movl   %ebp, %esp
> +; CHECK-NEXT: popl   %ebp
>  ; CHECK-NEXT: ret
> 
>  entry:
> 
> 
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> 
> 
> 
> -- 
> Alexey Samsonov, MSK
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120731/2bb8ecc7/attachment.html>


More information about the llvm-commits mailing list