[llvm-commits] [llvm] r160002 - in /llvm/trunk: include/llvm/CodeGen/MachineFrameInfo.h lib/Target/X86/X86FrameLowering.cpp lib/Target/X86/X86RegisterInfo.cpp lib/Target/X86/X86RegisterInfo.h test/CodeGen/X86/alloca-align-rounding-32.ll test/Code

Alexey Samsonov samsonov at google.com
Wed Jul 11 07:47:27 PDT 2012


Hello, Chad!

Thanks for working on this! I hope this could help with
http://llvm.org/bugs/show_bug.cgi?id=11468,
that is, the issue that LLVM incorrectly handles exceptions if stack gets
realigned. While you're at it,
could you help me with some advice?

Simple reproducer is:
$ cat t2.cc
void Throw(const char& arg) {
  volatile int __attribute__((aligned(32))) n = 0; // aligned stack variable
  asm volatile ("nop" : : : "r14"); // force to save %r14 on stack
  throw arg;
}
$ clang++ -O2 -c t2.cc
$ objdump -d t2.o
<...>
0000000000000000 <_Z5ThrowRKc>:
   0: 55                   push   %rbp
   1: 48 89 e5             mov    %rsp,%rbp
   4: 48 81 e4 e0 ff ff ff and    $0xffffffffffffffe0,%rsp  <--- stack gets
realigned here
   b: 41 57                 push   %r15
   d: 41 56                 push   %r14
   f: 53                   push   %rbx
  10: 48 83 ec 28           sub    $0x28,%rsp
  14: 48 89 e3             mov    %rsp,%rbx
  17: 49 89 ff             mov    %rdi,%r15
<...>
$ readelf -wF t2.o
Contents of the .eh_frame section:
<....>
00000018 0000001c 0000001c FDE cie=00000000 pc=00000000..0000003f
   LOC                  CFA      rbx   rbp     r14   r15   ra
0000000000000000 rsp+8    u     u        u     u     c-8
0000000000000001 rsp+16   u     c-16   u     u     c-8
0000000000000004 rbp+16   u     c-16   u     u     c-8
0000000000000017 rbp+16   c-40  c-16  c-32  c-24  c-8

The last line means that the location of "r15" is "c-24" = "CFA-24" =
"rbp+16-24" = "rbp - 8".
In the same way, location of "r14" is assumed to be "rbp - 16", etc.

The problem is that X86FrameLowering::emitCalleeSavedFrameMoves emits DWARF
for .eh_frame
that defines the locations of all registers by setting offset from frame
pointer (%rbp). It is just wrong if
stack gets realigned (see the marked line in objdump). For example, gcc
emits DWARF arithmetical
expressions to calculate the correct locations of saved registers if stack
is realigned.

I wonder if we can solve this issue using the base pointer (%rbx)
introduced by your change. Here's what can be done:
1) make X86RegisterInfo::hasBasePointer() always return "true" if stack
needs realignment (even if there are no dynamic allocas).
2) after we calculate the base pointer (mov %rsp, %rbx), emit the locations
of all saved registers for .eh_frame DWARF section
as "%rbx + offset" (instead of "%rbp + offset").
3) the only exception is the "old" value of %rbp. (which is pushed to stack
before it is realigned). But, assuming that the function prologue
always starts with "push %rbp; mov %rsp, %rpb", we may state that the
address of "old" %rsp value can be loaded from address currently
stored in "%rbp". This means emitting "DW_CFA_register" register rule
instruction. It is not currently implemented in LLVM, but I think it's not
that hard to support this.

Does all this looks reasonable to you?

On Tue, Jul 10, 2012 at 9:45 PM, Chad Rosier <mcrosier at apple.com> wrote:

> Author: mcrosier
> Date: Tue Jul 10 12:45:53 2012
> New Revision: 160002
>
> URL: http://llvm.org/viewvc/llvm-project?rev=160002&view=rev
> Log:
> Add support for dynamic stack realignment in the presence of dynamic
> allocas on
> X86.  Basically, this is a reapplication of r158087 with a few fixes.
>
> Specifically, (1) the stack pointer is restored from the base pointer
> before
> popping callee-saved registers and (2) in obscure cases (see comments in
> patch)
> we must cache the value of the original stack adjustment in the prologue
> and
> apply it in the epilogue.
>
> rdar://11496434
>
> Modified:
>     llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h
>     llvm/trunk/lib/Target/X86/X86FrameLowering.cpp
>     llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
>     llvm/trunk/lib/Target/X86/X86RegisterInfo.h
>     llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll
>     llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll
>     llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll
>
> Modified: llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h?rev=160002&r1=160001&r2=160002&view=diff
>
> ==============================================================================
> --- llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h (original)
> +++ llvm/trunk/include/llvm/CodeGen/MachineFrameInfo.h Tue Jul 10 12:45:53
> 2012
> @@ -215,6 +215,10 @@
>    /// just allocate them normally.
>    bool UseLocalStackAllocationBlock;
>
> +  /// After the stack pointer has been restore from the base pointer we
> +  /// use a cached adjusment.  Currently only used for x86.
> +  int64_t BPAdj;
> +
>  public:
>      explicit MachineFrameInfo(const TargetFrameLowering &tfi) : TFI(tfi) {
>      StackSize = NumFixedObjects = OffsetAdjustment = MaxAlignment = 0;
> @@ -230,6 +234,7 @@
>      LocalFrameSize = 0;
>      LocalFrameMaxAlign = 0;
>      UseLocalStackAllocationBlock = false;
> +    BPAdj = 0;
>    }
>
>    /// hasStackObjects - Return true if there are any stack objects in this
> @@ -538,6 +543,16 @@
>
>    void setCalleeSavedInfoValid(bool v) { CSIValid = v; }
>
> +  /// setBasePtrStackAdjustment - If we're restoring the stack pointer
> from the
> +  /// base pointer, due to dynamic stack realignment + VLAs, we cache the
> +  /// number of bytes initially allocated for the stack frame.  In obscure
> +  /// cases (e.g., tail calls with byval argument and no stack
> protector), the
> +  /// stack gets adjusted outside of the prolog, but these shouldn't be
> +  /// considered when restoring from the base pointer.  Currently, this
> is only
> +  /// needed for x86.
> +  void setBasePtrStackAdjustment(int64_t adj) { BPAdj = adj; }
> +  int64_t getBasePtrStackAdjustment() const { return BPAdj; }
> +
>    /// getPristineRegs - Return a set of physical registers that are
> pristine on
>    /// entry to the MBB.
>    ///
>
> Modified: llvm/trunk/lib/Target/X86/X86FrameLowering.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86FrameLowering.cpp?rev=160002&r1=160001&r2=160002&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86FrameLowering.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86FrameLowering.cpp Tue Jul 10 12:45:53 2012
> @@ -650,6 +650,7 @@
>    unsigned SlotSize = RegInfo->getSlotSize();
>    unsigned FramePtr = RegInfo->getFrameRegister(MF);
>    unsigned StackPtr = RegInfo->getStackRegister();
> +  unsigned BasePtr = RegInfo->getBaseRegister();
>    DebugLoc DL;
>
>    // If we're forcing a stack realignment we can't rely on just the frame
> @@ -913,6 +914,20 @@
>      emitSPUpdate(MBB, MBBI, StackPtr, -(int64_t)NumBytes, Is64Bit,
>                   UseLEA, TII, *RegInfo);
>
> +  // If we need a base pointer, set it up here. It's whatever the value
> +  // of the stack pointer is at this point. Any variable size objects
> +  // will be allocated after this, so we can still use the base pointer
> +  // to reference locals.
> +  if (RegInfo->hasBasePointer(MF)) {
> +    // Update the frame pointer with the current stack pointer.
> +    unsigned Opc = Is64Bit ? X86::MOV64rr : X86::MOV32rr;
> +    BuildMI(MBB, MBBI, DL, TII.get(Opc), BasePtr)
> +      .addReg(StackPtr)
> +      .setMIFlag(MachineInstr::FrameSetup);
> +
> +    MFI->setBasePtrStackAdjustment(NumBytes);
> +  }
> +
>    if (( (!HasFP && NumBytes) || PushedRegs) && needsFrameMoves) {
>      // Mark end of stack pointer adjustment.
>      MCSymbol *Label = MMI.getContext().CreateTempSymbol();
> @@ -960,6 +975,7 @@
>    unsigned SlotSize = RegInfo->getSlotSize();
>    unsigned FramePtr = RegInfo->getFrameRegister(MF);
>    unsigned StackPtr = RegInfo->getStackRegister();
> +  unsigned BasePtr = RegInfo->getBaseRegister();
>
>    switch (RetOpcode) {
>    default:
> @@ -1029,6 +1045,15 @@
>    if (NumBytes || MFI->hasVarSizedObjects())
>      mergeSPUpdatesUp(MBB, MBBI, StackPtr, &NumBytes);
>
> +  // Restore the SP from the BP, if necessary.
> +  if (RegInfo->hasBasePointer(MF)) {
> +    BuildMI(MBB, MBBI, DL, TII.get(Is64Bit ? X86::MOV64rr : X86::MOV32rr),
> +            StackPtr).addReg(BasePtr);
> +
> +    // When restoring from the BP we must use a cached SP adjustment.
> +    NumBytes = MFI->getBasePtrStackAdjustment();
> +  }
> +
>    // If dynamic alloca is used, then reset esp to point to the last
> callee-saved
>    // slot before popping them off! Same applies for the case, when stack
> was
>    // realigned.
> @@ -1147,7 +1172,16 @@
>    int Offset = MFI->getObjectOffset(FI) - getOffsetOfLocalArea();
>    uint64_t StackSize = MFI->getStackSize();
>
> -  if (RegInfo->needsStackRealignment(MF)) {
> +  if (RegInfo->hasBasePointer(MF)) {
> +    assert (hasFP(MF) && "VLAs and dynamic stack realign, but no FP?!");
> +    if (FI < 0) {
> +      // Skip the saved EBP.
> +      return Offset + RegInfo->getSlotSize();
> +    } else {
> +      assert((-(Offset + StackSize)) % MFI->getObjectAlignment(FI) == 0);
> +      return Offset + StackSize;
> +    }
> +  } else if (RegInfo->needsStackRealignment(MF)) {
>      if (FI < 0) {
>        // Skip the saved EBP.
>        return Offset + RegInfo->getSlotSize();
> @@ -1178,9 +1212,14 @@
>    const X86RegisterInfo *RegInfo =
>        static_cast<const
> X86RegisterInfo*>(MF.getTarget().getRegisterInfo());
>    // We can't calculate offset from frame pointer if the stack is
> realigned,
> -  // so enforce usage of stack pointer.
> -  FrameReg = (RegInfo->needsStackRealignment(MF)) ?
> -    RegInfo->getStackRegister() : RegInfo->getFrameRegister(MF);
> +  // so enforce usage of stack/base pointer.  The base pointer is used
> when we
> +  // have dynamic allocas in addition to dynamic realignment.
> +  if (RegInfo->hasBasePointer(MF))
> +    FrameReg = RegInfo->getBaseRegister();
> +  else if (RegInfo->needsStackRealignment(MF))
> +    FrameReg = RegInfo->getStackRegister();
> +  else
> +    FrameReg = RegInfo->getFrameRegister(MF);
>    return getFrameIndexOffset(MF, FI);
>  }
>
> @@ -1317,6 +1356,10 @@
>             "Slot for EBP register must be last in order to be found!");
>      (void)FrameIdx;
>    }
> +
> +  // Spill the BasePtr if it's used.
> +  if (RegInfo->hasBasePointer(MF))
> +    MF.getRegInfo().setPhysRegUsed(RegInfo->getBaseRegister());
>  }
>
>  static bool
>
> Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp?rev=160002&r1=160001&r2=160002&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp (original)
> +++ llvm/trunk/lib/Target/X86/X86RegisterInfo.cpp Tue Jul 10 12:45:53 2012
> @@ -50,6 +50,10 @@
>                             " needed for the function."),
>                   cl::init(false), cl::Hidden);
>
> +cl::opt<bool>
> +EnableBasePointer("x86-use-base-pointer", cl::Hidden, cl::init(true),
> +          cl::desc("Enable use of a base pointer for complex stack
> frames"));
> +
>  X86RegisterInfo::X86RegisterInfo(X86TargetMachine &tm,
>                                   const TargetInstrInfo &tii)
>    : X86GenRegisterInfo(tm.getSubtarget<X86Subtarget>().is64Bit()
> @@ -68,10 +72,12 @@
>      SlotSize = 8;
>      StackPtr = X86::RSP;
>      FramePtr = X86::RBP;
> +    BasePtr = X86::RBX;
>    } else {
>      SlotSize = 4;
>      StackPtr = X86::ESP;
>      FramePtr = X86::EBP;
> +    BasePtr = X86::EBX;
>    }
>  }
>
> @@ -290,6 +296,20 @@
>        Reserved.set(*I);
>    }
>
> +  // Set the base-pointer register and its aliases as reserved if needed.
> +  if (hasBasePointer(MF)) {
> +    CallingConv::ID CC = MF.getFunction()->getCallingConv();
> +    const uint32_t* RegMask = getCallPreservedMask(CC);
> +    if (MachineOperand::clobbersPhysReg(RegMask, getBaseRegister()))
> +      report_fatal_error(
> +        "Stack realignment in presence of dynamic allocas is not
> supported with"
> +        "this calling convention.");
> +
> +    Reserved.set(getBaseRegister());
> +    for (MCSubRegIterator I(getBaseRegister(), this); I.isValid(); ++I)
> +      Reserved.set(*I);
> +  }
> +
>    // Mark the segment registers as reserved.
>    Reserved.set(X86::CS);
>    Reserved.set(X86::SS);
> @@ -340,10 +360,36 @@
>  // Stack Frame Processing methods
>
>  //===----------------------------------------------------------------------===//
>
> +bool X86RegisterInfo::hasBasePointer(const MachineFunction &MF) const {
> +   const MachineFrameInfo *MFI = MF.getFrameInfo();
> +
> +   if (!EnableBasePointer)
> +     return false;
> +
> +   // When we need stack realignment and there are dynamic allocas, we
> can't
> +   // reference off of the stack pointer, so we reserve a base pointer.
> +   if (needsStackRealignment(MF) && MFI->hasVarSizedObjects())
> +     return true;
> +
> +   return false;
> +}
> +
>  bool X86RegisterInfo::canRealignStack(const MachineFunction &MF) const {
>    const MachineFrameInfo *MFI = MF.getFrameInfo();
> -  return (MF.getTarget().Options.RealignStack &&
> -          !MFI->hasVarSizedObjects());
> +  const MachineRegisterInfo *MRI = &MF.getRegInfo();
> +  if (!MF.getTarget().Options.RealignStack)
> +    return false;
> +
> +  // Stack realignment requires a frame pointer.  If we already started
> +  // register allocation with frame pointer elimination, it is too late
> now.
> +  if (!MRI->canReserveReg(FramePtr))
> +    return false;
> +
> +  // If a base pointer is necessary.  Check that it isn't too late to
> reserve
> +  // it.
> +  if (MFI->hasVarSizedObjects())
> +    return MRI->canReserveReg(BasePtr);
> +  return true;
>  }
>
>  bool X86RegisterInfo::needsStackRealignment(const MachineFunction &MF)
> const {
> @@ -353,13 +399,6 @@
>    bool requiresRealignment = ((MFI->getMaxAlignment() > StackAlign) ||
>                                 F->hasFnAttr(Attribute::StackAlignment));
>
> -  // FIXME: Currently we don't support stack realignment for functions
> with
> -  //        variable-sized allocas.
> -  // FIXME: It's more complicated than this...
> -  if (0 && requiresRealignment && MFI->hasVarSizedObjects())
> -    report_fatal_error(
> -      "Stack realignment in presence of dynamic allocas is not
> supported");
> -
>    // If we've requested that we force align the stack do so now.
>    if (ForceStackAlign)
>      return canRealignStack(MF);
> @@ -499,7 +538,9 @@
>
>    unsigned Opc = MI.getOpcode();
>    bool AfterFPPop = Opc == X86::TAILJMPm64 || Opc == X86::TAILJMPm;
> -  if (needsStackRealignment(MF))
> +  if (hasBasePointer(MF))
> +    BasePtr = (FrameIndex < 0 ? FramePtr : getBaseRegister());
> +  else if (needsStackRealignment(MF))
>      BasePtr = (FrameIndex < 0 ? FramePtr : StackPtr);
>    else if (AfterFPPop)
>      BasePtr = StackPtr;
>
> Modified: llvm/trunk/lib/Target/X86/X86RegisterInfo.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86RegisterInfo.h?rev=160002&r1=160001&r2=160002&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Target/X86/X86RegisterInfo.h (original)
> +++ llvm/trunk/lib/Target/X86/X86RegisterInfo.h Tue Jul 10 12:45:53 2012
> @@ -50,6 +50,11 @@
>    ///
>    unsigned FramePtr;
>
> +  /// BasePtr - X86 physical register used as a base ptr in complex stack
> +  /// frames. I.e., when we need a 3rd base, not just SP and FP, due to
> +  /// variable size stack objects.
> +  unsigned BasePtr;
> +
>  public:
>    X86RegisterInfo(X86TargetMachine &tm, const TargetInstrInfo &tii);
>
> @@ -106,6 +111,8 @@
>    /// register scavenger to determine what registers are free.
>    BitVector getReservedRegs(const MachineFunction &MF) const;
>
> +  bool hasBasePointer(const MachineFunction &MF) const;
> +
>    bool canRealignStack(const MachineFunction &MF) const;
>
>    bool needsStackRealignment(const MachineFunction &MF) const;
> @@ -123,6 +130,7 @@
>    // Debug information queries.
>    unsigned getFrameRegister(const MachineFunction &MF) const;
>    unsigned getStackRegister() const { return StackPtr; }
> +  unsigned getBaseRegister() const { return BasePtr; }
>    // FIXME: Move to FrameInfok
>    unsigned getSlotSize() const { return SlotSize; }
>
>
> Modified: llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll?rev=160002&r1=160001&r2=160002&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/alloca-align-rounding-32.ll Tue Jul 10
> 12:45:53 2012
> @@ -15,5 +15,6 @@
>    call void @bar(<2 x i64>* %p)
>    ret void
>  ; CHECK: foo2
> +; CHECK: andl $-32, %esp
>  ; CHECK: andl $-32, %eax
>  }
>
> Modified: llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll?rev=160002&r1=160001&r2=160002&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/alloca-align-rounding.ll Tue Jul 10
> 12:45:53 2012
> @@ -15,5 +15,6 @@
>    call void @bar(<2 x i64>* %p)
>    ret void
>  ; CHECK: foo2
> +; CHECK: andq $-32, %rsp
>  ; CHECK: andq $-32, %rax
>  }
>
> Modified: llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll?rev=160002&r1=160001&r2=160002&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll (original)
> +++ llvm/trunk/test/CodeGen/X86/force-align-stack-alloca.ll Tue Jul 10
> 12:45:53 2012
> @@ -17,10 +17,15 @@
>
>  define i64 @g(i32 %i) nounwind {
>  ; CHECK: g:
> -; CHECK:      pushl
> +; CHECK:      pushl  %ebp
>  ; CHECK-NEXT: movl   %esp, %ebp
> +; CHECK-NEXT: andl   $-32, %esp
>  ; CHECK-NEXT: pushl
> -; CHECK-NEXT: subl   $20, %esp
> +; CHECK-NEXT: pushl
> +; CHECK-NEXT: subl   $24, %esp
> +;
> +; Now setup the base pointer (%ebx).
> +; CHECK-NEXT: movl   %esp, %ebx
>  ; CHECK-NOT:         {{[^ ,]*}}, %esp
>  ;
>  ; The next adjustment of the stack is due to the alloca.
> @@ -41,12 +46,18 @@
>  ; CHECK-NEXT: addl   $32, %esp
>  ; CHECK-NOT:         {{[^ ,]*}}, %esp
>  ;
> -; Finally we nede to restore %esp from %ebp, the alloca prevents us from
> -; restoring it directly.
> +; Restore %esp from %ebx (base pointer) so we can pop the callee-saved
> +; registers.  This is the state prior to the allocation of VLAs.
>  ; CHECK-NOT:  popl
> -; CHECK:      leal   -4(%ebp), %esp
> +; CHECK:      movl   %ebx, %esp
> +; CHECK-NEXT: addl   $24, %esp
>  ; CHECK-NEXT: popl
>  ; CHECK-NEXT: popl
> +;
> +; Finally we need to restore %esp from %ebp due to dynamic stack
> +; realignment.
> +; CHECK-NEXT: movl   %ebp, %esp
> +; CHECK-NEXT: popl   %ebp
>  ; CHECK-NEXT: ret
>
>  entry:
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>



-- 
Alexey Samsonov, MSK
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120711/c000c0a2/attachment.html>


More information about the llvm-commits mailing list