[cfe-dev] A new builtin: __builtin_stack_pointer()

Mon Nov 11 01:46:37 PST 2013

On 10 Nov 2013 at 19:09, Richard Smith wrote:

> On Sun, Nov 10, 2013 at 1:46 PM, PaX Team <pageexec at freemail.hu> wrote:
> 
> > now with that background let me try to answer your questions:
> >
> > - __builtin_frame_address is indeed good enough for this purpose (and i
> >   can't find more use of the stack register in C, but maybe Behan knows
> >   of more where an exact value is important)
> >
> 
> That seems like a good answer, if it works. It seems like we could choose
> to copy everything on the current stack frame into some global storage and
> back around any call to __builtin_stack_address, and thus one possible
> correct implementation would be to always return the frame pointer.

yes, the frame pointer works but please make sure that you can compute it
even with -fomit-frame-pointer (i.e., when there's no explicit hardware
register assigned for this purpose) as the i386 kernel is often compiled
with it. that is why this manual stack walking code exists in the first
place; with a frame pointer the stack walker simply 'knows' to follow ebp,
the register holding it.
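
for illustration, a minimal sketch of asking the compiler for the current
frame address from C. __builtin_frame_address is the existing GCC/clang
builtin mentioned above; the helper name current_frame is made up for this
example. how the value gets produced when frame pointers are omitted is up
to the compiler, which is exactly the concern here.

```c
#include <stddef.h>

/* Hypothetical helper (name invented for this sketch): returns the frame
 * address of this function. Level 0 means "the current function".
 * noinline keeps the frame from being merged into the caller's. */
__attribute__((noinline)) static void *current_frame(void)
{
    return __builtin_frame_address(0);
}
```

a stack walker that only needs an approximate starting point could pass
the returned value on as its initial position.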

> Thanks. Seems like the kernel can rely on being able to read through
> pointers that used to point to the stack because it knows that the readable
> portion of the stack never shrinks, right? (This could go wrong for
> programs using segmented stacks.)

exactly. in fact, the 'used to point to' part is technically not true
because for this i386 backtrace code the kernel knows that the 'other'
kernel stack is still valid and accessible in memory. this is because
at the time the backtrace code is called, the kernel knows that the
following events occurred:

1. cpu entered the kernel (syscall, exception, etc.) and is running on
   the kernel stack assigned to the current process. on i386 this stack
   has a fixed size of 8kB (2 pages) and is also aligned to 8kB; its
   lifetime is that of the corresponding userland process.

2. an IRQ occurred and the kernel switched to an interrupt stack. at this
   point the calling context on the process' kernel stack is all valid
   and this is the time when the kernel needs 'something' to be able to
   find it later. the closer this something is to the last activated frame
   on the process' kernel stack, the better (more faithful) the backtrace
   will be later.

3. now that the cpu is on the interrupt stack, an unexpected event occurs,
   say an unserviceable page fault, or some debug facility detects something
   (lockdep violation, memory leak, whatnot). this is the time when the
   kernel's diagnostic logic wants to print a backtrace, which at this point
   includes the stack frames on both the interrupt stack *and* the process'
   kernel stack since they're all live. finding the stack frames on the
   interrupt stack is 'easy' but finding the other stack requires explicit
   management (the topic of this discussion).
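
the size and alignment from step 1 are what make a saved stack address
useful later: any address inside the process' kernel stack identifies the
whole stack by masking. a rough sketch under that assumption (8kB size,
8kB alignment); the macro and helper names are invented for this example
and are not the kernel's own:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption taken from step 1: 8kB stacks, aligned to 8kB. */
#define KSTACK_SIZE ((uintptr_t)8192)

/* Hypothetical helper: lowest address of the stack that contains sp. */
static uintptr_t kstack_base(uintptr_t sp)
{
    return sp & ~(KSTACK_SIZE - 1);
}

/* Hypothetical helper: true if addr lies on the same 8kB stack as sp,
 * i.e. a backtrace may keep following frames at addr. */
static bool on_same_kstack(uintptr_t sp, uintptr_t addr)
{
    return kstack_base(sp) == kstack_base(addr);
}
```

the value saved in step 2 does not have to be exact for this check to
work, but the closer it is to the last activated frame, the fewer frames
the backtrace in step 3 misses.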

now all this is very linux (and kernel context) specific so i don't know if
you really need to worry about other use cases. of course there's the more
generic gcc feature of being able to assign other registers to variables,
not just the stack pointer. i don't know if clang/llvm wants to go there
as it then brings up the whole -ffixed-REG/-fcall-saved-REG/-fcall-used-REG
business as well ;).
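
for reference, a small sketch of the gcc feature in its narrowest form,
binding a file-scope variable to the stack pointer only. this is GNU C,
not standard C; the variable and function names are made up for the
example, and the register names cover just the targets listed.

```c
/* GNU C extension: a global register variable tied to the stack pointer.
 * current_sp is a hypothetical name chosen for this sketch. */
#if defined(__x86_64__)
register unsigned long current_sp __asm__("rsp");
#elif defined(__i386__)
register unsigned long current_sp __asm__("esp");
#elif defined(__aarch64__) || defined(__arm__)
register unsigned long current_sp __asm__("sp");
#else
#error "add the stack pointer register name for this target"
#endif

/* Reads the stack pointer through the register variable. */
static unsigned long read_stack_pointer(void)
{
    return current_sp;
}
```

binding general-purpose registers the same way is where the -ffixed-REG
style options come into play, since the register allocator then has to
stay away from them.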

cheers,
  PaX Team