[LLVMdev] Generating a backtrace

Tue Feb 23 20:44:47 PST 2010

Well, since there was no response, I guess I'll just have to reply to myself
:)

The whole "walking the stack" thing reminds me of something else I would
like to see in LLVM. In the document on garbage collection, it says:

This so-called "shadow stack" mirrors the machine stack. Maintaining this
data structure is slower than using a stack map compiled into the executable
as constant data, but has a significant portability advantage because it
requires no special support from the target code generator, and does not
require tricky platform-specific code to crawl the machine stack.

The problem is that writing that "tricky platform-specific code" is required
if you want to have multiple threads - the shadow stack modifies a global
variable on every call, and that won't work in a threaded environment.
Unfortunately, writing that code is also completely beyond the abilities of
the typical LLVM user. Well, it's certainly beyond me anyway.
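
To make the objection concrete, here is a rough C sketch of what a shadow
stack amounts to. Every name in it (shadow_frame, shadow_top, shadow_push and
so on) is made up for illustration and is not taken from LLVM's runtime; the
point is only that each call writes to a single process-wide chain head.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame record; the names are illustrative, not LLVM's. */
struct shadow_frame {
    struct shadow_frame *next;  /* record of the calling function */
    void **roots;               /* GC roots held by this frame */
    size_t num_roots;
};

/* A single chain head for the whole process. Every thread would write to
 * this same variable, which is why the scheme breaks down once a second
 * thread starts making calls. */
struct shadow_frame *shadow_top = NULL;

/* Function prologue: link this frame's record in front of the caller's. */
void shadow_push(struct shadow_frame *f, void **roots, size_t n) {
    f->roots = roots;
    f->num_roots = n;
    f->next = shadow_top;
    shadow_top = f;
}

/* Function epilogue: unlink the record again. */
void shadow_pop(struct shadow_frame *f) {
    assert(shadow_top == f);    /* frames must be popped in LIFO order */
    shadow_top = f->next;
}

/* What a collector would do: walk the chain and visit every root.
 * Here the roots are only counted. */
size_t shadow_count_roots(void) {
    size_t total = 0;
    for (const struct shadow_frame *f = shadow_top; f != NULL; f = f->next)
        total += f->num_roots;
    return total;
}

/* Two nested calls, each registering its own roots. */
size_t leaf(void) {
    void *roots[1] = { NULL };
    struct shadow_frame frame;
    shadow_push(&frame, roots, 1);
    size_t seen = shadow_count_roots();
    shadow_pop(&frame);
    return seen;
}

size_t outer(void) {
    void *roots[2] = { NULL, NULL };
    struct shadow_frame frame;
    shadow_push(&frame, roots, 2);
    size_t seen = leaf();
    shadow_pop(&frame);
    return seen;
}
```

Calling outer() walks both records while leaf() is active, and the chain is
empty again once both calls have returned. The two stores to shadow_top per
call are the per-frame cost, and the shared variable is the threading hazard.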

Similarly, the assembly language reference manual has this to say about the
llvm.returnaddress and llvm.frameaddress intrinsics:

The value returned by this intrinsic is likely to be incorrect or 0 for
arguments other than zero, so it should only be used for debugging purposes.
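
For anyone who hasn't used them: GCC and Clang expose the same facility in C
as __builtin_return_address and __builtin_frame_address, and as far as I know
Clang lowers these to the two intrinsics above. The sketch below sticks to
level 0, which is the only level the documentation treats as dependable. The
function names are my own, and the noinline attribute is a GCC/Clang
extension.

```c
#include <assert.h>
#include <stddef.h>

/* noinline keeps a real call frame around for the builtin to inspect. */
__attribute__((noinline))
void *current_return_address(void) {
    /* Level 0: the address this function will return to. */
    return __builtin_return_address(0);
}

__attribute__((noinline))
void *current_frame_address(void) {
    /* Level 0: the frame address of this function itself. */
    return __builtin_frame_address(0);
}

/* Level 0 is expected to yield usable values. Levels above 0 would mean
 * walking up through the callers' frames, which is exactly the part that
 * is documented as unreliable, so the sketch does not attempt it. */
void check_level_zero(void) {
    assert(current_return_address() != NULL);
    assert(current_frame_address() != NULL);
}
```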

Seems to me that these two problems are really the same - the general
inability to introspect the stack in a reliable way. I realize that this is
a hard problem in the general case, but I'd be willing to turn on some
compiler option that generated (very slightly) less efficient code, if it
would give me the ability to crawl the stack deterministically, and allow a
stack map entry to be identified for every call frame without the cost of
having to modify a global linked list on every call.

This has always seemed to me like one area where the LLVM machine
abstraction is sadly incomplete. Most of the time I can blissfully generate
IR without having to think about all the gritty platform details. But when
it comes to looking at the stack, all of a sudden I'm forced to deal with
all of the chaos of the underlying architectures that, up to that point,
LLVM had so gracefully covered up for me.

On Fri, Feb 19, 2010 at 7:29 PM, Talin <viridia at gmail.com> wrote:

> After working with LLVM for several years now, one problem that remains
> unsolved is how to generate a stack backtrace for exceptions.
>
> My basic approach is simple: I have two different exception personality
> functions, a "lightweight" one that just does the bare minimum needed to
> handle the exception, and a "capturing" one that (ideally) records the call
> frame information as it unwinds the stack. My motivation for doing it this
> way is that it would be too expensive to always capture call frame
> information on every exception, so instead my compiler only uses the heavier
> personality function when the exception backtrace information is actually
> going to be used.
>
> Within the personality function, there's a call to _Unwind_Backtrace(),
> which walks through the list of call frames and calls a callback for each
> one. Within the callback, I can get the value of the return address for each
> call frame using _Unwind_GetIP(). So far so good.
>
> The problem is converting those addresses into meaningful symbols. For some
> reason that I don't understand, dladdr() doesn't seem to work on
> LLVM-generated functions, even though I know those functions have full DWARF
> debugging information. If I insert a printf into my backtrace code and
> print out each return address, I see something like this:
>
> 0x406f25
> 0x407158
> Function _Unwind_RaiseException
> 0x401359
> 0x401ead
> 0x406155
> 0x4060a6
> 0x4020c6
> Function __libc_start_main
> 0x400e09
>
> The hex values are ones where dladdr() failed to provide a function name.
> As you can see, the only functions it was able to resolve are the libc
> startup function, and _Unwind_RaiseException itself. Yet I know these
> functions have symbolic names, since I can step through them in gdb, set
> breakpoints, and so on.
>
> I've tried a number of other approaches: Calling dlopen(NULL) and then
> using dlsym() to try and locate __data_start so that I can then attempt to
> manually parse the DWARF debug frames to translate the return addresses into
> function names. Unfortunately, I can't seem to locate __data_start at all.
> I've also tried calling the libc backtrace() function, but it produced
> similarly useless results.
>
> The really icky part about all of this is that even if I do come up with a
> solution for these problems, I will then have to re-solve the same problems
> for each different platform that LLVM supports. I kinda wish that there was
> some LLVM intrinsic or library function that would hide all these details
> from me :)
>
> --
> -- Talin
>
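
For reference, the capture step described in the quoted message can be
sketched in C roughly as follows. This is a minimal sketch, assuming a
GCC-style <unwind.h>; trace_buf, record_frame, capture_trace and print_trace
are hypothetical names, and the fixed capacity of 64 frames is arbitrary.
Symbol lookup is deliberately left out, since that is the part that does not
work.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unwind.h>

#define TRACE_MAX 64  /* arbitrary capacity chosen for this sketch */

struct trace_buf {
    uintptr_t pcs[TRACE_MAX];  /* one return address per call frame */
    int count;
};

/* Callback invoked by _Unwind_Backtrace() once per call frame. */
static _Unwind_Reason_Code record_frame(struct _Unwind_Context *ctx,
                                        void *arg) {
    struct trace_buf *buf = arg;
    if (buf->count >= TRACE_MAX)
        return _URC_END_OF_STACK;  /* any code but NO_REASON stops the walk */
    buf->pcs[buf->count++] = (uintptr_t)_Unwind_GetIP(ctx);
    return _URC_NO_REASON;         /* continue with the next frame */
}

/* Walk the current stack and return the number of frames recorded. */
int capture_trace(struct trace_buf *buf) {
    assert(buf != NULL);
    buf->count = 0;
    _Unwind_Backtrace(record_frame, buf);
    return buf->count;
}

/* Print the raw addresses, one per line, as in the listing above. */
void print_trace(const struct trace_buf *buf, FILE *out) {
    for (int i = 0; i < buf->count; i++)
        fprintf(out, "0x%lx\n", (unsigned long)buf->pcs[i]);
}
```

If a dladdr() lookup is added on top of this, one thing that may be worth
checking is how the executable was linked: dladdr() consults only the dynamic
symbol table, so functions in the main program are typically invisible to it
unless they are exported, for example by linking with -rdynamic.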

-- 
-- Talin