[LLVMdev] One way to support unwind on x86

Tue Mar 3 05:27:04 PST 2009

Hi Duncan,

On Tue, Mar 3, 2009 at 10:26 AM, Duncan Sands <baldrick at free.fr> wrote:
> why?  The DWARF EH info encodes two things: (1) how to restore
> registers; and (2) matching rules for exception objects, and
> what to do with them.  You will need something along the lines
> of (1) if you unwind out of the middle of functions.  As for (2),
> if you don't do any matching of exceptions against types, this is
> an extremely minimal amount of info.  In any case, it is entirely
> up to the personality function what happens here - you can always
> write your own.  Check out the C personality function (yes, C not
> C++!) in gcc/unwind-c.c to get an idea of what a small personality
> function looks like.

Yes, I need (1) to restore registers. I don't see why the type
checking can't be done in the landing pad. It adds overhead, but
no more than interpreting DWARF would.

I will look at the C personality function, thank you for that.

> You can link statically with libgcc.

Yes, I know, but I think 50KB is a lot. With that linked in, it would
no longer be a microkernel I'm writing.

Also, I would lose the benefits of invoke/unwind. LLVM handles function
inlining with invoke/unwind quite nicely, and I'm not sure it can do
that to the same degree with calls into libgcc.

>>  *  Unwinding should be a read-only operation regarding the
>>     stack, so I can create a stack dump in the landing pad.
>
> You can get stack dumps with gcc dwarf eh.  The Ada front-end
> does this for example - very convenient.

Very convenient. Does libgcc provide that too? I like the features of
DWARF, just not the time and space overhead.
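
For reference, the unwinder interface declared in GCC's unwind.h includes _Unwind_Backtrace, which walks the call frames and invokes a callback for each one without unwinding anything. Below is a minimal sketch of a stack dump built on it. The names trace_state, on_frame and dump_stack are made up for this example, and the sketch assumes a toolchain whose runtime ships the unwinder (GCC or Clang on a typical Linux system); it is not taken from the Ada front-end.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <unwind.h>

/* Hypothetical bookkeeping for the walk: frames seen so far and a cap. */
typedef struct {
    int depth;
    int max_depth;
} trace_state;

/* Called by the unwinder once per frame, innermost frame first. */
static _Unwind_Reason_Code on_frame(struct _Unwind_Context *ctx, void *arg)
{
    trace_state *st = arg;
    if (st->depth >= st->max_depth)
        return _URC_END_OF_STACK;   /* any code other than NO_REASON stops the walk */
    fprintf(stderr, "#%d ip=%p\n", st->depth,
            (void *)(uintptr_t)_Unwind_GetIP(ctx));
    st->depth++;
    return _URC_NO_REASON;          /* continue with the caller's frame */
}

/* Print up to max_depth return addresses; returns the number printed. */
static int dump_stack(int max_depth)
{
    trace_state st = { 0, max_depth };
    _Unwind_Backtrace(on_frame, &st);
    return st.depth;
}
```

Calling dump_stack(64) from a landing pad (or anywhere else) prints raw instruction pointers only; turning them into function names needs a separate symbol lookup.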

> Take a look at libunwind (http://www.hpl.hp.com/research/linux/libunwind/).

I will, thank you.

> Another possibility, very close to yours and currently used by the vmkit
> project, is to modify all functions so they return two values, the usual
> return value and an additional boolean value indicating whether an exception
> was thrown during the call or not.  Callers then branch to an appropriate
> place based on this value.  Thus there is no special stack unwinding, it
> is just functions returning.  This adds some distributed overhead, but
> unwinding is fast.  You can always return something more complicated than
> a boolean of course.

Maybe that's an option. I think I already know how to do that by
writing an LLVM pass. I guess it would be translated to something like
this in machine code:

    call  some_function
    test  ebx, ebx      ; Check second return value
    jz    handle_unwind ; If necessary, handle unwinding

That is a fairly small overhead.
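
At the source level, the "extra return value" scheme might look like the following C sketch. It is only an illustration under my own assumptions, not vmkit's actual code: result_t, checked_div, average_ratio and average_ratio_or are hypothetical names, and the flag is taken to be true while an exception is propagating.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result type: the usual return value plus an unwind flag. */
typedef struct {
    int  value;      /* ordinary return value (meaningless while unwinding) */
    bool unwinding;  /* true if an exception is propagating */
} result_t;

/* Hypothetical callee that "throws" on division by zero. */
static result_t checked_div(int a, int b)
{
    if (b == 0)
        return (result_t){ .value = 0, .unwinding = true };
    return (result_t){ .value = a / b, .unwinding = false };
}

/* A caller without a handler: it propagates the exception by returning. */
static result_t average_ratio(int a, int b, int c)
{
    result_t r1 = checked_div(a, c);
    if (r1.unwinding)
        return r1;              /* the test+branch after each call */
    result_t r2 = checked_div(b, c);
    if (r2.unwinding)
        return r2;
    return (result_t){ .value = (r1.value + r2.value) / 2, .unwinding = false };
}

/* A caller with a handler: the branch target plays the landing pad's role. */
static int average_ratio_or(int a, int b, int c, int fallback)
{
    result_t r = average_ratio(a, b, c);
    if (r.unwinding)
        return fallback;        /* "landing pad": recover and continue */
    return r.value;
}
```

Every frame is torn down by an ordinary return, so callee-saved registers are restored by the normal epilogues and no unwind tables are consulted.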

>> My idea on handling the DWARF EH actions is to compile them to machine
>> instructions. For example, given an instruction pointer, unwinding a
>> call frame might be described as [...]
>
> This is a lot like what you get using the "function returns an extra
> boolean" method.

Yes, you're right and it's conceptually easier to handle.

>> Other call frames might be more complex to handle. It depends on the
>> moves needed to restore the registers of the previous call frame (the
>> caller) and to remove the current frame.
>
> If you really plan to unwind out of the middle of functions you will
> have to do magic to restore registers.  Do you plan to use the dwarf
> frame moves info for this?

Yes. The information is available. It is stored as a program for the
DWARF virtual machine (CFA instructions). I "just" need to translate
this to machine instructions. I thought that it would be somewhat easy
to hook into LLVM to get the DWARF instructions it would write and then
emit machine instructions instead.
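
To make the CFA idea concrete, here is a rough C sketch that interprets a handful of DWARF call frame instructions into a rule table: where the CFA is, and where each saved register lives relative to it. A translator to machine code would emit the corresponding moves instead of filling in the table. Only DW_CFA_nop, DW_CFA_advance_loc, DW_CFA_offset, DW_CFA_def_cfa, DW_CFA_def_cfa_register and DW_CFA_def_cfa_offset are handled. The example byte sequence, the alignment factors (1 and -4), the i386 register numbering (esp = 4, ebp = 5, return address = 8) and the names frame_rules, example_cfa, read_uleb128 and eval_cfa are assumptions for illustration; this is not how LLVM produces or consumes the information, and the input is trusted to be well formed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { NUM_REGS = 9 };  /* assumed i386 numbering: 0..7 GPRs, 8 = return address */

/* Unwind rules that apply at one code location. */
typedef struct {
    unsigned cfa_reg;               /* CFA = value of cfa_reg + cfa_offset */
    long     cfa_offset;
    int      saved[NUM_REGS];       /* nonzero: the register was saved ... */
    long     save_offset[NUM_REGS]; /* ... at address CFA + save_offset */
} frame_rules;

/* Hypothetical CFA program for the prologue "push ebp; mov ebp, esp". */
static const uint8_t example_cfa[] = {
    0x0c, 0x04, 0x04,  /* DW_CFA_def_cfa: esp + 4 */
    0x88, 0x01,        /* DW_CFA_offset: r8 (return address) at CFA - 4 */
    0x41,              /* DW_CFA_advance_loc: 1 (after push ebp) */
    0x0e, 0x08,        /* DW_CFA_def_cfa_offset: 8 */
    0x85, 0x02,        /* DW_CFA_offset: r5 (ebp) at CFA - 8 */
    0x42,              /* DW_CFA_advance_loc: 2 (after mov ebp, esp) */
    0x0d, 0x05         /* DW_CFA_def_cfa_register: ebp */
};

/* Decode one unsigned LEB128 value and advance *p past it. */
static unsigned long read_uleb128(const uint8_t **p)
{
    unsigned long result = 0;
    unsigned shift = 0;
    uint8_t byte;
    do {
        byte = *(*p)++;
        result |= (unsigned long)(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    return result;
}

/* Compute the rules in effect at code offset `target` within the function.
   Returns 0 on success, -1 on an instruction this sketch does not handle. */
static int eval_cfa(const uint8_t *p, const uint8_t *end, unsigned long target,
                    long code_align, long data_align, frame_rules *out)
{
    unsigned long loc = 0;
    memset(out, 0, sizeof *out);
    while (p < end) {
        uint8_t op = *p++;
        uint8_t high = op & 0xc0, low = op & 0x3f;
        if (high == 0x40) {                 /* DW_CFA_advance_loc */
            loc += low * code_align;
            if (loc > target)
                break;                      /* later rows do not apply yet */
        } else if (high == 0x80) {          /* DW_CFA_offset */
            long off = (long)read_uleb128(&p) * data_align;
            if (low >= NUM_REGS)
                return -1;
            out->saved[low] = 1;
            out->save_offset[low] = off;
        } else if (op == 0x00) {            /* DW_CFA_nop */
        } else if (op == 0x0c) {            /* DW_CFA_def_cfa */
            out->cfa_reg = (unsigned)read_uleb128(&p);
            out->cfa_offset = (long)read_uleb128(&p);
        } else if (op == 0x0d) {            /* DW_CFA_def_cfa_register */
            out->cfa_reg = (unsigned)read_uleb128(&p);
        } else if (op == 0x0e) {            /* DW_CFA_def_cfa_offset */
            out->cfa_offset = (long)read_uleb128(&p);
        } else {
            return -1;
        }
    }
    return 0;
}
```

With the rules for the current instruction pointer in hand, restoring the caller's frame amounts to loading each saved register from CFA + save_offset and setting the stack pointer to the CFA, which is the sequence of moves a compiled unwinder would contain.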

> If you use the "functions return an
> extra boolean" method then you don't have to do anything, since
> functions return and have registers restored in the usual way.

Yes, I see.

>> I want to tag all calls and invokes in a manner that can be easily
>> recognized by a runtime. I can tolerate a small overhead on calls. The
>> idea is to do something like this [...]
>
> This again looks a lot like the "have functions return an extra boolean"
> method.

Except there's less overhead in a single jmp instruction than in a
test+jz pair. Maybe it's worth it. It is definitely easier to
implement than my first idea.

> The vmkit experience is that unwinding using their method is a lot faster
> than using the dwarf unwinder.  I don't know if the distributed overhead
> is noticeable or not.

It can be measured. Maybe I should look at the vmkit code as well.

Bjarke Walling