[LLVMdev] Alternative exception handling proposal

Thu Dec 2 02:21:11 PST 2010

Hi Bill,

> This is similar to my first proposal.

Yup, I still consider your first proposal to have been basically sound.

> But it also suffers from a major problem,
> which stopped that proposal dead in its tracks. Namely, you have information in
> one place which needs to be shared in two different, but possibly disjoint,
> places: the type, filters, and personality information. In order to generate the
> EH tables, you need to know this information at the throw site and at the place
> which makes the decision of which catch handler to invoke. There is no guarantee
> in your proposal that the invoke can be associated with the proper eh.selector
> call. And because of (C++) cleanups and inlining, it's the rule not the exception.

I disagree that this information is needed anywhere except the invoke.  If it
was needed arbitrarily far downstream then of course my proposal would be dead.
But it isn't!  Got an example where it is?

> Example, if you have this:
>
> invoke void @foo()
> to label %invcont unwind label %lpad
> personality @__gxx_personality_v0
> catches %struct.__fundamental_type_info_pseudo* @_ZTIi,
> %struct.__pointer_type_info_pseudo* @_ZTIPKc
>
> lpad:
> call void @bar(%A* %a) ; a cleanup
> br label %ppad
>
> ppad:
> %eh_ptr = call i8* llvm.eh.exception()
> %eh_sel = call i32 llvm.eh.selector()
> ; code to clean up.
>
> The call to @bar can insert an arbitrarily complex amount of code, including
> invokes, llvm.eh.selector calls, etc. Because there is no relationship between
> the invoke of @foo and %eh_sel in ppad, we lose that information at ppad, which
> is where we need it.

Suppose you inlined an invoke via the call to @bar, and reached %ppad via the
unwind branch of that invoke because a new exception was thrown.  It would of
course be wrong to expect eh.exception in ppad to return the original value in
that case, but this is not a problem.  Here's how gcc does it (and llvm-gcc
does exactly the same thing): in lpad, gcc grabs the exception and selector
using the equivalent of eh.exception and eh.selector and stashes the values in
local variables.  It then uses those stashed variables everywhere, for example
in ppad to do the comparisons with eh.typeid.for etc.  It doesn't try to get
the value via eh.exception in ppad.  Presumably you know this, since llvm-gcc
does it, so maybe you were talking about something else?
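
A toy sketch of the stashing idea, in C++ rather than IR.  Everything here is
an assumption made for illustration: the single overwritable slot, and the
names UnwindSlot, deliver, cleanup_with_nested_unwind and landing_pad are
hypothetical and are not how gcc or LLVM represent this.  It only shows why
values read once on entry to lpad stay valid even if the cleanup code runs
another unwind.

```cpp
#include <cassert>

// Assumption: the in-flight exception pointer and selector live in one slot
// that the next unwind overwrites (a stand-in for eh.exception/eh.selector).
struct UnwindSlot { const void *exception; int selector; };
static UnwindSlot g_slot = {nullptr, 0};

// Stand-in for the runtime delivering an exception to a landing pad.
void deliver(const void *exc, int sel) { g_slot = {exc, sel}; }

// Stand-in for the cleanup call (@bar in the example).  After inlining it may
// contain its own invoke, whose unwind overwrites the slot.
void cleanup_with_nested_unwind(const void *inner_exc, int inner_sel) {
  deliver(inner_exc, inner_sel);
}

struct Stashed { const void *exception; int selector; };

// lpad: read the slot once on entry and keep the values in locals.
// Code further down (ppad) would use the returned copy, never the slot.
Stashed landing_pad(const void *inner_exc, int inner_sel) {
  assert(g_slot.exception != nullptr); // only entered with an exception in flight
  Stashed s = {g_slot.exception, g_slot.selector};
  cleanup_with_nested_unwind(inner_exc, inner_sel); // clobbers g_slot
  return s;
}
```

After the cleanup runs, the slot holds the inner exception's values while the
stashed copy still holds the original ones.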

> The code in DwarfEHPrepare::MoveExceptionValueCalls that moves the call to
> llvm.eh.exception into the landing pad, and which you want to do for
> llvm.eh.selector as well, will only complicate matters. It would introduce PHI
> nodes for llvm.eh.selector values like it currently does for llvm.eh.exception
> values.

Sure, but that's not a problem: the catch info is only needed at invokes, so
there is no need to go searching for it downstream.  I'm not sure why you
think there is such a need.

>> invoke void @_Z3foov()
>> to label %"3" unwind label %lpad personality @__gxx_personality_v0
>> catches %struct.__fundamental_type_info_pseudo* @_ZTIi,
>> %struct.__pointer_type_info_pseudo* @_ZTIPKc, i8* null
>
> The use of "i8* null" here is just as bad as it is for the current
> llvm.eh.selector call. There's no way to determine from this list whether the
> last value is truly the catchall value or for a catch handler.

There is a catch-all here only because there is a catch-all in the original
code:

    } catch (...) {
      printf("catchall\n");
    }

In my proposal you don't need to know about catch-all, add special catch-alls
etc.  If there was a catch-all in the original code then there is one on the
invoke, otherwise there is not.  There is no special treatment of catch-all.
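
For reference, a source-level sketch of the kind of code that would give a
catch list like the one quoted above.  Under the Itanium C++ ABI, _ZTIi is the
typeinfo for int and _ZTIPKc the typeinfo for const char*; the trailing null
entry corresponds to the catch (...).  The function names thrower and classify
are made up for this example and do not come from the original test case.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: throws a different kind of exception depending on kind.
void thrower(int kind) {
  assert(kind >= 0 && kind <= 3);   // 3 means "do not throw"
  if (kind == 0) throw 42;          // int         -> matched by catch (int)
  if (kind == 1) throw "oops";      // const char* -> matched by catch (const char *)
  if (kind == 2) throw 3.14;        // double      -> only the catch-all matches
}

// One try block with three handlers: two typed catches and a catch-all.
std::string classify(int kind) {
  try {
    thrower(kind);
  } catch (int) {
    return "int";
  } catch (const char *) {
    return "const char*";
  } catch (...) {
    return "catchall";
  }
  return "no exception";
}
```

Removing the catch (...) clause from the source would, in this scheme, simply
remove the last entry from the catch list rather than change how the other
entries are treated.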

>> "10": ; preds = %"5"
>> %exc_ptr31 = call i8* @llvm.eh.exception()
>> %filter32 = call i32 @llvm.eh.selector()
>> invoke void @_ZN1CD1Ev(%struct.A* %memtmp)
>> to label %"11" unwind label %fail personality @__gxx_personality_v0
>> catches i32 1 ; <- this is an empty filter, i.e. one that catches everything
>>
> Filter? What do you mean by this?

http://llvm.org/docs/ExceptionHandling.html#throw_filters

>> Will everything work?
>> ---------------------
>>
>> I am confident that it will work fine, for a very simple reason: this is exactly
>> what gcc does! Of course it is in disguise, a wolf in sheep's clothing some
>> might say :) In fact moving closer to gcc like this is probably the best way
>> to be sure that exception handling works properly, since gcc is what everyone
>> tests against whether we like it or not (for example libstdc++ exploits some
>> details of how gcc implements exception handling that are not specified by the
>> standard, i.e. are implementation defined, and this has caused trouble for LLVM
>> in the past).
>
> I would suspect that GCC has proper EH table generation mostly because it keeps
> tables on the side; whereas we do not and cannot. Our current EH tables are
> pretty poor. I would love to be able to generate tables similar to theirs.

I think you are reading more into the gcc tables than actually exists.  The
tables hold a set of nested regions.  Each region consists of a set of basic
blocks.  There are various types of regions, corresponding to handlers, filters,
cleanups etc.  Given a basic block, what happens when an exception is thrown?
You walk up through the enclosing regions, from inner-most to outer-most,
looking for what to do.  If nothing matches then the exception unwinds out of
the function; otherwise the action specified by the region is taken.
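
A minimal sketch of that lookup, under simplifying assumptions: the Region
struct and find_action are hypothetical and are not gcc's data structures, only
handler regions are modelled (no cleanups or filters), and types are compared
by name, with "..." standing for a catch-all.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical region record: a link to the enclosing region plus what it handles.
struct Region {
  int parent;                        // index of the enclosing region, -1 if none
  std::vector<std::string> catches;  // type names handled here; "..." = catch-all
  std::string action;                // what to do if this region matches
};

// Walk from the inner-most region enclosing the throwing block outwards.
// Returns the action of the first matching region, or "unwind" if none matches.
std::string find_action(const std::vector<Region> &regions, int innermost,
                        const std::string &thrown) {
  for (int r = innermost; r != -1; r = regions[r].parent) {
    assert(r >= 0 && r < static_cast<int>(regions.size()));
    for (const std::string &c : regions[r].catches)
      if (c == thrown || c == "...")
        return regions[r].action;
  }
  return "unwind";  // nothing matched: the exception leaves the function
}
```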

Here is an equivalent way of storing regions, by attaching them to basic blocks:
given a basic block BB, consider all regions that contain BB, and attach their
info to BB in order of inner-most region to outer-most region.  That's what
my "catch info" does - and I think it contains all relevant info from the gcc
regions.  If so, we have all the same info gcc has, so if gcc can do something
then so can we.  You might object that regions can contain multiple basic
blocks, and by attaching info to basic blocks (currently this means to invokes)
you no longer can tell if two basic blocks are in the same region or not.  This
is true to some extent (you can reconstruct maximal regions by comparing catch
info on basic blocks) but I don't think it matters for anything: in gcc it is
just an optimization to reduce memory usage and duplicated effort.
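
The flattening described above can be sketched as follows.  This is an
illustration only: Region, its info string and catch_info are hypothetical
names, and the info is reduced to a label per region.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical region record: enclosing region plus an opaque description.
struct Region {
  int parent;        // index of the enclosing region, -1 if none
  std::string info;  // e.g. "catch int", "cleanup", "filter"
};

// Catch info for a basic block: the info of every region containing it,
// ordered from inner-most to outer-most.
std::vector<std::string> catch_info(const std::vector<Region> &regions,
                                    int innermost) {
  std::vector<std::string> out;
  for (int r = innermost; r != -1; r = regions[r].parent) {
    assert(r >= 0 && r < static_cast<int>(regions.size()));
    out.push_back(regions[r].info);
  }
  return out;
}
```

Two blocks whose lists compare equal can be put back into one maximal region,
which is the reconstruction mentioned in the parenthesis above.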

>> I hate the way dwarf typeinfos, catches and filters are being baked into the
>> IR. Maybe metadata (see above) helps with this.
>
> Metadata cannot be counted on to remain.

Is that also true for global metadata used as an argument to an intrinsic?
Do you have an idea for how to keep catches etc out of the definition of the
IR?  I'm worried that if one day we add support for, say, SEH then we will
have to change how the IR is defined again, and that's better avoided.

> How will your implementation allow us to remove the Horrible Hack from
> DwarfEHPrepare.cpp? Right now we catch and throw at almost every level that the
> exception can propagate up. How will your proposal solve this?

The horrible hack is not needed at all, and neither is pushing extra catch-alls:
it all goes away.  Why were these needed?  They were needed to handle the
effects of inlining.  Right now, when you inline through an invoke, the catch
info (contained in the selector) gets attached to the inlined _Unwind_Resume,
which is far away from the place you really want it: you want it on the inlined
invoke that the _Unwind_Resume is downstream of.  But notice how inlining works
with my scheme (described in my original proposal): when inlining through an
invoke, the catch info for that invoke gets appended to everything you inline,
including the invoke you inline.  Thus it occurs in the right place
automatically!  It also gets attached to the _Unwind_Resume, which is also
correct.  If _Unwind_Resume is replaced with unwind (rewind in my original
proposal, since amended) then you can just replace unwind with a branch and
everything comes out in the wash (this is not obvious, but nonetheless it
does!).
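
A rough sketch of that inlining rule, using a deliberately simplified stand-in
for the IR.  The Inst struct and inline_through_invoke are hypothetical and are
not LLVM's inliner; catch lists are plain strings ordered inner-most first.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical instruction: a name plus its ordered catch info.
struct Inst {
  std::string name;
  std::vector<std::string> catches;  // inner-most first
};

// Inline the callee body through an invoke that carries invoke_catches:
// the invoke's catch info is appended to everything that gets inlined.
std::vector<Inst> inline_through_invoke(
    const std::vector<Inst> &callee,
    const std::vector<std::string> &invoke_catches) {
  std::vector<Inst> out = callee;
  for (Inst &i : out)
    i.catches.insert(i.catches.end(), invoke_catches.begin(),
                     invoke_catches.end());
  assert(out.size() == callee.size());  // this step only annotates, never drops
  return out;
}
```

With a callee containing an invoke that catches @_ZTIi followed by an
_Unwind_Resume, inlining through an invoke that catches @_ZTIPKc leaves the
inner invoke with both entries (its own first) and the _Unwind_Resume with the
outer entry, so the outer catch info ends up on the inlined invoke itself.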

Ciao,

Duncan.