[LLVMdev] Alternative exception handling proposal

Thu Dec 2 15:37:07 PST 2010

On Dec 2, 2010, at 2:21 AM, Duncan Sands wrote:

> Hi Bill,
> 
>> This is similar to my first proposal.
> 
> yup, I still consider your first proposal to have been basically sound.
> 
>> But it also suffers from a major problem,
>> which stopped that proposal dead in its tracks. Namely, you have information in
>> one place which needs to be shared in two different, but possibly disjoint,
>> places: the type, filters, and personality information. In order to generate the
>> EH tables, you need to know this information at the throw site and at the place
>> which makes the decision of which catch handler to invoke. There is no guarantee
>> in your proposal that the invoke can be associated with the proper eh.selector
>> call. And because of (C++) cleanups and inlining, it's the rule not the exception.
> 
> I disagree that this information is needed anywhere except the invoke.  If it
> was needed arbitrarily far downstream then of course my proposal would be dead.
> But it isn't!  Got an example where it is?
> 
>> Example, if you have this:
>> 
>> invoke void @foo()
>> to label %invcont unwind label %lpad
>> personality @__gxx_personality_v0
>> catches %struct.__fundamental_type_info_pseudo* @_ZTIi,
>> %struct.__pointer_type_info_pseudo* @_ZTIPKc
>> 
>> lpad:
>> call void @bar(%A* %a) ; a cleanup
>> br label %ppad
>> 
>> ppad:
>> %eh_ptr = call i8* @llvm.eh.exception()
>> %eh_sel = call i32 @llvm.eh.selector()
>> ; code to clean up.
>> 
>> The call to @bar can insert an arbitrarily complex amount of code, including
>> invokes, llvm.eh.selector calls, etc. Because there is no relationship between
>> the invoke of @foo and %eh_sel in ppad, we lose that information at ppad, which
>> is where we need it.
> 
> It would of course be wrong to expect eh.exception to return the original value
> in ppad if you inlined an invoke via the call to @bar, and reached %ppad via the
> unwind branch of that invoke because a new exception was thrown.  This is not a
> problem.  Here's how gcc does it.  In fact llvm-gcc does exactly the same thing!
> In lpad gcc grabs the exception and selector using the equivalent of
> eh.exception and eh.selector and stashes the values in local variables.  It
> then uses those stashed variables everywhere, for example in ppad to do the
> comparisons with eh.typeid.for etc.  It doesn't try to get the value via
> eh.exception in ppad.  Since presumably you know this (since llvm-gcc does it)
> maybe you were talking about something else?
> 
>> The code in DwarfEHPrepare::MoveExceptionValueCalls that moves the call to
>> llvm.eh.exception into the landing pad, and which you want to do for
>> llvm.eh.selector as well, will only complicate matters. It would introduce PHI
>> nodes for llvm.eh.selector values like it currently does for llvm.eh.exception
>> values.
> 
> Sure, but that's not a problem because the catch info is only needed at invokes,
> there is no need to go searching for it downstream, and I'm not sure why you
> think there is such a need.
> 
In order to generate the EH tables correctly, we need to know what types can be thrown, where to land for a specific throw site, and how to branch to the correct catch handler. When you divorce the point where you branch to the catch handler from the point where it throws, you now have a huge gap that cannot be easily recovered from, and may be impossible to recover from.

This is the code that G++ generates from the example in my proposal:

LEHB2:
	call	__Z3foov
LEHE2:
. . .
L24:
	# basic block 10
	movq	%rax, %r12
L5:
	# basic block 11
	movl	%edx, %ebx
	leaq	-18(%rbp), %rdi
	call	__ZN1BD1Ev
	movslq	%ebx,%rdx
	jmp	L7
. . .
L7:
	# basic block 15
	movl	%edx, %ebx
	leaq	-17(%rbp), %rdi
	call	__ZN1AD1Ev
	movslq	%ebx,%rdx
	jmp	L19
. . .
L19:
	# basic block 19
	cmpq	$3, %rdx
	je	L9
	# basic block 20
	cmpq	$2, %rdx
	jne	L30
	# basic block 21
	jmp	L39
. . .
L39: # The catch handler
. . .
GCC_except_table0:
LLSDA4:
. . .
	.set L$set$6,LEHB2-LFB4
	.long L$set$6		# region 2 start
	.set L$set$7,LEHE2-LEHB2
	.long L$set$7		# length
	.set L$set$8,L24-LFB4
	.long L$set$8		# landing pad
	.byte	0x7		# uleb128 0x7; action 
. . .
	.byte	0x1	# Action record table
	.byte	0x0
	.byte	0x2
	.byte	0x7d
	.byte	0x3
	.byte	0x7d
	.byte	0x0
	.byte	0x7d
	.align 2
	.long	__ZTIi+4@GOTPCREL
	.long	__ZTIPKc+4@GOTPCREL
	.long	0

If the call to __Z3foov throws, we need to set up the tables so that it knows that it needs to call the __ZN1BD1Ev and __ZN1AD1Ev cleanups. This information requires looking at the invoke instruction – i.e., "where should I land?". It also needs to know which types it can catch in order to get the "action" variable.

So the information is needed at the invoke site.

The information is also needed at the site that makes the decision of which catch handler to execute (L19 in the above example). For one, it needs to know the action record table entries. And of course it needs to know the types that can be caught, the personality function, and information about any filters. In your model, that point in the code is completely, and potentially irreversibly, separated from the invoke instruction.

This is why I abandoned my original idea. There was no good way of modeling a relationship between the invoke and the catch handler decision site in the IR.
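
To make the link between the call-site record and the dispatch code concrete, here is a rough sketch that walks the action record table from the listing above. It assumes the usual Itanium C++ ABI LSDA layout: the call-site action value is a 1-based byte offset into the action table, and each record is an SLEB128 type filter followed by an SLEB128 displacement to the next record, measured from the displacement field itself. The helper names readSLEB128 and walkActions are made up for illustration; they are not LLVM or libgcc APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Decode one SLEB128 value starting at buf[pos]; advances pos past it.
static std::int64_t readSLEB128(const std::vector<std::uint8_t>& buf,
                                std::size_t& pos) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    std::uint8_t byte = 0;
    do {
        assert(pos < buf.size() && "ran off the end of the action table");
        byte = buf[pos++];
        result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while (byte & 0x80);
    // Sign-extend if the sign bit of the last byte is set.
    if (shift < 64 && (byte & 0x40))
        result |= ~std::uint64_t{0} << shift;
    return static_cast<std::int64_t>(result);
}

// Follow the chain of action records that a call-site entry points to and
// return the type filters in the order the personality would visit them.
// `action` is the value stored in the call-site record (0 means "no action").
std::vector<std::int64_t> walkActions(const std::vector<std::uint8_t>& table,
                                      std::uint64_t action) {
    std::vector<std::int64_t> filters;
    if (action == 0)
        return filters;
    std::size_t pos = static_cast<std::size_t>(action - 1);
    for (;;) {
        filters.push_back(readSLEB128(table, pos));
        const std::size_t dispPos = pos;  // displacement is relative to here
        const std::int64_t disp = readSLEB128(table, pos);
        if (disp == 0)
            break;                        // end of this chain
        pos = static_cast<std::size_t>(static_cast<std::int64_t>(dispPos) + disp);
    }
    return filters;
}
```

Fed the eight action-table bytes above (0x1 0x0 0x2 0x7d 0x3 0x7d 0x0 0x7d) and the call-site action 0x7, this yields the filters 0, 3, 2, 1: a cleanup entry followed by three type filters. The values 3 and 2 are the ones the code at L19 compares against %rdx, which is the coupling between the invoke's table entry and the dispatch code that I am worried about losing.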

>>> invoke void @_Z3foov()
>>> to label %"3" unwind label %lpad personality @__gxx_personality_v0
>>> catches %struct.__fundamental_type_info_pseudo* @_ZTIi,
>>> %struct.__pointer_type_info_pseudo* @_ZTIPKc, i8* null
>> 
>> The use of "i8* null" here is just as bad as it is for the current
>> llvm.eh.selector call. There's no way to determine from this list whether the
>> last value is truly the catchall value or for a catch handler.
> 
> There is a catch-all here only because there is a catch-all in the original
> code:
> 
>   } catch (...) {
>     printf("catchall\n");
>   }
> 
> In my proposal you don't need to know about catch-all, add special catch-alls
> etc.  If there was a catch-all in the original code then there is one on the
> invoke, otherwise there is not.  There is no special treatment of catch-all.
> 
You miss the point. As you well know, we need to know the specific type that is used for a catchall. E.g., i8* null in C++ and a global variable in Ada. In your code above, the i8* null is indistinguishable from the other types.

>>> "10": ; preds = %"5"
>>> %exc_ptr31 = call i8* @llvm.eh.exception()
>>> %filter32 = call i32 @llvm.eh.selector()
>>> invoke void @_ZN1CD1Ev(%struct.A* %memtmp)
>>> to label %"11" unwind label %fail personality @__gxx_personality_v0
>>> catches i32 1 ; <- this is an empty filter, i.e. one that catches everything
>>> 
>> Filter? What do you mean by this?
> 
> http://llvm.org/docs/ExceptionHandling.html#throw_filters
> 
Why is it in a "catches" clause?

>>> Will everything work?
>>> ---------------------
>>> 
>>> I am confident that it will work fine, for a very simple reason: this is exactly
>>> what gcc does! Of course it is in disguise, a wolf in sheep's clothing some
>>> might say :) In fact moving closer to gcc like this is probably the best way
>>> to be sure that exception handling works properly, since gcc is what everyone
>>> tests against whether we like it or not (for example libstdc++ exploits some
>>> details of how gcc implements exception handling that are not specified by the
>>> standard, i.e. are implementation defined, and this has caused trouble for LLVM
>>> in the past).
>> 
>> I would suspect that GCC has proper EH table generation mostly because it keeps
>> tables on the side; whereas we do not and cannot. Our current EH tables are
>> pretty poor. I would love to be able to generate tables similar to theirs.
> 
> I think you are reading more into the gcc tables than actually exists.  The
> tables hold a set of nested regions.  Each region consists of a set of basic
> blocks.  There are various types of regions, corresponding to handlers, filters,
> cleanups etc.  Given a basic block, what happens when an exception is thrown?
> You wind up through the enclosing regions, from inner-most to outer-most looking
> for what to do.  If nothing matches then the exception unwinds out of the
> function, otherwise the action specified by the region is taken.
> 
This is exactly what my newest EH model proposal is meant to do.

> Here is an equivalent way of storing regions, by attaching them to basic blocks:
> given a basic block BB, consider all regions that contain BB, and attach their
> info to BB in order of inner-most region to outer-most region.  That's what
> my "catch info" does - and I think it contains all relevant info from the gcc
> regions.  If so, we have all the same info gcc has, so if gcc can do something
> then so can we.  You might object that regions can contain multiple basic
> blocks, and by attaching info to basic blocks (currently this means to invokes)
> you no longer can tell if two basic blocks are in the same region or not.  This
> is true to some extent (you can reconstruct maximal regions by comparing catch
> info on basic blocks) but I don't think it matters for anything, in gcc it is
> just an optimization to reduce memory usage and duplicated effort.
> 
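
The region-to-basic-block mapping described above could be modelled roughly as follows. This is only a toy sketch: the Region struct, its fields, and catchInfoFor are hypothetical names chosen for illustration and do not correspond to any GCC or LLVM data structure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model of a nested EH region tree (hypothetical, for illustration only).
struct Region {
    int parent;        // index of the enclosing region, or -1 for the outermost
    std::string info;  // e.g. "catch int", "cleanup", "filter"
};

// Build the per-block "catch info": the info of every region that contains
// the block, ordered from the inner-most region to the outer-most one.
std::vector<std::string> catchInfoFor(const std::vector<Region>& regions,
                                      int innermost) {
    std::vector<std::string> out;
    for (int r = innermost; r != -1; r = regions[r].parent) {
        assert(r >= 0 && static_cast<std::size_t>(r) < regions.size());
        out.push_back(regions[r].info);
    }
    return out;
}
```

In this model, two blocks that end up with identical lists can be treated as belonging to the same maximal region, which is the reconstruction step mentioned in the quoted text.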
>>> I hate the way dwarf typeinfos, catches and filters are being baked into the
>>> IR. Maybe metadata (see above) helps with this.
>> 
>> Metadata cannot be counted on to remain.
> 
> Is that also true for global metadata used as an argument to an intrinsic?

You will need to read the documentation. But Chris never expects metadata to stick around. In fact, it's derived from Value, not Use. So how can you have a use of it in an intrinsic that the compiler would know about?

> Do you have an idea for how to keep catches etc out of the definition of the
> IR?  I'm worried that if one day we add support for, say, SEH then we will
> have to change how the IR is defined again, and that's better avoided.
> 
I haven't given it a lot of thought. I don't like encoding DWARF-specific concepts into the IR. But my proposal doesn't do that either. It involves generic concepts that could be applied to all forms of EH.

>> How will your implementation allow us to remove the Horrible Hack from
>> DwarfEHPrepare.cpp? Right now we catch and throw at almost every level that the
>> exception can propagate up. How will your proposal solve this?
> 
> The horrible hack is not needed at all, pushing extra catch-alls is not needed
> at all - it all goes away.  Why were these needed?  They were needed to handle
> the effects of inlining, in particular that right now when you inline through
> an invoke the catch info (contained in the selector) gets attached to the
> inlined _Unwind_Resume which is far away from the place you really want it: you
> want it on the inlined invoke that the _Unwind_Resume is downstream of.  But
> notice how inlining works with my scheme (described in my original proposal):
> when inlining through an invoke, the catch info for that invoke gets appended
> to everything you inline, including the invoke you inline.  Thus it occurs in
> the right place automatically!  It also gets attached to the _Unwind_Resume,
> which is also correct.  If _Unwind_Resume is replaced with unwind (rewind in
> my original proposal, since amended) then you can just replace unwind with a
> branch and everything comes out in the wash (it is not obvious that everything
> comes out in the wash, but nonetheless it does!).
> 
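
For reference, the inlining rule in the quoted scheme ("the catch info for that invoke gets appended to everything you inline") amounts to something like the sketch below. InvokeModel and inlineThrough are hypothetical names for a toy model, not part of the LLVM inliner, and the model only tracks invokes that already exist in the callee.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of an invoke carrying its own catch list (hypothetical).
struct InvokeModel {
    std::string callee;
    std::vector<std::string> catches;  // inner-most handler first
};

// Inline a callee body through `callSite`: each inlined invoke keeps its own
// catches and then gets the call site's catches appended after them.
std::vector<InvokeModel> inlineThrough(const InvokeModel& callSite,
                                       std::vector<InvokeModel> calleeBody) {
    for (InvokeModel& inv : calleeBody) {
        inv.catches.insert(inv.catches.end(),
                           callSite.catches.begin(), callSite.catches.end());
        // Sanity check: the outer catches are now part of the inlined invoke.
        assert(inv.catches.size() >= callSite.catches.size());
    }
    return calleeBody;
}
```

So an inlined invoke that had only a cleanup ends up with the cleanup followed by the outer invoke's typeinfos, in that order.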
John already mentioned problems with your inlining proposal. Here is the code that brought up the need for what you instantly labeled a "horrible hack". It must work flawlessly with your new proposal.

	http://llvm.org/viewvc/llvm-project?view=rev&revision=99670

#include <exception>

int main() {
    try {
        throw new std::exception();
    } catch (std::exception *e) {
        throw e;
    }
}

And this code needs to give a sensible backtrace (on Darwin at least). It will crash libunwind because it was doing horrible things:
#import <Foundation/Foundation.h>

int main (int argc, const char * argv[]) {
    NSAutoreleasePool * pool = [[NSAutoreleasePool alloc] init];

    @try {
        @throw [NSException exceptionWithName:@"TestException" reason:@"Test" userInfo:nil];
    }
    @catch (NSException *e) {
        @throw e;
    }

    [pool drain];
    return 0;
}

-bw
