[LLVMdev] Darwin vs exceptions

Wed Dec 12 11:01:26 PST 2007

Hi Dale,

> No, I don't want to change the semantics of invoke, at least I don't
> think so.
> When inlining, I want the inlined throw to reach cleanup code as it
> does.
> But I want the Unwind_Resume call that ends the cleanup code to be
> replaced with a control transfer to the handler (or cleanup) in the
> calling function, i.e. the inliner needs to know the semantics of
> Unwind_Resume.

It seems to me that this is extremely tricky to do in general, though it
is simpler if you assume the IR was produced by gcc.  Consider this example:

class A {}; class B {};
int i;
extern void f();
void g() { try { f(); } catch(A) { i = 1; } }
void h() { try { g(); } catch(B) { i = 2; } }

Without catch-alls this compiles to something like:

define void @_Z1gv() {
entry:
	invoke void @_Z1fv( )
			to label %UnifiedReturnBlock unwind label %lpad
...
lpad:		; preds = %entry
	%eh_ptr = tail call i8* @llvm.eh.exception( )
	%eh_select = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i32 (...)* @__gxx_personality_v0, i8* A )
	%eh_typeid = tail call i32 @llvm.eh.typeid.for.i32( i8* A )
	%tmp15 = icmp eq i32 %eh_select, %eh_typeid
	br i1 %tmp15, label %bb, label %Unwind
...
Unwind:		; preds = %lpad
	tail call i32 (...)* @_Unwind_Resume( i8* %eh_ptr )
	unreachable
...
}

define void @_Z1hv() {
entry:
	invoke void @_Z1gv( )
			to label %UnifiedReturnBlock unwind label %lpad
...
lpad:		; preds = %entry
	%eh_ptr = tail call i8* @llvm.eh.exception( )
	%eh_select = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i32 (...)* @__gxx_personality_v0, i8* B )
	%eh_typeid = tail call i32 @llvm.eh.typeid.for.i32( i8* B )
	%tmp15 = icmp eq i32 %eh_select, %eh_typeid
	br i1 %tmp15, label %bb, label %Unwind
...
Unwind:		; preds = %lpad
	tail call i32 (...)* @_Unwind_Resume( i8* %eh_ptr )
	unreachable
...
}

Currently when you inline you get something like:

define void @_Z1hv() {
entry:
	invoke void @_Z1fv( )
			to label %UnifiedReturnBlock2 unwind label %lpad.i
...
lpad.i:		; preds = %entry
	%eh_ptr.i = tail call i8* @llvm.eh.exception( )
	%eh_select.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i32 (...)* @__gxx_personality_v0 , i8* A )
	%eh_typeid.i = tail call i32 @llvm.eh.typeid.for.i32( i8* A )
	%tmp15.i = icmp eq i32 %eh_select.i, %eh_typeid.i
	br i1 %tmp15.i, label %bb.i, label %Unwind.i
...
Unwind.i:		; preds = %lpad.i
	invoke i32 (...)* @_Unwind_Resume( i8* %eh_ptr.i )
			to label %UnifiedUnreachableBlock unwind label %lpad		; <i32>:0 [#uses=0]
...
lpad:		; preds = %Unwind.i
	%eh_ptr = tail call i8* @llvm.eh.exception( )
	%eh_select = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr, i32 (...)* @__gxx_personality_v0 , i8* B )
	%eh_typeid = tail call i32 @llvm.eh.typeid.for.i32( i8* B )
	%tmp15 = icmp eq i32 %eh_select, %eh_typeid
	br i1 %tmp15, label %bb, label %Unwind
...
Unwind:		; preds = %lpad
	tail call i32 (...)* @_Unwind_Resume( i8* %eh_ptr )		; <i32>:1 [#uses=0]
	unreachable
...
}

However, for this to work correctly the following adjustments have to be made:
(1) B has to be appended to the selector call for A:
	%eh_select.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i32 (...)* @__gxx_personality_v0 , i8* A )
->
	%eh_select.i = tail call i32 (i8*, i8*, ...)* @llvm.eh.selector.i32( i8* %eh_ptr.i, i32 (...)* @__gxx_personality_v0 , i8* A , i8* B )
Otherwise, if a B is thrown in f, the unwinder finds no handler for it at the inlined call site and the program will terminate.  The main difficulty here is finding the selector call for the A landing pad.

(2) The _Unwind_Resume call needs to be turned into a jump to the handler code for the B case (this is halfway through lpad), something like:

Unwind.i:		; preds = %lpad.i
	; was an invoke of @_Unwind_Resume here
	; was a call to @llvm.eh.exception here
	; was a call to @llvm.eh.selector.i32 here
	%eh_typeid = tail call i32 @llvm.eh.typeid.for.i32( i8* B )
	%tmp15 = icmp eq i32 %eh_select.i, %eh_typeid
	br i1 %tmp15, label %bb, label %Unwind

This can all go wrong in several ways:
(a) if the A landing pad already (for some reason) tested for B then step (1) will
cause strangeness.  However I think we can say that the code was relying on undefined
behaviour and not worry about this.
(b) there needs to be some analysis to find @_Unwind_Resume calls reachable from the
A landing pad.  They may also be reachable from other landing pads, so code duplication
may be required.  This could get complicated.
(c) the selector call for the _Unwind_Resume invoke needs to be determined and the result
of the (modified) A selector call needs to be used instead.  Since the selector may be
shared by several landing pads this could get tricky too.

To my mind a perfect solution would be:
(i) Find a trick (like my catch-all trick) so that an invoke always branches to its unwind
label when an exception unwinds through it.  In other words, preserve the traditional
semantics of invoke.  This makes life much simpler at the level of the IR optimizers.
(ii) Add a pass that knows about _Unwind_Resume and does the kind of transform described
above when it isn't too hard.  If it is too hard then it can give up, which is safe because of (i).
(iii) At codegen time, completely abandon invoke semantics (invoke doesn't exist there
anyway) and exploit the way the unwinder works as much as possible.

I've applied the Darwin CFA unwinder change and will see if I can find a way of achieving (i).

Ciao,

Duncan.