[LLVMdev] RFC: Exception Handling Proposal II
Bill Wendling
wendling at apple.com
Tue Nov 23 23:31:55 PST 2010
Hi everyone!
I've been silently working on the previous exception handling proposal I published last year (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-November/027311.html). However, there were some major deficiencies in it.
After discussing this further with Jim Grosbach and John McCall, we've come up with another proposal. As you can see, it incorporates the idea of "regions" into LLVM's IR. It also leaves the door open for implementing Chris's EH changes in his LLVMNotes. In fact, this can be looked upon as a first step towards that goal.
Please enjoy and let me know if you have any comments!
-bw
Exception Handling in LLVM
Problem
=======
The current exception handling mechanisms in LLVM are inadequate for generating
proper exception handling support for C++ and Objective-C. In particular, they
only approximate the "Itanium C++ ABI: Exception Handling" documentation:
http://www.codesourcery.com/public/cxx-abi/abi-eh.html
Also, they fall short for other languages in which non-invoke instructions
could throw catchable exceptions:
http://nondot.org/~sabre/LLVMNotes/ExceptionHandlingChanges.txt
Thirdly, the exception handling metadata is "encoded" in the CFG, which makes it
hard to gather all of the information needed for EH table generation.
I believe that the problem is a conceptual one. We are trying to replicate the
concept of a "region" in LLVM's IR where no such facility currently exists. In
order to create regions, there has to be a tight coupling of information between
the site where the exception was thrown and the place that handles the
exception. For example, DWARF exception handling needs type information at both
places in order to generate the exception handling tables.
Definitions
===========
A "region" is defined as a section of code where, if an exception is thrown
anywhere within that section, control is passed to code that will handle the
exception (called a "landing pad"). The code which handles the specific
exception is unique to that region. For example, "invoke A" and "invoke B" are
within the same region (X):
                   Region X
  ..........................................
  :                                        :
  :   .----------.        .----------.     :
  :   | invoke A |        | invoke B |     :
  :   `----------'        `----------'     :
  :     /      |            /      |       :
  :  normal    |         normal    |       :
  ..........................................
               |                   |
               |                   v
               |          .------------------.
               |          | B's cleanup code |
               |          `------------------'
               |                   |
               `-------------------'
                         |
                         v
             .-----------------------.
             |   A's cleanup code    |
             | dispatch for region X |
             `-----------------------'
                         |
          .------------------------------.
          |         |          |         |
          v         v          v         v
        .----.    .----.     .----.  .--------.
        | C1 |    | C2 | ... | Cn |  | resume |
        `----'    `----'     `----'  `--------'
C1, C2, ..., Cn are the catches for the exception thrown. If none of the
catches' types match the type of the exception thrown, control passes to the
"resume".
Notice that even though invokes A and B are in the same region, they do not
necessarily jump to the same basic block when an exception occurs.
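To make the diagram concrete at the source level, here is a minimal C++ sketch of one region containing two calls whose landing pads differ. The names Guard, callA, callB, runRegion, and the events log are all made up for illustration; nothing here is LLVM API. A throw from callA only has A's cleanup to run, while a throw from callB runs B's cleanup and then A's before reaching the shared catches.

```cpp
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical log, used only to make the order of cleanups observable.
static std::vector<std::string> events;

// Hypothetical object whose destructor stands in for "cleanup code".
struct Guard {
    std::string name;
    explicit Guard(std::string n) : name(std::move(n)) {}
    ~Guard() { events.push_back("cleanup " + name); }
};

// Hypothetical callees; each throws only when asked to.
void callA(bool fail) { if (fail) throw 1; }
void callB(bool fail) { if (fail) throw std::runtime_error("B"); }

std::string runRegion(bool failA, bool failB) {
    events.clear();
    try {                    // one region: both calls share the catches below
        Guard a("A");
        callA(failA);        // "invoke A": its landing pad runs A's cleanup only
        Guard b("B");
        callB(failB);        // "invoke B": its landing pad runs B's cleanup, then A's
        return "normal";
    } catch (int) {          // C1
        return "caught int";
    } catch (const std::runtime_error&) {  // C2
        return "caught runtime_error";
    }
    // An exception of any other type matches no catch and keeps unwinding,
    // which corresponds to the "resume" edge in the diagram.
}
```

Calling runRegion(true, false) records only A's cleanup, while runRegion(false, true) records B's cleanup followed by A's, even though both end up in the same set of catches.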
Proposal
========
We want to take the concept of a region and make it explicit in the IR.
Dispatch Instruction
--------------------
We introduce a new instruction called "dispatch." This instruction holds most of
the information necessary for the exception handler to work.
Syntax:
  dispatch region(<value>) resume to label <resumedest>
    catches [
      <type> <val>, label <dest>
      ...
    ]
    catchall [ <type> <val>, label <dest> ]
    filters [
      <type> <val>, ...
    ]
Description:
* The "catchall", "catches", and "filters" clauses are optional. If none are
specified, then the landing pad is implicitly a "cleanup."
* The <resumedest> basic block is the destination to unwind to if the type
thrown isn't matched to any of the choices.
* The "catches" clause is a list of types which the region can catch and the
destinations to jump to for each type.
* The "catchall" clause is the place to jump to if the exception type doesn't
match any of the types in the "catches" clause.
* The "region" value is an integer, similar to the "addrspace" value, and is
unique to each dispatch in a function. IR objects reference this value to
indicate that they belong to the same region.
* The "filters" clause lists the types of exceptions which may be thrown by the
region.
Example:
  ;; With catch-all block
  dispatch region(37) resume to label %Resume
    catches [
      %struct.__fundamental_type_info_pseudo* @_ZTIi, label %bb1.lpad
      %struct.__pointer_type_info_pseudo* @_ZTIPKc, label %bb2.lpad
    ]
    catchall [i8* null, label %catchall.lpad]
    filters [i8* bitcast (%struct.__pointer_type_info_pseudo* @_ZTIPKc to i8*),
             i8* bitcast (%struct.__fundamental_type_info_pseudo* @_ZTIi to i8*)]
  ;; No catch-all block
  dispatch region(927) resume to label %Resume
    catches [
      %struct.__fundamental_type_info_pseudo* @_ZTIi, label %bb1.lpad
      %struct.__pointer_type_info_pseudo* @_ZTIPKc, label %bb2.lpad
    ]
  ;; Cleanup only
  dispatch region(42) resume to label %Unwind
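The way a dispatch picks its successor can be sketched as a small C++ model. This is only an illustration of the description above, not a proposed implementation: the Dispatch struct and selectDestination are hypothetical names, types and labels are plain strings, and matching is exact string equality, whereas the real decision is made by the personality function.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory model of one dispatch instruction.
struct Dispatch {
    unsigned region = 0;
    std::string resumeDest;                                    // "resume to label"
    std::vector<std::pair<std::string, std::string>> catches;  // (type, destination)
    std::optional<std::string> catchAllDest;                   // "catchall" clause
    std::vector<std::string> filters;                          // "filters" clause

    // With no optional clauses, the landing pad is implicitly a cleanup.
    bool isCleanupOnly() const {
        return catches.empty() && !catchAllDest && filters.empty();
    }
};

// Pick the label control goes to for a thrown type (simplified matching).
std::string selectDestination(const Dispatch& d, const std::string& thrownType) {
    for (const auto& entry : d.catches)
        if (entry.first == thrownType)
            return entry.second;      // a "catches" entry matched
    if (d.catchAllDest)
        return *d.catchAllDest;       // no match, but a catch-all exists
    return d.resumeDest;              // nothing matched: continue unwinding
}
```

With the region(37) example above, a thrown @_ZTIi would go to bb1.lpad and any unlisted type to catchall.lpad; for the cleanup-only region(42), every type goes to Unwind.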
Invoke Instruction
------------------
The invoke instruction would be augmented to add a "personality" field and a
"region" indicator.
Syntax:
  <result> = invoke [cconv] [ret attrs] <ptr to function ty>
               <function ptr val>(<function args>) [fn attrs]
               to label <normal label>
               unwind label <exception label>
               region(<value>)
               personality [<type> <value>]
Description:
* The personality field indicates the personality function at that invoke call.
* The region field's value must match the region value of a dispatch in the
same function.
Example:
  %retval = invoke i32 @Func(i32 927)
              to label %Normal unwind label %Landing.Pad
              region(1)
              personality [i32 (...)* @__gxx_personality_v0]
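The two consistency rules stated so far (region values are unique to each dispatch in a function, and every invoke's region must match some dispatch) lend themselves to a simple check. The following C++ sketch assumes a hypothetical FunctionEH summary that just lists the region numbers seen on dispatches and invokes; it is not based on any existing LLVM verifier code.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical summary of the EH-related instructions in one function.
struct FunctionEH {
    std::vector<unsigned> dispatchRegions;  // region(N) on each dispatch
    std::vector<unsigned> invokeRegions;    // region(N) on each invoke
};

// Returns an empty string if both rules hold, otherwise a diagnostic.
std::string verifyRegions(const FunctionEH& f) {
    std::set<unsigned> seen;
    // Rule 1: a region value is unique to each dispatch in the function.
    for (unsigned r : f.dispatchRegions)
        if (!seen.insert(r).second)
            return "duplicate dispatch for region " + std::to_string(r);
    // Rule 2: every invoke's region must match an existing dispatch.
    for (unsigned r : f.invokeRegions)
        if (seen.count(r) == 0)
            return "invoke references region " + std::to_string(r) +
                   " with no dispatch";
    return "";
}
```

Several invokes may share one region value, so only the dispatch side is checked for duplicates.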
Inlining
========
Inlining regions results in a merging of dispatch instructions. Care must be
taken when inlining into the cleanup code of a landing pad. The "resume" of the
inlinee may need to point to the "resume" of the inliner region.
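As a source-level illustration of the behavior inlining has to preserve, consider the C++ sketch below. It shows only the simple nested case, not the cleanup-code case mentioned above, and the function names inner and outer are made up. If inner is inlined into outer, an exception that inner's region does not catch must still reach outer's handlers, which is one reading of why the inlinee's "resume" has to be redirected rather than left to unwind out of the function.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical inlinee: its own region catches only int.
std::string inner(int kind) {
    try {
        if (kind == 1) throw 1;
        if (kind == 2) throw std::runtime_error("from inner");
        return "inner: normal";
    } catch (int) {
        return "inner: caught int";
    }
    // A runtime_error matches no catch here, so this region resumes unwinding.
}

// Hypothetical inliner: calls inner() from within its own region.
std::string outer(int kind) {
    try {
        return inner(kind);
    } catch (const std::runtime_error&) {
        // Whether or not inner() was inlined, its unmatched exception lands here.
        return "outer: caught runtime_error";
    }
}
```

Here outer(1) is handled entirely inside inner's region, while outer(2) passes through inner's "resume" and is caught by outer.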
Future Work
===========
This design is meant to be extensible. We don't address augmenting basic blocks
as in Chris's note. However, this design allows for that eventuality; it simply
doesn't require it. I.e., when that design is implemented, the "region" keyword
would migrate from the invoke instruction to the basic block. The dispatch
instruction would remain the same.