[LLVMdev] RFC: New Exception Handling Proposal

Talin viridia at gmail.com
Mon Nov 23 18:36:58 PST 2009


I have a couple of questions/concerns. The main one has to do with the 
opacity of exception types - in other words, will it still be true that 
the exception types are opaque identifiers, and are only interpreted by 
the personality function?

Since my object representations are not C++-like, I am not using any of 
the cxa_* C++ library functions. Instead, my code calls the _Unwind 
functions directly, and I have my own personality function (which can be 
viewed here: 
http://code.google.com/p/tart/source/browse/trunk/runtime/lib/tart_eh_personality.c)

My compiler generates the list of filter parameters as pointers to Type 
objects. In the personality function, it examines each such pointer and 
does an IsSubclass test.  The action code returned by the personality 
function is a simple integer index (0, 1, 2, ...etc) which represents 
the number of the catch block to transfer control to. This index is then 
fed into an IR switch instruction in the landing pad.
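
To make the matching step concrete, here is a rough C++ sketch of the idea. It is not the actual Tart code: `Type`, `isSubclass` and `selectCatchIndex` are made-up names, single inheritance is assumed, and a real personality function also has to read the exception table and talk to the _Unwind API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a language runtime's type descriptor.
struct Type {
    const Type* base;  // direct superclass, or nullptr for a root type
};

// True if `derived` is `ancestor` or inherits from it (single inheritance).
bool isSubclass(const Type* derived, const Type* ancestor) {
    for (const Type* t = derived; t != nullptr; t = t->base) {
        if (t == ancestor) return true;
    }
    return false;
}

// Walks the catch list in order and returns the index of the first catch
// type that accepts `thrown`, or -1 if nothing matches (keep unwinding).
int selectCatchIndex(const Type* thrown,
                     const std::vector<const Type*>& catchTypes) {
    assert(thrown != nullptr);
    for (std::size_t i = 0; i < catchTypes.size(); ++i) {
        if (isSubclass(thrown, catchTypes[i])) return static_cast<int>(i);
    }
    return -1;
}
```

The returned index plays the role of the action code that the landing pad switches on. The catch types are never interpreted by anything except this routine, which is the opacity property I am asking about.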

The concern I have is that the 'convoke' instruction takes a list of 
catch types, but it doesn't say what those catch types are.  If the code 
does not assume anything about them, then I'm guessing that I would be 
able to adapt my code to work with it. On the other hand, if those catch 
types are assumed to be C++ exceptions or RTTI identifiers, then I'm hosed.

Bill Wendling wrote:
> I've been looking into a new way to implement exception handling in  
> LLVM. The current model has many disadvantages, in my opinion. I try  
> to address them with this proposal. I also try to make exception  
> handling much more understandable to the normal human reader. :-) Any  
> new proposal will need to address all present and future languages'  
> exception handling methodologies. I believe that I created something  
> which is generic enough.
>
> Please read and let me know your opinions!
>
> N.B. I'm not wedded to the name I chose here. Nor the implementation
> of some of the intrinsics - some may be better placed as an attribute
> of the function.
>
> Also, this does *not* address the general issue of how we handle all  
> exceptional situations, i.e. floating point exceptions and the like.
>
>
>                              NEW EXCEPTION HANDLING
>                              ======================
>
> Current Exception Handling
> --------------------------
>
> Zero-cost exception handling is done by generating metadata for the
> unwinding library. That library, along with a "personality" function,
> determines where to land after an exception is thrown. Given enough
> information, the metadata can and should be generated only by the code
> generation section of the compiler.
>
> The current exception handling mechanism encodes the exception handling
> metadata by using intrinsics and the CFG itself. Some of the code it
> generates isn't executable code, but meant purely to specify information
> for generating the exception handling table. For instance, if you have
> this simple code snippet:
>
>    #include <cstdio>
>    void bar();
>    void foo() {
>      try {
>        bar();
>      } catch (int i) {
>        printf("i == %d\n", i);
>      } catch (const char *s) {
>        printf("s == %s\n", s);
>      } catch (...) {
>        printf("catch-all\n");
>      }
>    }
>
> The llvm IR for foo looks similar to this (simplified for readability):
>
>    define void @_Z3foov() {
>    entry:
>      invoke void @_Z3barv()
>        to label %return unwind label %lpad
>
>    return:
>      ret void
>
>    lpad:
>      %eh_ptr = tail call i8* @llvm.eh.exception()
>      %eh_select27 = tail call i32 (i8*, i8*, ...)*
>        @llvm.eh.selector(i8* %eh_ptr,
>                          i8* @__gxx_personality_v0,
>                          i8* @_ZTIi,
>                          i8* @_ZTIPKc,
>                          i8* null)
>      %eh_typeid = tail call i32 @llvm.eh.typeid.for( @_ZTIi )
>      %6 = icmp eq i32 %eh_select27, %eh_typeid
>      br i1 %6, label %bb1, label %ppad
>
>    ppad:
>      %eh_typeid55 = tail call i32 @llvm.eh.typeid.for( @_ZTIPKc )
>      %7 = icmp eq i32 %eh_select27, %eh_typeid55
>      %8 = tail call i8* @__cxa_begin_catch(i8* %eh_ptr) nounwind
>      br i1 %7, label %bb2, label %bb3
>
>    bb1:
>      ;; printf("i == %d\n", i)
>      ret void
>
>    bb2:
>      ;; printf("s == %s\n", s)
>      ret void
>
>    bb3:
>      ;; printf("catch-all\n")
>      ret void
>    }
>
> Note that the `llvm.eh.selector' call indicates:
>
>    1. that the basic block it's in is a landing pad,
>    2. the personality function,
>    3. the types that can be caught,
>    4. the types that can be thrown by the calling function (though not
>       with this example), and
>    5. the presence of a cleanup or catch-all block.
>
> The "post pad" (ppad) checks the type of the thrown exception to
> determine if it's caught.
>
> Notice that all of this information is separated from the place where
> it's most useful - the invoke instruction. None of the transformation
> passes know about or can reason about these intrinsics (i.e., they can't
> be optimized). In particular, there's no concept of a "landing pad" for
> the rest of the compiler, which may lead to transformations generating
> code that violates an assumed invariant. E.g., a landing pad which has a
> branch to it instead of being the target of the "unwind" branch of an
> invoke instruction. This is an assumed invariant because a landing pad
> isn't code that's generated by the user, but by the compiler to convey
> metadata information, and thus cannot be branched to through normal code
> paths.
>
> Moreover, the "filter types" - those types that the caller function can
> throw - are only encoded in the `llvm.eh.selector' intrinsic. They aren't
> part of the callee function, which makes their presence in this intrinsic
> confusing, counter-intuitive, and hard to get at.
>
> Worse yet, the personality function is only encoded on these invoke
> statements. However, it applies to the function as a whole. The
> optimizers shouldn't inline functions with a different personality into a
> function. This is especially a problem for LTO.
>
>
> Exception Handling Proposal
> ---------------------------
>
> My goals with this new exception handling proposal are:
>
>    1. create a robust exception handling model that's intuitive and
>       represented in the llvm IR,
>
>    2. be able to follow the exception handling ABI (outlined in the
>       "Itanium C++ ABI: Exception Handling" document) without being tied
>       to any specific exception format (EH tables, DWARF, etc.),
>
>    3. hold off the generation of metadata as late as possible during code
>       generation, and
>
>    4. use the `unwind' instruction to throw or rethrow instead of a call
>       to `_Unwind_Resume_or_Rethrow'.
>
> To achieve these goals, we'll need these new llvm instructions and
> intrinsics.
>
>
> New Intrinsics
> --------------
>
> llvm.eh.filter:
>
> First the intrinsics. The first one is similar to Duncan's idea of a
> filter intrinsic. In fact, it's named the same. ;-)
>
>          void llvm.eh.filter(i8*, ...)
>
> If present in the entry block, it enumerates all of the types that the
> function may throw. If `llvm.eh.filter(i8* null)', then the function
> cannot throw, but must still have an exception handling table generated
> for it. It generates no executable code.
>
> An alternative is to add this information to the function's definition:
>
>          define void @foo() filters[i8* @_ZTIi, i8* @_ZTIPKc] {
>            ;; ...
>          }
>
> or similar. This could allow optimizations based on knowing that a
> function cannot throw a particular type. However, it's not a particularly
> attractive solution.
>
> llvm.eh.personality:
>
> The next intrinsic is for the "personality function". The reason to
> separate this from the `convoke' instruction is because we want to
> prevent inlining of a function with a different personality function.
>
>           void llvm.eh.personality(i8*)
>
> This also lives in the entry block for ease of finding. As with the
> filters, it may be beneficial to add this to the function's definition.
>
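
The inlining restriction described above could be checked roughly as follows. This is only a sketch in C++ with hypothetical names (`Function`, `personalitiesCompatible`, `personalityAfterInlining`); it assumes that "different personality" means both functions declare one and the two differ, which the proposal does not spell out.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for an IR function.
struct Function {
    const void* personality;  // personality routine, or nullptr if none
};

// Assumed rule: inlining is blocked only when both functions declare a
// personality and the two are not the same routine.
bool personalitiesCompatible(const Function& caller, const Function& callee) {
    if (caller.personality == nullptr || callee.personality == nullptr) {
        return true;
    }
    return caller.personality == callee.personality;
}

// Personality the caller would carry after inlining `callee` into it.
const void* personalityAfterInlining(const Function& caller,
                                     const Function& callee) {
    assert(personalitiesCompatible(caller, callee));
    return caller.personality ? caller.personality : callee.personality;
}
```

Because the personality is recorded once per function rather than per call site, this test is a single comparison instead of a scan over every invoke.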
>
> Convoke: A New Instruction
> --------------------------
>
> Syntax:
>
> Now the instruction. I call it `convoke' (the name is subject to change).
> The general form of the instruction is:
>
>    convoke void @func()
>      to label %normal
>      with catches
>        [i8* @CatchTy1, label %catch.1],
>        [i8* @CatchTy2, label %catch.2],
>           ...
>        [i8* @CatchTyN, label %catch.n],
>        [..., label %CatchAll]
>
> If a catch-all block wasn't specified, then we generate:
>
>        [..., unwind]
>
> indicating that we should unwind out of the function if the type wasn't
> caught. An alternative syntax is:
>
>        [i8* null, unwind]
>
> It is an error to have two or more catch clauses with the same type.
>
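
As a sketch of the two rules above (unique catch types, and falling back to the catch-all or to unwind), a verifier-style check might look like the C++ below. `CatchClause`, `kUnwind`, `catchTypesAreUnique` and `targetFor` are hypothetical names, and matching by pointer identity is an assumption made for brevity; in practice the personality function decides what counts as a match.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical model of one `[type, label]` pair on a convoke.
struct CatchClause {
    const void* type;  // opaque catch type; nullptr stands for the `...` clause
    int target;        // made-up id of the destination block
};

const int kUnwind = -1;  // models `[..., unwind]`

// True if no catch type appears more than once.
bool catchTypesAreUnique(const std::vector<CatchClause>& clauses) {
    std::set<const void*> seen;
    for (const CatchClause& c : clauses) {
        if (!seen.insert(c.type).second) return false;
    }
    return true;
}

// Picks the target for `thrown`: a clause with the same type if there is
// one, else the catch-all clause, else unwind out of the function.
int targetFor(const std::vector<CatchClause>& clauses, const void* thrown) {
    assert(catchTypesAreUnique(clauses));
    int fallback = kUnwind;
    for (const CatchClause& c : clauses) {
        if (c.type == nullptr) {
            fallback = c.target;  // catch-all clause
        } else if (c.type == thrown) {
            return c.target;
        }
    }
    return fallback;
}
```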
> Exception Object:
>
> We specify the exception object before the jump to the catch blocks. For
> example:
>
>             .---------.
>             | convoke |
>             `---------'
>                  |
>                  v
>      .-----------------------.
>      |                       |
>      v                       |
>   %normal   .----------------+---------------.
>             |                |      ...      |
>             v                v               v
>        select.1 = 1   select.2 = 2    select.n = n
>             |                |               |
>             `----------------+---------------'
>                              |
>                              v
>           .----------------------------------------.
>           | %sel = phi [%select.1, ..., %select.n] |
>           |     %eh_ptr = llvm.eh.exception()      |
>           |           switch on %sel               |
>           `----------------------------------------'
>                              |
>                              v
>                   .--------------------.
>                   |          |   ...   |
>                   v          v         v
>                %catch.1   %catch.2  %catch.n
>
> Handling Cleanups:
>
> Cleanup code needs to be executed before any of the catches. We can
> accommodate this easily - thanks to Eric! Basically, the idea is to jump
> to blocks that identify which catch they mean to target. They set a value
> to be used in a switch statement generated after the cleanup, execute the
> cleanup code, and then switch on the values to the actual catch blocks.
>
>             .---------.
>             | convoke |
>             `---------'
>                  |
>                  v
>      .-----------------------.
>      |                       |
>      v                       |
>   %normal   .----------------+---------------.
>             |                |      ...      |
>             v                v               v
>        select.1 = 1   select.2 = 2    select.n = n
>             |                |               |
>             `----------------+---------------'
>                              |
>                              v
>           .----------------------------------------.
>           | %sel = phi [%select.1, ..., %select.n] |
>           |     %eh_ptr = llvm.eh.exception()      |
>           |           [cleanup code]               |
>           |           switch on %sel               |
>           `----------------------------------------'
>                              |
>                              v
>                   .--------------------.
>                   |          |   ...   |
>                   v          v         v
>                %catch.1   %catch.2  %catch.n
>
> (I love ASCII art!)
>
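
The control flow in the second diagram can be mimicked in ordinary C++ to show the ordering: the selector value is fixed first, the shared cleanup runs next, and only then does the switch pick a catch body. This is just an illustration with three catches and made-up names (`Selector`, `unwindPath`), not generated code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical selector values, one per catch clause on the convoke.
enum Selector { kCatch1 = 1, kCatch2 = 2, kCatch3 = 3 };

// Returns the sequence of blocks executed on the exceptional path.
std::vector<std::string> unwindPath(Selector sel) {
    std::vector<std::string> trace;
    // The selector was already chosen by the block we came from; the
    // cleanup code is shared and always runs before any catch body.
    trace.push_back("cleanup");
    switch (sel) {
        case kCatch1: trace.push_back("catch.1"); break;
        case kCatch2: trace.push_back("catch.2"); break;
        case kCatch3: trace.push_back("catch.3"); break;
        default: assert(false && "unknown selector"); break;
    }
    return trace;
}
```

Calling `unwindPath(kCatch2)` yields the trace `cleanup`, `catch.2`. The cleanup is emitted once, no matter how many catch clauses there are.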
>
> Throw: A New Instruction
> ------------------------
>
> The `unwind' instruction has no semantic meaning outside of an exception
> context. I propose removing it as an instruction, and replacing it with a
> new, more descriptively named instruction called `throw'. The `throw'
> instruction would take the exception object as its only parameter. Its
> semantics would be to throw that exception object. Essentially,
> rethrowing that exception.
>
> Syntax:
>
>          throw i8* %eh_ptr
>
>
> Good Things About This Method
> -----------------------------
>
> Exception handling metadata is no longer encoded in the CFG. It is much
> more sanely specified, and thus easier to understand by normal humans.
> The optimizers are free to modify the code as they see fit. In fact, they
> may be able to do a better job at it. For instance, they could perform
> optimizations mixing the cleanup and catch code. If the "filters" were
> part of the function instead of an intrinsic, there is the potential for
> optimizations based upon knowledge that a function cannot throw a
> particular type.
>
> Inlining inside of a cleanup or catch block will no longer result in
> branching to a landing pad from a non-invoke instruction, because there
> are no landing pads to mess up. The "filter" and "personality" intrinsics
> maintain information important to proper EH semantics even if the catch
> clauses are removed.
>
> There is no longer a need for an explicit call to
> `_Unwind_Resume_or_Rethrow', but we use the `unwind' instruction.
>
> And, because the exceptions are explicit, there is no need for an
> artificial catch-all to be inserted into the generated code and EH table.
>
>
> Bad Things About This Method
> ----------------------------
>
> It's not a small change. It requires new instructions, which requires
> teaching everything about them. The good news is that they will behave
> very similarly to the current `invoke' and `unwind' instructions, so we
> can build upon that work touching similar places in the code. Also,
> existing bitcode files won't benefit from the new instructions. The code
> will have to be recompiled. This shouldn't be a major problem for most
> people, because bitcode files apart from LTO files are meant mostly for
> compiler developers.
>
>
> -bw
>