[cfe-dev] RFC: How to represent SEH (__try / __except) in LLVM IR
    C Bergström 
    cbergstrom at pathscale.com
       
    Tue Oct 21 15:58:04 PDT 2014
    
    
  
On Thu, Oct 2, 2014 at 12:43 AM, Reid Kleckner <rnk at google.com> wrote:
> I want to add SEH support to Clang, which means we need a way to represent
> it in LLVM IR.
>
> Briefly, this is how I think we should represent it:
> 1. Use a different landingpad personality function for SEH
> (__C_specific_handler / _except_handlerN)
> 2. Use filter function pointers in place of type_info globals
> 3. Outline cleanups such as destructors and __finally on Windows, and
> provide a function pointer to the landingpad cleanup clause
>
> See the example IR at the end of this email. Read on if you want to
> understand why I think this is the right representation.
>
> ---
>
> Currently LLVM's exception representation is designed around the Itanium
> exception handling scheme documented here:
> http://mentorembedded.github.io/cxx-abi/abi-eh.html
>
> LLVM's EH representation is described here, and it maps relatively cleanly
> onto the Itanium design:
> http://llvm.org/docs/ExceptionHandling.html
>
> First, a little background about what __try is for. It's documented here:
> http://msdn.microsoft.com/en-us/library/swezty51.aspx
>
> The __try construct exists to allow the user to recover from all manner of
> faults, including access violations and integer division by zero.
> Immediately, it's clear that this is directly at odds with LLVM IR
> semantics. Regardless, I believe it's still useful to implement __try, even
> if it won't behave precisely as it does in other compilers in the presence
> of undefined behavior.
>
> ---
>
> The first challenge is that loads in C/C++ can now have exceptional
> control flow edges. This is impossible to represent in LLVM IR today
> because only invoke instructions can transfer control to a landing pad. The
> simplest way to work around this is to outline the body of the __try block
> and mark it noinline, which is what I propose to do initially.
>
> Long term, we could lower all potentially trapping operations to
> intrinsics that we 'invoke' at the IR level. See also Peter Collingbourne's
> proposal for iload and istore instructions here (
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-April/071732.html).
>
> With the outlining approach, in addition to noinline, we need to invent
> another function attribute to prevent functionattrs from inferring nounwind
> and readonly, or the optimizers will delete the invoke unwind edge or
> entire call site.
>
> ---
>
> The next challenge is actually catching the exception. The __except
> construct allows the user to evaluate a (mostly) arbitrary expression to
> decide if the exception should be caught. Code generated by MSVC catches
> these exceptions with an exception handler provided by the CRT. This
> handler is analogous to personality functions (__gxx_personality_v0) in
> LLVM and GCC, so that's what I call it.
>
> Notably, SEH and C++ exceptions use *different* personality functions in
> MSVC, and each function can only have one personality function. This is the
> underlying reason why one cannot mix C++ EH (even C++ RAII!) and __try in
> the same function with MSVC. Hypothetically, there is no reason why one
> could not have a more powerful personality function that handles both types
> of exception, but I intend to use the personality function from the CRT
> directly for the time being.
>
> On x86_64, the SEH personality function is called __C_specific_handler. On
> x86, it is __except_handler4 (or 3), but it's similar to
> __C_specific_handler and doesn't change how we should represent this in IR.
>
> The personality function interprets some side tables similar to the
> Itanium LSDA to find the filter function pointers that must be evaluated to
> decide which except handler to run at a given PC. The filter expressions
> are emitted as separate function bodies that the personality function
> calls. If a filter function returns '1', that indicates which except block
> will perform the recovery, and phase 1 of unwinding ends, similar to
> Itanium EH.
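
As a reading aid for the table walk described above, here is a minimal sketch in portable C. It is a toy model under stated assumptions, not the CRT's actual table layout or the real __C_specific_handler logic: the names scope_entry, filter_fn, find_handler and the two example filters are all hypothetical, and the filters take only the exception code rather than the real (exception pointers, frame pointer) pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical filter signature: returns 1 if the associated __except
   block should handle an exception with this code, 0 otherwise. */
typedef int (*filter_fn)(uint32_t code);

/* Hypothetical side-table entry: a half-open PC range [begin, end)
   guarded by one filter, plus an id standing in for the handler address. */
struct scope_entry {
    uintptr_t begin;
    uintptr_t end;
    filter_fn filter;
    int handler_id;
};

/* Toy phase-1 search: walk the table in order (innermost scope first) and
   return the handler id of the first entry that covers pc and whose filter
   returns 1, or -1 if the search has to continue in the caller's frame. */
static int find_handler(const struct scope_entry *table, size_t n,
                        uintptr_t pc, uint32_t code)
{
    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; ++i) {
        const struct scope_entry *e = &table[i];
        if (pc < e->begin || pc >= e->end)
            continue;               /* this scope does not cover the PC */
        if (e->filter(code) == 1)
            return e->handler_id;   /* filter accepted: phase 1 ends here */
    }
    return -1;
}

/* Two example filters keyed on well-known exception codes. */
static int filter_div_by_zero(uint32_t code)
{
    return code == UINT32_C(0xC0000094);   /* integer divide by zero */
}

static int filter_access_violation(uint32_t code)
{
    return code == UINT32_C(0xC0000005);   /* access violation */
}
```

With a two-entry table where an inner scope uses filter_access_violation and an enclosing scope uses filter_div_by_zero, a divide-by-zero raised inside the inner range is rejected by the first filter and picked up by the second.
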
>
> I propose we represent this in IR by:
> 1. changing the personality function to __C_specific_handler,
> __except_handler4, or in the future something provided by LLVM
> 2. replacing the type_info globals we currently put in landing pads with
> pointers to filter functions
>
> Then, in llvm/lib/CodeGen/AsmPrinter/EHStreamer.cpp where we currently
> emit the LSDA for Itanium exceptions, we can emit something else keyed off
> which kind of personality function we're using.
>
> ---
>
> SEH also allows implementing cleanups with __finally, but cleanups in
> general are implemented with a fundamentally different approach.
>
> During phase 2 of Itanium EH unwinding, control is propagated back up the
> stack. To run a cleanup, the stack is cleared, control enters the landing
> pad, and propagation resumes when _Unwind_Resume is called.
>
> On Windows, things work differently. Instead, during phase 2, each
> personality function is invoked a second time, wherein it must execute all
> cleanups *without clearing the stack* before returning control to the
> runtime where the runtime continues its loop over the stack frames. You can
> observe this easily by breaking inside a C++ destructor when an exception
> is thrown and taking a stack trace.
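
The two-pass behaviour described above can be modelled roughly as follows. This is a simulation in portable C under my own assumptions, not the Windows runtime: struct frame, struct unwind_log and dispatch are hypothetical names, and each "frame" is reduced to two flags. The point it tries to show is only the ordering: phase 1 finds the handling frame first, then phase 2 revisits the inner frames and runs their cleanups as ordinary calls from the runtime's loop, before any frame is discarded.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOG 16

/* Hypothetical model of one active stack frame. */
struct frame {
    int has_cleanup;   /* frame has a destructor or __finally to run */
    int catches;       /* frame's filter accepts the exception */
};

/* Records which frames had their cleanup run, in execution order. */
struct unwind_log {
    int cleaned[MAX_LOG];
    size_t count;
};

/* frames[0] is the innermost (faulting) frame. Returns the index of the
   handling frame, or -1 if no frame catches the exception. */
static int dispatch(const struct frame *frames, size_t n,
                    struct unwind_log *log)
{
    assert(frames != NULL && log != NULL);
    log->count = 0;

    /* Phase 1: search outward for a frame that will handle the exception. */
    size_t target = n;
    for (size_t i = 0; i < n; ++i) {
        if (frames[i].catches) {
            target = i;
            break;
        }
    }
    if (target == n)
        return -1;      /* nobody catches: no cleanups run in this model */

    /* Phase 2: loop over the frames below the handler again and run each
       cleanup. Nothing has been popped yet, so every frame is still live
       while these cleanups execute. */
    for (size_t i = 0; i < target; ++i) {
        if (frames[i].has_cleanup && log->count < MAX_LOG)
            log->cleaned[log->count++] = (int)i;
    }

    /* Only after the loop finishes would the stack be cut back to target. */
    return (int)target;
}
```

For five frames where frame 3 catches and frames 0 and 2 have cleanups, dispatch returns 3 and the log holds 0 then 2.
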
>
> MinGW's Win64 "SEH" personality function finesses this issue by taking
> complete control during phase 2 and following the Itanium scheme of
> successive stack unwinding. It has the drawback that it's not really ABI
> compatible with cleanups emitted by other compilers, which I think should
> be a goal for our implementation.
>
> It might be possible to do something similar to what MinGW does, and
> implement our own __gxx_personality* style personality function that
> interprets the same style of LSDA tables, but we *need* to be able to
> establish a new stack frame to run cleanups. We cannot unwind out to the
> original frame that had the landing pad.
>
> In the long term, I think we need to change our representation to
> implement this. Perhaps the cleanup clause of a landing pad could take a
> function pointer as an operand. However, in the short term, I think we can
> model this by always catching the exception and then re-raising it.
> Obviously, this isn't 100% faithful, but it can work.
>
> ---
>
> Here’s some example IR for how we might lower this C code:
>
> #define GetExceptionCode() _exception_code()
> enum { EXCEPTION_INT_DIVIDE_BY_ZERO = 0xC0000094 };
> int safe_div(int n, int d) {
>   int r;
>   __try {
>     r = n / d;
>   } __except(GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO) {
>     r = 0;
>   }
>   return r;
> }
>
> define internal void @try_body(i32* %r, i32* %n, i32* %d) {
> entry:
>   %0 = load i32* %n, align 4
>   %1 = load i32* %d, align 4
>   %div = sdiv i32 %0, %1
>   store i32 %div, i32* %r, align 4
>   ret void
> }
>
> define i32 @safe_div(i32 %n, i32 %d) {
> entry:
>   %d.addr = alloca i32, align 4
>   %n.addr = alloca i32, align 4
>   %r = alloca i32, align 4
>   store i32 %d, i32* %d.addr, align 4
>   store i32 %n, i32* %n.addr, align 4
>   invoke void @try_body(i32* %r, i32* %n.addr, i32* %d.addr)
>           to label %__try.cont unwind label %lpad
>
> lpad:                                             ; preds = %entry
>   %0 = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)* @__C_specific_handler to i8*)
>           catch i8* bitcast (i32 (i8*, i8*)* @"\01?filt$0@0@safe_div@@" to i8*)
>   store i32 0, i32* %r, align 4
>   br label %__try.cont
>
> __try.cont:                                       ; preds = %lpad, %entry
>   %1 = load i32* %r, align 4
>   ret i32 %1
> }
>
> define internal i32 @"\01?filt$0@0@safe_div@@"(i8* %exception_pointers, i8* %frame_pointer) {
> entry:
>   %0 = bitcast i8* %exception_pointers to i32**
>   %1 = load i32** %0, align 8
>   %2 = load i32* %1, align 4
>   %cmp = icmp eq i32 %2, -1073741676
>   %conv = zext i1 %cmp to i32
>   ret i32 %conv
> }
>
> declare i32 @__C_specific_handler(...)
>
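
One detail in the filter body that may not be obvious at a glance: the C source compares against 0xC0000094, while the IR compares against -1073741676. These are the same 32-bit pattern, since IR prints i32 constants as signed. A small portable C check of that arithmetic, with hypothetical helper names as_signed_i32 and filter_matches (the latter takes the code directly, as a stand-in for GetExceptionCode(), which needs MSVC-style SEH):

```c
#include <stdint.h>

/* Reinterpret a 32-bit exception code as the signed value an i32 literal
   would print as. Uses 64-bit arithmetic so it does not depend on
   implementation-defined unsigned-to-signed conversion. */
static int64_t as_signed_i32(uint32_t code)
{
    int64_t v = (int64_t)code;
    return v >= INT64_C(0x80000000) ? v - INT64_C(0x100000000) : v;
}

/* Portable stand-in for the __except filter expression in the example:
   1 if the code is EXCEPTION_INT_DIVIDE_BY_ZERO, 0 otherwise. */
static int filter_matches(uint32_t code)
{
    return code == UINT32_C(0xC0000094);
}
```

The subtraction is 3221225620 - 4294967296 = -1073741676, which matches the icmp operand in the filter function.
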
I can't comment (yet) on the outlined proposal above, but I'll see if I can
get someone to help review. Clang + SEH recently came up in a discussion
and it would be great if others cared too.