[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR

Mon Nov 10 14:07:37 PST 2014

Moving this month-old RFC to llvmdev. Not sure why I sent this to cfe-dev
in the first place...

---

Based on the code review discussion, John thinks filter expressions should
be emitted into the body of the function containing the __try, rather than
being outlined by the frontend.

Instead of having the frontend create filter functions, we would use labels
in place of typeinfo. The IR would look like "landingpad ... catch label
%filter0 ..." instead of "landingpad ... catch ... @filter_func0 ...".
There would be a backend pass similar to SjLjEHPrepare that would outline
the filter function and cleanup actions. Once we do the outlining, there is
no turning back, because the outlined function has to know something about
the stack layout of the parent function. If the parent function is inlined,
we would have to duplicate the filter function along with it.
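
To make the stack-layout dependence concrete, here is a minimal C sketch of
what an outlined filter might look like. Everything in it is an assumption
for illustration: the struct parent_frame, the function outlined_filter, and
the filter expression itself (which, unlike the safe_div filter, reads the
parent's local d).

```c
#include <stdint.h>

#define EXCEPTION_INT_DIVIDE_BY_ZERO 0xC0000094u

/* Hypothetical layout of the parent's locals. A real outlining pass would
   use whatever offsets the backend assigned; this struct stands in for them. */
struct parent_frame {
    int32_t d;
    int32_t n;
    int32_t r;
};

/* Hypothetical outlined filter for
       __except(GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO && d == 0)
   It reaches the parent's local 'd' through the frame pointer, so it is only
   valid for a parent with exactly this layout. */
static int outlined_filter(uint32_t exception_code, void *frame_pointer)
{
    const struct parent_frame *frame = frame_pointer;
    return exception_code == EXCEPTION_INT_DIVIDE_BY_ZERO && frame->d == 0;
}
```

If the parent were inlined into a caller with a different frame, the offsets
baked into such a function would no longer apply, which is why the filter
would have to be duplicated along with it.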

Given that we want this kind of outlining to handle cleanups, it shouldn't
be difficult to use the same machinery for filter expressions.

The IR sample for safe_div at the end of my RFC would look like this
instead:

define i32 @safe_div(i32 %n, i32 %d) {
entry:
  %d.addr = alloca i32, align 4
  %n.addr = alloca i32, align 4
  %r = alloca i32, align 4
  store i32 %d, i32* %d.addr, align 4
  store i32 %n, i32* %n.addr, align 4
  invoke void @try_body(i32* %r, i32* %n.addr, i32* %d.addr)
          to label %__try.cont unwind label %lpad

filter:
  %eh_code = call i32 @llvm.eh.seh.exception_code() ; or similar
  ; 0xC0000094 (EXCEPTION_INT_DIVIDE_BY_ZERO) written as a signed i32
  %cmp = icmp eq i32 %eh_code, -1073741676
  %filter.result = zext i1 %cmp to i32
  call void @llvm.eh.seh.filter(i32 %filter.result)

lpad:
  %0 = landingpad { i8*, i32 }
          personality i8* bitcast (i32 (...)* @__C_specific_handler to i8*)
          catch label %filter
  store i32 0, i32* %r, align 4
  br label %__try.cont

__try.cont:
  %1 = load i32* %r, align 4
  ret i32 %1
}

define internal void @try_body(i32* %r, i32* %n, i32* %d) {
entry:
  %0 = load i32* %n, align 4
  %1 = load i32* %d, align 4
  %div = sdiv i32 %0, %1
  store i32 %div, i32* %r, align 4
  ret void
}
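
A note on the constant: the exception code 0xC0000094 and the value
-1073741676 compared against in the original RFC's filter function are the
same 32-bit pattern, read as unsigned and as signed respectively. A small C
check of that arithmetic (the helper name bits_as_i32 is made up for this
sketch):

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a signed value. memcpy avoids relying on
   an implementation-defined unsigned-to-signed conversion. */
static int32_t bits_as_i32(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Here bits_as_i32(0xC0000094u) yields -1073741676, since
3221225620 - 4294967296 = -1073741676.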

On Wed, Oct 1, 2014 at 10:43 AM, Reid Kleckner <rnk at google.com> wrote:

> I want to add SEH support to Clang, which means we need a way to represent
> it in LLVM IR.
>
> Briefly, this is how I think we should represent it:
> 1. Use a different landingpad personality function for SEH
> (__C_specific_handler / _except_handlerN)
> 2. Use filter function pointers in place of type_info globals
> 3. Outline cleanups such as destructors and __finally on Windows, and
> provide a function pointer to the landingpad cleanup clause
>
> See the example IR at the end of this email. Read on if you want to
> understand why I think this is the right representation.
>
> ---
>
> Currently LLVM's exception representation is designed around the Itanium
> exception handling scheme documented here:
> http://mentorembedded.github.io/cxx-abi/abi-eh.html
>
> LLVM's EH representation is described here, and it maps relatively cleanly
> onto the Itanium design:
> http://llvm.org/docs/ExceptionHandling.html
>
> First, a little background about what __try is for. It's documented here:
> http://msdn.microsoft.com/en-us/library/swezty51.aspx
>
> The __try construct exists to allow the user to recover from all manner of
> faults, including access violations and integer division by zero. This is
> directly at odds with LLVM IR semantics, where such operations are
> undefined behavior. Regardless, I believe it's still useful to implement
> __try, even if it won't behave precisely as it does in other compilers in
> the presence of undefined behavior.
>
> ---
>
> The first challenge is that loads in C/C++ can now have exceptional
> control flow edges. This is impossible to represent in LLVM IR today
> because only invoke instructions can transfer control to a landing pad. The
> simplest way to work around this is to outline the body of the __try block
> and mark it noinline, which is what I propose to do initially.
>
> Long term, we could lower all potentially trapping operations to
> intrinsics that we 'invoke' at the IR level. See also Peter Collingbourne's
> proposal for iload and istore instructions here (
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-April/071732.html).
>
> With the outlining approach, in addition to noinline, we need to invent
> another function attribute to prevent functionattrs from inferring nounwind
> and readonly, or the optimizers will delete the invoke unwind edge or
> entire call site.
>
> ---
>
> The next challenge is actually catching the exception. The __except
> construct allows the user to evaluate a (mostly) arbitrary expression to
> decide if the exception should be caught. Code generated by MSVC catches
> these exceptions with an exception handler provided by the CRT. This
> handler is analogous to personality functions (__gxx_personality_v0) in
> LLVM and GCC, so that's what I call it.
>
> Notably, SEH and C++ exceptions use *different* personality functions in
> MSVC, and each function can only have one personality function. This is the
> underlying reason why one cannot mix C++ EH (even C++ RAII!) and __try in
> the same function with MSVC. Hypothetically, there is no reason why one
> could not have a more powerful personality function that handles both types
> of exception, but I intend to use the personality function from the CRT
> directly for the time being.
>
> On x86_64, the SEH personality function is called __C_specific_handler. On
> x86, it is __except_handler4 (or 3), but it's similar to
> __C_specific_handler and doesn't change how we should represent this in IR.
>
> The personality function interprets some side tables similar to the
> Itanium LSDA to find the filter function pointers that must be evaluated to
> decide which except handler to run at a given PC. The filter expressions
> are emitted as separate function bodies that the personality function
> calls. If a filter function returns '1', that indicates which except block
> will perform the recovery, and phase 1 of unwinding ends, similar to
> Itanium EH.
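
As a rough illustration of the phase 1 search described above, here is a
portable C sketch. It is not the CRT's actual table format or algorithm: the
struct scope_entry, its field names, find_handler, and the simplified filter
signature (taking the exception code directly instead of exception pointers)
are all assumptions made for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define EXCEPTION_INT_DIVIDE_BY_ZERO 0xC0000094u

/* Simplified filter signature: the real filters receive exception pointers. */
typedef int (*filter_fn)(uint32_t exception_code, void *frame_pointer);

/* Hypothetical side-table entry, one per __try region. */
struct scope_entry {
    uintptr_t begin_pc;   /* first PC covered by the __try */
    uintptr_t end_pc;     /* one past the last covered PC */
    filter_fn filter;     /* outlined filter expression */
    int handler_id;       /* stands in for the __except block's address */
};

static int filter_div_by_zero(uint32_t exception_code, void *frame_pointer)
{
    (void)frame_pointer;
    return exception_code == EXCEPTION_INT_DIVIDE_BY_ZERO;
}

static int filter_never(uint32_t exception_code, void *frame_pointer)
{
    (void)exception_code;
    (void)frame_pointer;
    return 0;
}

/* Walk the entries covering 'pc' in table order and return the handler of
   the first one whose filter returns 1, or -1 if none accepts. */
static int find_handler(const struct scope_entry *table, size_t count,
                        uintptr_t pc, uint32_t exception_code,
                        void *frame_pointer)
{
    for (size_t i = 0; i < count; ++i) {
        const struct scope_entry *entry = &table[i];
        if (pc < entry->begin_pc || pc >= entry->end_pc)
            continue;
        if (entry->filter(exception_code, frame_pointer) == 1)
            return entry->handler_id;
    }
    return -1;
}
```
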
>
> I propose we represent this in IR by:
> 1. changing the personality function to __C_specific_handler,
> __except_handler4, or in the future something provided by LLVM
> 2. replacing the type_info globals we currently put in landing pads with
> pointers to filter functions
>
> Then, in llvm/lib/CodeGen/AsmPrinter/EHStreamer.cpp where we currently
> emit the LSDA for Itanium exceptions, we can emit something else keyed off
> which kind of personality function we're using.
>
> ---
>
> SEH also allows implementing cleanups with __finally, but cleanups in
> general are implemented with a fundamentally different approach.
>
> During phase 2 of Itanium EH unwinding, control is propagated back up the
> stack. To run a cleanup, the stack is cleared, control enters the landing
> pad, and propagation resumes when _Unwind_Resume is called.
>
> On Windows, things work differently. Instead, during phase 2, each
> personality function is invoked a second time, wherein it must execute all
> cleanups *without clearing the stack* before returning control to the
> runtime where the runtime continues its loop over the stack frames. You can
> observe this easily by breaking inside a C++ destructor when an exception
> is thrown and taking a stack trace.
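
The Windows-style phase 2 can be sketched in portable C as a loop that calls
each cleanup as an ordinary function. This is only a model under assumed
names (frame_record, run_cleanups, mark_cleaned are invented here), not the
runtime's real data structures. The point it illustrates is that nothing is
popped: each cleanup runs in a fresh frame on top of the existing stack and
then returns to the loop.

```c
#include <stddef.h>

typedef void (*cleanup_fn)(void *frame_pointer);

/* Hypothetical per-frame record; index 0 is the innermost frame. */
struct frame_record {
    cleanup_fn cleanup;     /* outlined cleanup, or NULL if the frame has none */
    void *frame_pointer;    /* lets the cleanup find the frame's locals */
};

/* Call the cleanups of frames [0, target), innermost first, and return how
   many were run. The frame at index 'target' is the one that will handle
   the exception, so its cleanup is not run here. */
static size_t run_cleanups(const struct frame_record *frames, size_t target)
{
    size_t ran = 0;
    for (size_t i = 0; i < target; ++i) {
        if (frames[i].cleanup != NULL) {
            frames[i].cleanup(frames[i].frame_pointer);
            ++ran;
        }
    }
    return ran;
}

/* Example cleanup: sets a flag stored in the frame it belongs to. */
static void mark_cleaned(void *frame_pointer)
{
    *(int *)frame_pointer = 1;
}
```
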
>
> MinGW's Win64 "SEH" personality function finesses this issue by taking
> complete control during phase 2 and following the Itanium scheme of
> successive stack unwinding. It has the drawback that it's not really ABI
> compatible with cleanups emitted by other compilers, which I think should
> be a goal for our implementation.
>
> It might be possible to do something similar to what MinGW does, and
> implement our own __gxx_personality* style personality function that
> interprets the same style of LSDA tables, but we *need* to be able to
> establish a new stack frame to run cleanups. We cannot unwind out to the
> original frame that had the landing pad.
>
> In the long term, I think we need to change our representation to
> implement this. Perhaps the cleanup clause of a landing pad could take a
> function pointer as an operand. However, in the short term, I think we can
> model this by always catching the exception and then re-raising it.
> Obviously, this isn't 100% faithful, but it can work.
>
> ---
>
> Here’s some example IR for how we might lower this C code:
>
> #define GetExceptionCode() _exception_code()
> enum { EXCEPTION_INT_DIVIDE_BY_ZERO = 0xC0000094 };
> int safe_div(int n, int d) {
>   int r;
>   __try {
>     r = n / d;
>   } __except(GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO) {
>     r = 0;
>   }
>   return r;
> }
>
> define internal void @try_body(i32* %r, i32* %n, i32* %d) {
> entry:
>   %0 = load i32* %n, align 4
>   %1 = load i32* %d, align 4
>   %div = sdiv i32 %0, %1
>   store i32 %div, i32* %r, align 4
>   ret void
> }
>
> define i32 @safe_div(i32 %n, i32 %d) {
> entry:
>   %d.addr = alloca i32, align 4
>   %n.addr = alloca i32, align 4
>   %r = alloca i32, align 4
>   store i32 %d, i32* %d.addr, align 4
>   store i32 %n, i32* %n.addr, align 4
>   invoke void @try_body(i32* %r, i32* %n.addr, i32* %d.addr)
>           to label %__try.cont unwind label %lpad
>
> lpad:                                             ; preds = %entry
>   %0 = landingpad { i8*, i32 }
>           personality i8* bitcast (i32 (...)* @__C_specific_handler to i8*)
>           catch i8* bitcast (i32 (i8*, i8*)* @"\01?filt$0@0@safe_div@@" to i8*)
>   store i32 0, i32* %r, align 4
>   br label %__try.cont
>
> __try.cont:                                       ; preds = %lpad, %entry
>   %1 = load i32* %r, align 4
>   ret i32 %1
> }
>
> define internal i32 @"\01?filt$0@0@safe_div@@"(i8* %exception_pointers,
>                                                i8* %frame_pointer) {
> entry:
>   %0 = bitcast i8* %exception_pointers to i32**
>   %1 = load i32** %0, align 8
>   %2 = load i32* %1, align 4
>   %cmp = icmp eq i32 %2, -1073741676
>   %conv = zext i1 %cmp to i32
>   ret i32 %conv
> }
>
> declare i32 @__C_specific_handler(...)
>
>