[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR

Reid Kleckner rnk at google.com
Mon Nov 10 15:39:05 PST 2014


Hm, this idea won't work. If we point to labels from landingpadinst, then
passes like SimplifyCFG will consider the blocks to be unreachable, because
nothing branches to them. I realized this by looking at llvm-dis output
after hacking in asmparser support for this syntax. :)

I'll have to think longer.

On Mon, Nov 10, 2014 at 2:07 PM, Reid Kleckner <rnk at google.com> wrote:

> Moving this month-old RFC to llvmdev. Not sure why I sent this to cfe-dev
> in the first place...
>
> ---
>
> Based on code review discussion, John thinks filter expressions should be
> emitted into the body of the function containing the __try, rather than
> being outlined by the frontend.
>
> Instead of having the frontend create filter functions, we would use
> labels in place of typeinfo. The IR would look like "landingpad ... catch
> label %filter0 ..." instead of "landingpad ... catch ... @filter_func0
> ...". There would be a backend pass similar to SjLjEHPrepare that would
> outline the filter function and cleanup actions. Once we do the outlining,
> there is no turning back, because the outlined function has to know
> something about the stack layout of the parent function. If the parent
> function is inlined, we would have to duplicate the filter function along
> with it.
>
> Given that we want this kind of outlining to handle cleanups, it shouldn't
> be difficult to use the same machinery for filter expressions.
>
> The IR sample for safe_div at the end of my RFC would look like this
> instead:
>
> define i32 @safe_div(i32 %n, i32 %d) {
> entry:
>   %d.addr = alloca i32, align 4
>   %n.addr = alloca i32, align 4
>   %r = alloca i32, align 4
>   store i32 %d, i32* %d.addr, align 4
>   store i32 %n, i32* %n.addr, align 4
>   invoke void @try_body(i32* %r, i32* %n.addr, i32* %d.addr)
>           to label %__try.cont unwind label %lpad
>
> filter:
>   %eh_code = call i32 @llvm.eh.seh.exception_code() ; or similar
>   %cmp = icmp eq i32 %eh_code, -1073741676 ; 0xC0000094
>   %filt = zext i1 %cmp to i32
>   call void @llvm.eh.seh.filter(i32 %filt)
>
> lpad:
>   %0 = landingpad { i8*, i32 }
>           personality i8* bitcast (i32 (...)* @__C_specific_handler to i8*)
>           catch label %filter
>   store i32 0, i32* %r, align 4
>   br label %__try.cont
>
> __try.cont:
>   %1 = load i32* %r, align 4
>   ret i32 %1
> }
>
> define internal void @try_body(i32* %r, i32* %n, i32* %d) {
> entry:
>   %0 = load i32* %n, align 4
>   %1 = load i32* %d, align 4
>   %div = sdiv i32 %0, %1
>   store i32 %div, i32* %r, align 4
>   ret void
> }
>
> On Wed, Oct 1, 2014 at 10:43 AM, Reid Kleckner <rnk at google.com> wrote:
>
>> I want to add SEH support to Clang, which means we need a way to
>> represent it in LLVM IR.
>>
>> Briefly, this is how I think we should represent it:
>> 1. Use a different landingpad personality function for SEH
>> (__C_specific_handler / _except_handlerN)
>> 2. Use filter function pointers in place of type_info globals
>> 3. Outline cleanups such as destructors and __finally on Windows, and
>> provide a function pointer to the landingpad cleanup clause
>>
>> See the example IR at the end of this email. Read on if you want to
>> understand why I think this is the right representation.
>>
>> ---
>>
>> Currently LLVM's exception representation is designed around the Itanium
>> exception handling scheme documented here:
>> http://mentorembedded.github.io/cxx-abi/abi-eh.html
>>
>> LLVM's EH representation is described here, and it maps relatively
>> cleanly onto the Itanium design:
>> http://llvm.org/docs/ExceptionHandling.html
>>
>> First, a little background about what __try is for. It's documented here:
>> http://msdn.microsoft.com/en-us/library/swezty51.aspx
>>
>> The __try construct exists to allow the user to recover from all manner
>> of faults, including access violations and integer division by zero.
>> Immediately, it's clear that this is directly at odds with LLVM IR
>> semantics. Regardless, I believe it's still useful to implement __try, even
>> if it won't behave precisely as it does in other compilers in the presence
>> of undefined behavior.
>>
>> ---
>>
>> The first challenge is that loads in C/C++ can now have exceptional
>> control flow edges. This is impossible to represent in LLVM IR today
>> because only invoke instructions can transfer control to a landing pad. The
>> simplest way to work around this is to outline the body of the __try block
>> and mark it noinline, which is what I propose to do initially.
>>
>> Long term, we could lower all potentially trapping operations to
>> intrinsics that we 'invoke' at the IR level. See also Peter Collingbourne's
>> proposal for iload and istore instructions here (
>> http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-April/071732.html).
>>
>> With the outlining approach, in addition to noinline, we need to invent
>> another function attribute to prevent functionattrs from inferring nounwind
>> and readonly, or the optimizers will delete the invoke unwind edge or
>> entire call site.
>>
>> ---
>>
>> The next challenge is actually catching the exception. The __except
>> construct allows the user to evaluate a (mostly) arbitrary expression to
>> decide if the exception should be caught. Code generated by MSVC catches
>> these exceptions with an exception handler provided by the CRT. This
>> handler is analogous to personality functions (__gxx_personality_v0) in
>> LLVM and GCC, so that's what I call it.
>>
>> Notably, SEH and C++ exceptions use *different* personality functions in
>> MSVC, and each function can only have one personality function. This is the
>> underlying reason why one cannot mix C++ EH (even C++ RAII!) and __try in
>> the same function with MSVC. Hypothetically, there is no reason why one
>> could not have a more powerful personality function that handles both types
>> of exception, but I intend to use the personality function from the CRT
>> directly for the time being.
>>
>> On x86_64, the SEH personality function is called __C_specific_handler.
>> On x86, it is __except_handler4 (or 3), but it's similar to
>> __C_specific_handler and doesn't change how we should represent this in IR.
>>
>> The personality function interprets some side tables similar to the
>> Itanium LSDA to find the filter function pointers that must be evaluated to
>> decide which except handler to run at a given PC. The filter expressions
>> are emitted as separate function bodies that the personality function
>> calls. If a filter function returns '1', that indicates which except block
>> will perform the recovery, and phase 1 of unwinding ends, similar to
>> Itanium EH.
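As a rough illustration of the phase-1 search described above, here is a portable C sketch. The `scope_entry` layout, `find_handler`, and the two sample filters are hypothetical names made up for this example; the real tables read by __C_specific_handler are laid out differently, and real filters may also return values other than 0 and 1, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an outlined filter expression. */
typedef int (*filter_fn)(uint32_t exception_code);

/* Hypothetical, simplified stand-in for one row of the side table. */
struct scope_entry {
    uintptr_t begin_pc;   /* first PC covered by the __try (inclusive) */
    uintptr_t end_pc;     /* end of the covered range (exclusive) */
    filter_fn filter;     /* outlined filter expression */
    int handler_id;       /* which __except block to run */
};

/* Phase 1: call the filter of every entry that covers `pc`, in table order.
   Returns the handler id of the first filter that returns 1, or -1 if no
   filter claims the exception (the search would move on to the caller). */
int find_handler(const struct scope_entry *table, size_t count,
                 uintptr_t pc, uint32_t exception_code) {
    assert(table != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        const struct scope_entry *e = &table[i];
        if (pc < e->begin_pc || pc >= e->end_pc)
            continue;                 /* this __try does not cover pc */
        if (e->filter(exception_code) == 1)
            return e->handler_id;     /* phase 1 ends here */
    }
    return -1;
}

/* Sample filters, mirroring __except(GetExceptionCode() == ...) expressions.
   0xC0000094 is EXCEPTION_INT_DIVIDE_BY_ZERO; 0xC0000005 is
   EXCEPTION_ACCESS_VIOLATION. */
int filter_div_by_zero(uint32_t code) { return code == 0xC0000094u; }
int filter_access_violation(uint32_t code) { return code == 0xC0000005u; }
```

With a table holding an access-violation entry followed by a divide-by-zero entry that both cover the faulting PC, a divide-by-zero code skips the first entry and selects the second one's handler.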
>>
>> I propose we represent this in IR by:
>> 1. changing the personality function to __C_specific_handler,
>> __except_handler4, or in the future something provided by LLVM
>> 2. replacing the type_info globals we currently put in landing pads with
>> pointers to filter functions
>>
>> Then, in llvm/lib/CodeGen/AsmPrinter/EHStreamer.cpp where we currently
>> emit the LSDA for Itanium exceptions, we can emit something else keyed off
>> which kind of personality function we're using.
>>
>> ---
>>
>> SEH also allows implementing cleanups with __finally, but cleanups on
>> Windows in general are implemented with a fundamentally different approach
>> than in Itanium EH.
>>
>> During phase 2 of Itanium EH unwinding, control is propagated back up the
>> stack. To run a cleanup, the stack is cleared, control enters the landing
>> pad, and propagation resumes when _Unwind_Resume is called.
>>
>> On Windows, things work differently. During phase 2, each
>> personality function is invoked a second time, wherein it must execute all
>> cleanups *without clearing the stack* before returning control to the
>> runtime where the runtime continues its loop over the stack frames. You can
>> observe this easily by breaking inside a C++ destructor when an exception
>> is thrown and taking a stack trace.
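To make the contrast concrete, here is a toy C model of phase 2. It is not unwinder code: `unwind_model`, `run_phase2`, and the frame numbering are all hypothetical, and the only thing it records is how many of the original frames are still on the stack when each cleanup runs.

```c
#include <assert.h>
#include <stddef.h>

enum unwind_model { ITANIUM_STYLE, WINDOWS_STYLE };

/* Frames are numbered 0 (outermost) .. raise_frame (where the exception was
   raised). The handler lives in handler_frame, so cleanups run for frames
   raise_frame down to handler_frame + 1, innermost first.
   live_at_cleanup[i] receives the number of original frames still on the
   stack while the cleanup of frame i runs. Returns the number of cleanups. */
size_t run_phase2(enum unwind_model model, size_t raise_frame,
                  size_t handler_frame, size_t *live_at_cleanup) {
    assert(handler_frame <= raise_frame);
    size_t ran = 0;
    for (size_t i = raise_frame; i > handler_frame; --i) {
        if (model == ITANIUM_STYLE) {
            /* The stack is cleared down to frame i before its landing pad
               is entered, so only frames 0..i remain. */
            live_at_cleanup[i] = i + 1;
        } else {
            /* The cleanup is called from the personality function on top of
               the existing stack, so every original frame is still there. */
            live_at_cleanup[i] = raise_frame + 1;
        }
        ++ran;
    }
    return ran;
}
```

In the Windows-style case every cleanup sees the full original stack, which matches the stack trace you get when breaking inside a destructor during an unwind.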
>>
>> MinGW's Win64 "SEH" personality function finesses this issue by taking
>> complete control during phase 2 and following the Itanium scheme of
>> successive stack unwinding. It has the drawback that it's not really ABI
>> compatible with cleanups emitted by other compilers, which I think should
>> be a goal for our implementation.
>>
>> It might be possible to do something similar to what MinGW does, and
>> implement our own __gxx_personality* style personality function that
>> interprets the same style of LSDA tables, but we *need* to be able to
>> establish a new stack frame to run cleanups. We cannot unwind out to the
>> original frame that had the landing pad.
>>
>> In the long term, I think we need to change our representation to
>> implement this. Perhaps the cleanup clause of a landing pad could take a
>> function pointer as an operand. However, in the short term, I think we can
>> model this by always catching the exception and then re-raising it.
>> Obviously, this isn't 100% faithful, but it can work.
>>
>> ---
>>
>> Here’s some example IR for how we might lower this C code:
>>
>> #define GetExceptionCode() _exception_code()
>> enum { EXCEPTION_INT_DIVIDE_BY_ZERO = 0xC0000094 };
>> int safe_div(int n, int d) {
>>   int r;
>>   __try {
>>     r = n / d;
>>   } __except(GetExceptionCode() == EXCEPTION_INT_DIVIDE_BY_ZERO) {
>>     r = 0;
>>   }
>>   return r;
>> }
>>
>> define internal void @try_body(i32* %r, i32* %n, i32* %d) {
>> entry:
>>   %0 = load i32* %n, align 4
>>   %1 = load i32* %d, align 4
>>   %div = sdiv i32 %0, %1
>>   store i32 %div, i32* %r, align 4
>>   ret void
>> }
>>
>> define i32 @safe_div(i32 %n, i32 %d) {
>> entry:
>>   %d.addr = alloca i32, align 4
>>   %n.addr = alloca i32, align 4
>>   %r = alloca i32, align 4
>>   store i32 %d, i32* %d.addr, align 4
>>   store i32 %n, i32* %n.addr, align 4
>>   invoke void @try_body(i32* %r, i32* %n.addr, i32* %d.addr)
>>           to label %__try.cont unwind label %lpad
>>
>> lpad:                                             ; preds = %entry
>>   %0 = landingpad { i8*, i32 }
>>           personality i8* bitcast (i32 (...)* @__C_specific_handler to i8*)
>>           catch i8* bitcast (i32 (i8*, i8*)* @"\01?filt$0@0@safe_div@@" to i8*)
>>   store i32 0, i32* %r, align 4
>>   br label %__try.cont
>>
>> __try.cont:                                       ; preds = %lpad, %entry
>>   %1 = load i32* %r, align 4
>>   ret i32 %1
>> }
>>
>> define internal i32 @"\01?filt$0@0@safe_div@@"(i8* %exception_pointers,
>>                                               i8* %frame_pointer) {
>> entry:
>>   %0 = bitcast i8* %exception_pointers to i32**
>>   %1 = load i32** %0, align 8
>>   %2 = load i32* %1, align 4
>>   %cmp = icmp eq i32 %2, -1073741676
>>   %conv = zext i1 %cmp to i32
>>   ret i32 %conv
>> }
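For readers who find the filter easier to follow in C, here is a sketch of what that function computes. The two structs are hypothetical mirrors of only the first field of EXCEPTION_POINTERS and EXCEPTION_RECORD, and `safe_div_filter` and `as_i32` are names invented for this example. `as_i32` is there only to show why the IR compares against -1073741676: it is 0xC0000094 read as a signed 32-bit value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors: only the leading field of each Windows struct. */
struct exception_record { uint32_t exception_code; };
struct exception_pointers { struct exception_record *exception_record; };

#define EXCEPTION_INT_DIVIDE_BY_ZERO 0xC0000094u

/* Same shape as the IR filter: two opaque pointers in, i32 out.
   Follows exception_pointers -> record -> code, then compares. */
int safe_div_filter(void *exception_pointers, void *frame_pointer) {
    (void)frame_pointer;  /* unused, as in the IR */
    assert(exception_pointers != NULL);
    struct exception_pointers *ep = exception_pointers;
    uint32_t code = ep->exception_record->exception_code;
    return code == EXCEPTION_INT_DIVIDE_BY_ZERO;  /* 1 = run the __except */
}

/* Reinterpret a 32-bit pattern as a signed value without relying on
   implementation-defined narrowing conversions. */
int32_t as_i32(uint32_t bits) {
    int64_t wide = bits;
    if (wide > INT32_MAX)
        wide -= INT64_C(4294967296);  /* subtract 2^32 */
    return (int32_t)wide;
}
```

Passing a record whose code is 0xC0000094 makes the filter return 1; any other code makes it return 0, so the search would continue.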
>>
>> declare i32 @__C_specific_handler(...)
>>
>>
>