[LLVMdev] Is LLVM expressive enough to represent asynchronous exceptions?

Sohail Somani sohail at taggedtype.net
Sat Jun 11 21:01:22 PDT 2011


Is LLVM expressive enough to represent asynchronous exceptions?
---------------------------------------------------------------

Summary: We need either new LLVM instructions or an extension to every
existing instruction.

C++ exceptions are synchronous: the compiler knows when/where they are
being raised.

Asynchronous exceptions can be raised at any time. For example, an
integer divide-by-zero may raise an asynchronous exception.

Windows structured exception handling (SEH) is an example of
asynchronous exceptions. UNIX signals are another.

Chip Davis is working on implementing GNU-style C++ exceptions on top
of table-based Windows SEH for the COFF format. That is, implementing
synchronous exceptions using the native platform's asynchronous
exception framework.

I am concerned with representing the handling of asynchronous
exceptions in LLVM as there is language-level support in Windows C++
compilers via __try/__except/__finally blocks (Clang already supports
this at the AST level). I believe that this is not currently possible
and needs new no-op instructions or a change in syntax.

An SEH block in C++ consists of a __try block followed by one of two
blocks:

* A __except block consisting of a filter and body

* A __finally (cleanup) block consisting of a body

The __except filter is essentially a generalized catch: it is an
expression (here a function call) whose result determines how the
exception is handled. It is different from llvm.eh.selector (although
llvm.eh.selector could be modified to support it).

An example:

DWORD filter(int code,int*p) {
  printf("code: %x, p: %x\n",code,p);
  *p = 10;
  return EXCEPTION_EXECUTE_HANDLER;
}

void whatever() {
     int p = 5;
     __try {
       printf("p: %d\n",p);
       p /= 0; // SEH
       printf("unreachable\n");
     }
     __except( filter(GetExceptionCode(),&p) ) {
       printf("p: %d\n",p);
     }
}

The output of calling whatever() is:

p: 5                       <-- in __try block
code: c0000094, p: 1af8d0  <-- in filter function
p: 10                      <-- handler (notice value of p changed)

To summarize:

* Exceptions can be raised at any point (asynchronous).

* Control jumps back and forth between multiple contexts, generally in
  a user-defined manner, but managed by the runtime.

* The user-defined handling needs to be defined in LLVM IR somehow.

The last one means that some changes to LLVM IR are necessary.

Option 0: Re-using existing machinery
-------------------------------------

This is unfortunately not really an option as far as I can tell
because there is no way to delineate where different handlers are
active without using one of the options below.

I'd be very happy to be wrong, however.

Option 1: Extend LLVM syntax
----------------------------

For synchronous (C++) exceptions, we use the following syntax when
calling a function:

  invoke ... to label %continue unwind label %cleanup

LLVM assumes exceptions can only arise from function calls and this is
made explicit with the invoke syntax.
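
As a concrete illustration of the synchronous case, here is a minimal
C++ sketch. The only place an exception can enter caller() is the call
to may_throw(), and that is the kind of call site a front end such as
Clang would typically emit as an invoke rather than a plain call. The
names may_throw and caller are hypothetical.

```cpp
#include <stdexcept>

// Hypothetical callee: the throw site is visible to the compiler.
int may_throw(int divisor) {
    if (divisor == 0)
        throw std::runtime_error("divide by zero");
    return 100 / divisor;
}

// Hypothetical caller: the call inside the try block is the only
// point at which unwinding into this frame can begin.
int caller(int divisor) {
    try {
        return may_throw(divisor);  // call site that needs an unwind edge
    } catch (const std::runtime_error&) {
        return -1;                  // handler reached via the unwind edge
    }
}
```

Here caller(5) returns 20 and caller(0) returns -1. Every other
statement in caller() is assumed by the compiler not to unwind.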

One option, which would be consistent with this syntax, is to extend
/every/ single instruction with similar syntax:

  %result = udiv i32 %p, 0 to label %continue unwind label %cleanup

Personally, I like this because of the consistency but I think it may
be a bit too verbose (and requires a lot more changes to syntax).

There is a major problem though: it still assumes that only LLVM
instructions can raise asynchronous exceptions. This is *not* true in
the case of UNIX signals: SIGINT, for example, can arrive at any time,
regardless of which instruction is executing. So this option probably
does not fully capture asynchronous exceptions.
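
A small sketch of the SIGINT point, assuming a hosted C++ environment
with <csignal>. The loop below contains no instruction that could fault
by itself, yet the handler still runs in the middle of it. In real use
the signal comes from outside the process; std::raise is used here only
to make the sketch deterministic. The names got_sigint, on_sigint and
spin_until_interrupted are hypothetical.

```cpp
#include <csignal>

// Written by the handler; sig_atomic_t is the type a signal handler
// is allowed to store to safely.
volatile std::sig_atomic_t got_sigint = 0;

extern "C" void on_sigint(int) { got_sigint = 1; }

// Hypothetical helper: counts completed iterations until SIGINT is
// observed. No instruction in the loop body can trap on its own.
int spin_until_interrupted(int iterations) {
    got_sigint = 0;
    std::signal(SIGINT, on_sigint);
    int done = 0;
    for (int i = 0; i < iterations; ++i) {
        if (i == 3)
            std::raise(SIGINT);     // stands in for Ctrl-C from outside
        if (got_sigint)
            break;
        ++done;
    }
    return done;
}
```

With the signal injected at i == 3, spin_until_interrupted(10) returns
3. An unwind label attached to individual instructions would have
nothing to attach to here.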

Option 2: Add a no-op
---------------------

Another option is to transliterate the SEH handling code and add some
marker instructions that, like bitcast, are no-ops.

define void @whatever()
{
entry:
  ...
aeh.try.enter0:
  ;; -----> this would be the no-op <-----
  llvm.aeh.enter blockaddress(@whatever,%aeh.try.enter0),
                 blockaddress(@whatever,%aeh.try.except.filter0)
  ...
  br label %aeh.try.exit0
aeh.try.except.filter0: ; no predecessor
  %result = <user code here>
  call llvm.aeh.continue(%result,
                         blockaddress(@whatever,%aeh.try.except.body0))
  unreachable
aeh.try.except.body0:   ; no predecessor
  call printf "handler\n"
  br label %aeh.try.exit0
aeh.try.exit0:
  ;; -----> this would be the no-op <-----
  llvm.aeh.exit
  br label %aeh.try.cont0
aeh.try.cont0:
  <whatever>
}

I like this one because it is explicit.

All options here would require a little bit of extra futzing to
integrate with C++ exceptions.

So my question to you is: given that there is interest in proper
Windows support, and that asynchronous exceptions are generally
useful, which option above (or one of your own choosing) would you
use?

Thanks for your time!

Sohail




