[LLVMdev] RFC: How to represent SEH (__try / __except) in LLVM IR

Mon Nov 24 13:37:59 PST 2014

On Mon, Nov 24, 2014 at 12:12 PM, Kaylor, Andrew <andrew.kaylor at intel.com>
wrote:

>    Hi Reid,
>
>
>
> I've been working on the outlining code and have a prototype that produces
> what I want for a simple case.
>
>
>
> Now I'm thinking about the heuristics for recognizing the various logical
> pieces for C++ exception handling code and removing them once they’ve been
> cloned.  I've been working from various comments you've made earlier in
> this thread, and I'd like to run something by you to make sure we're on the
> same page.
>
>
>
> Starting from a C++ function that looks like this:
>
...

> I'll have IR that looks more or less like this:
>
...

> If I've understood your intentions correctly, we'll have an outlining pass
> that transforms the above IR to this:
>
...

> Does that look about like what you’d expect?
>

Yep! That's basically what I had in mind, but I still have some concerns
with this model, listed below.

We should also think about how to call std::terminate when cleanup dtors
throw. The current representation for Itanium is inefficient. As a
strawman, I propose making @__clang_call_terminate an intrinsic:

  ...
  invoke void @dtor(i8* %this) to label %cont unwind label %terminate.lpad
cont:
  ret void
terminate.lpad:
  landingpad ... catch i8* null
  call void @llvm.eh.terminate()
  unreachable

This would be good for Itanium EH, as we can actually completely elide
table entries for landing pads that just catch-all and terminate.

> I just have a few questions.
>
>
>
> I'm pretty much just guessing at how you intended the
> llvm.eh.set_capture_block intrinsic to work.  It wasn't clear to me if I
> just needed to set it where the structure was created or if it would need
> to be set anywhere an exception might be thrown.  The answer is probably
> related to my next question.
>

I was imagining it would be called once in the entry block.

Chandler expressed strong concerns about this design, however, as
@llvm.eh.get_capture_block adds an ordering constraint on CodeGen. Once you
add this intrinsic, we *have* to do frame layout of @_Z13do_some_thingRi
*before* we can emit code for all the callers of
@llvm.eh.get_capture_block. Today, this is easy, because module order
defines emission order, but in the great glorious future, codegen will
hopefully be parallelized, and then we've inflicted this horrible
constraint on the innocent.

His suggestion to break the ordering dependence was to lock down the frame
offset of the capture block so that it is always at a fixed offset known by
the target (i.e. ebp - 4 on x86, if we like that).

> In the above example I created a single capture block for the entire
> function.  That works reasonably well for a simple case like this and
> corresponds to the co-location of the allocas in the original IR, but for
> functions with more complex structures and multiple try blocks it could get
> ugly.  Do you have ideas for how to handle that?
>

Not really, it would just get ugly. All allocas used from landing pad code
would get mushed into one allocation. =/
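
As a rough illustration of the "one allocation" idea, here is a sketch in
plain C++ rather than IR. It is only an analogy: CaptureBlock, catch_first,
catch_second, and parent are hypothetical names, and the real design would
outline handlers from landing pads, not call helpers from catch blocks. It
shows locals from two unrelated try blocks sharing one struct that the
handler code reaches through a pointer.

```cpp
#include <cassert>

// Hypothetical layout: every local touched by handler code lives in one
// struct, even when the locals belong to unrelated try blocks.
struct CaptureBlock {
  int first_try_counter;   // used only by the first try's handler
  int second_try_counter;  // used only by the second try's handler
};

// Hypothetical outlined handlers: they reach the parent's locals only
// through the capture block pointer.
static void catch_first(CaptureBlock *frame) { frame->first_try_counter += 1; }
static void catch_second(CaptureBlock *frame) { frame->second_try_counter += 10; }

int parent(bool throw_first, bool throw_second) {
  CaptureBlock frame = {0, 0};  // single allocation for all captured locals
  try {
    if (throw_first) throw 1;
  } catch (int) {
    catch_first(&frame);
  }
  try {
    if (throw_second) throw 2;
  } catch (int) {
    catch_second(&frame);
  }
  return frame.first_try_counter + frame.second_try_counter;
}

// Small self-check of the sketch.
void check_capture_block() {
  assert(parent(false, false) == 0);
  assert(parent(true, true) == 11);
}
```

Each handler sees the whole struct, including fields it never uses, which
is the ugliness being described.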

> For C++ exception handling, we need cleanup code that executes before the
> catch handlers and cleanup code that executes in the case of uncaught
> exceptions.  I think both of these need to be outlined for the MSVC
> environment. Do you think we need a stub handler to be inserted in cases
> where no actual cleanup is performed?
>

I think it's actually harder than that, once you consider nested trys:
void f() {
  try {
    Outer outer;
    try {
      Inner inner;
      g();
    } catch (int) {
      // ~Inner gets run first
    }
  } catch (float) {
    // ~Inner gets run first
    // ~Outer gets run next
  }
  // uncaught exception? Run ~Inner then ~Outer.
}

It's easy to hit this case after inlining as well.
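
To make the ordering concrete, here is a small standard C++ sketch of the
same nesting. The logging destructors, the g_log string, the templated g(),
and the run() wrapper are additions for illustration only. It records which
cleanups run before each handler is entered.

```cpp
#include <cassert>
#include <string>

// Shared log; each destructor and handler appends its name.
static std::string g_log;

struct Outer { ~Outer() { g_log += "~Outer;"; } };
struct Inner { ~Inner() { g_log += "~Inner;"; } };

// Stand-in for g(): the type of the thrown value selects the handler.
template <typename T>
void g(T value) { throw value; }

template <typename T>
std::string run(T value) {
  g_log.clear();
  try {
    Outer outer;
    try {
      Inner inner;
      g(value);
    } catch (int) {
      g_log += "catch(int);";    // only ~Inner has run at this point
    }
  } catch (float) {
    g_log += "catch(float);";    // ~Inner, then ~Outer, have already run
  }
  return g_log;
}

// Small self-check of the expected orderings.
void check_unwind_order() {
  assert(run(1) == "~Inner;catch(int);~Outer;");
  assert(run(1.0f) == "~Inner;~Outer;catch(float);");
}
```

The int case needs only the inner cleanup before its handler, while the
float case needs both, so a single "pre-catch cleanup" slot per landing pad
does not describe both paths.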

We'd have to generalize @llvm.eh.outlined_handlers more to handle this
case. However, if we generalize further it starts to perfectly replicate
the landing pad structure, with cleanup, catch, and then we'd want to think
about how to represent filter. Termination on exception spec violation
seems to be unimplemented in MSVC, so we'd need our own personality
function to implement filters, but it'd be good to support them in the IR.

We also have to decide how much code duplication of cleanups we're willing
to tolerate, and whether we want to try to annotate the beginning and end
of cleanups like ~Inner and ~Outer.

> I didn't do that in the mock-up above, but it seems like it would simplify
> things.  Basically, I'm imagining a final pattern that looks like this:
>
>
>
> lpad:
>   %eh_vals = landingpad { i8*, i32 } personality i8* bitcast (i32 (...)*
> @__CxxFrameHandler3 to i8*)
>       cleanup
>       catch i8* @typeid1
>       catch i8* @typeid2
>       ...
>   %label = call i8* (...)* @llvm.eh.outlined_handlers(
>       void (i8*, i8*)* @<pre-catch cleanup function>,
>       i8* @typeid1, i8* (i8*, i8*)* @<typeid1 catch function>,
>       i8* @typeid2, i8* (i8*, i8*)* @<typeid2 catch function>,
>       ...
>       void (i8*, i8*)* @<uncaught exception cleanup function>)
>   indirectbr i8* %label
>
>
>
>
>
> Finally, how do you see this meshing with SEH?  As I understand it, both
> the exception handlers and the cleanup code in that case execute in the
> original function context and only the filter handlers need to be
> outlined.  I suppose the outlining pass can look at the personality
> function and change its behavior accordingly.  Is that what you were
> thinking?
>

Pretty much. The outlining pass would behave differently based on the
personality function. SEH cleanups (__finally blocks) actually do need to
get outlined as well as filters, but catches (__except blocks) do not need
to be outlined. That's the main difference. I think it reflects the fact
that you can rethrow a C++ exception, but you can't faithfully "rethrow" a
trap caught by SEH.