[LLVMdev] Exception Handling Proposal - Second round

Tue May 17 17:09:34 PDT 2011

Hi Renato,

Thanks for the summary. John and I have been working a lot on our proposal. It's changed significantly since I wrote about it last. It encompasses a lot of John's requirements and fixes the main issues. The key is getting enough time to implement the ideas. As you can imagine, we're swamped here. But this issue has not been dropped at all. :-)

I'm not ready yet to submit the proposal to the LLVM community – it's still a bit rough. Some initial work seems to show that it's not bad and will be easy to implement.

-bw

On May 17, 2011, at 3:16 PM, Renato Golin wrote:

> Hi all,
> 
> Following John's, Duncan's and Bill's proposals about exception
> handling, I thought I'd summarise what has been discussed so far.
> 
> ** The problems we're trying to solve are:
> 
> P1. Different languages have different EH concepts and IR needs to be
> agnostic (as far as possible) about that
> P2. Inlining and optimisations (currently) destroy the EH semantics
> and produce code that can't unwind
> P3. Clutter in the IR representation of EH leads to unnecessary
> complexity when optimising or inlining
> P4. The back-end should have a simple and unified representation on
> which to build (different) EH tables
> 
> 
> ** The key-facts I've collected after re-reading all emails are:
> 
> F1. There are different families of EH: zero-cost, SjLj etc and they
> should have similar IR representations
> F2. Back-ends should know how to implement them or bail out (thus,
> representation should be *clear*)
> F3. Optimisations should make sure unwinding and normal flow do not overlap
> F4. Avoid artificial impositions on basic-block behaviour and
> dependency to simplify optimisations
> F5. We *must* keep the unwind actions and the order in which they
> execute when inlining
> F6. Some instructions (such as divide in Java) can throw exceptions
> without an explicit dispatch mechanism
> 
> 
> There are two quasi-orthogonal proposals to change the EH mechanism:
> - Duncan Sands', regarding rules on how to protect the dispatch
> mechanism (and preserve actions and their orders) when inlining or
> optimising code, and
> - Bill Wendling's IR simplification using the "dispatch" mechanism to
> better express unwinding flow and ease inlining and optimisations
> 
> 
> ** Proposal 1: Rules on how to protect the unwind flow (P2, F3, F4, F5)
> 
> Current LLVM inlining can create some unreachable blocks that get
> optimised away (and shouldn't). Some languages demand that certain
> clean-up areas must be executed, others that they must not. Some
> libstdc++ code apparently relies on this implementation-defined
> behaviour. To solve this problem, workarounds were coded to redirect
> flow to catch-all regions, which created other problems, etc.
> 
> Instead of running around in circles, the following rules must be
> observed when inlining/optimising:
> - When inlining a dispatch area, the inlined block must resume to the
> inlinee's dispatch block
> - If using eh.selector, inlining should append actions to inlinee's
> selector block
> - Optimisers should not remove unwind actions nor change their
> control flow (unless the semantics are preserved)
> - If we allow changes, we need to explicitly describe the semantics
> or have one to rule them all
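
A minimal sketch of the eh.selector rule above, assuming a selector can be modelled as an ordered list of typeinfo names. The function and variable names are hypothetical, and filters and clean-ups are left out. The inlined code keeps its own actions first and the surrounding invoke's actions are appended after them, so nothing is dropped or reordered:

```python
def append_selector_actions(inner_actions, outer_actions):
    """Model of inlining a selector under an enclosing invoke.

    inner_actions: catch typeinfos of the code being inlined (tried first).
    outer_actions: catch typeinfos of the enclosing landing pad.
    Returns a new list; neither input is modified.
    """
    return list(inner_actions) + list(outer_actions)


def first_match(actions, thrown):
    """Index of the first action that catches `thrown`, or None.

    Assumption: matching is exact name equality. A real personality
    routine applies the source language's own matching rules.
    """
    for index, typeinfo in enumerate(actions):
        if typeinfo == thrown:
            return index
    return None


inner = ["_ZTIi"]            # hypothetical: inlined code catches int
outer = ["_ZTId", "_ZTIi"]   # hypothetical: enclosing code catches double, int
merged = append_selector_actions(inner, outer)
```

With this ordering an int exception still selects the inner handler (index 0), and a double exception that the inlined code does not catch falls through to the appended outer action.
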
> 
> 
> ** Proposal 2: Dispatch and basic-block markings (P3, P4, F5)
> 
> Replace eh.selector/eh.typeid with a dispatch mechanism that
> explicitly lists the possible catch areas, filters and personality.
> The dispatch belongs to a basic block, which needs a "landingpad"
> attribute to help optimisations understand that the block is special
> for EH (this might not be strictly necessary).
> 
> The general syntax of the dispatch is:
> 
> lpad: landingpad
> %eh_ptr = tail call i8* @llvm.eh.exception()
> dispatch region label %lpad resume to label %unwind
>   catches [
>     %struct.__fundamental_type_info_pseudo* @_ZTIi, label %ch.int.main
>   ]
>   personality [i32 (...)* @__gxx_personality_v0]
> 
> This dispatch instruction is the last instruction in its block. It
> explicitly belongs to that block ("region label %lpad") and resumes
> unwinding at label %unwind. It catches only INT exceptions (whatever
> that means in the source language) and the personality routine that is
> going to interpret it at run-time is __gxx_personality_v0.
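
To make the fields of the proposed dispatch concrete, here is a small model of the example above. It is only a sketch: the Dispatch class and its target_for method are hypothetical names, and exact typeinfo equality stands in for whatever matching the personality routine would really do.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Dispatch:
    region: str        # landing pad block the dispatch belongs to
    resume: str        # block where unwinding continues if nothing matches
    personality: str   # routine that interprets the dispatch at run-time
    # Ordered (typeinfo, handler label) pairs, tried first to last.
    catches: List[Tuple[str, str]] = field(default_factory=list)

    def target_for(self, thrown_typeinfo: str) -> str:
        """Label control would reach for an exception of the given type."""
        for typeinfo, label in self.catches:
            if typeinfo == thrown_typeinfo:  # assumption: exact match only
                return label
        return self.resume


# The dispatch from the example: catches int, otherwise resumes at %unwind.
lpad = Dispatch(
    region="lpad",
    resume="unwind",
    personality="__gxx_personality_v0",
    catches=[("_ZTIi", "ch.int.main")],
)
```

Here lpad.target_for("_ZTIi") gives "ch.int.main", while any other typeinfo gives the resume label "unwind".
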
> 
> When optimising, passes should see the catch/clean-up blocks that are
> dominated by the landing pad and keep their natural flow. When
> inlining, they should be moved inside the inlinee and the "resume
> label" should be the inlinee's dispatch landing pad, so the sequence
> of actions (and the actions themselves) is kept intact.
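
The resume-label rule can be sketched as a rewrite over a table of dispatches. This is an illustration under assumptions, not the proposed implementation: dispatches are plain dicts keyed by landing pad label, block labels are assumed unique across both functions, and "inlinee" is read as the function receiving the inlined code, as in the summary. All names are hypothetical.

```python
def inline_dispatches(receiver, inlined, call_site_pad, inlined_exit):
    """Merge the inlined function's dispatches into the receiving function.

    Any dispatch that used to resume out of the inlined function
    (inlined_exit) now resumes to the landing pad of the call site.
    """
    merged = {region: dict(d) for region, d in receiver.items()}
    for region, d in inlined.items():
        resume = call_site_pad if d["resume"] == inlined_exit else d["resume"]
        merged[region] = {"resume": resume, "catches": dict(d["catches"])}
    return merged


def unwind_path(dispatches, start, thrown):
    """Follow resume edges from `start`; return (pads visited, final label)."""
    visited = []
    block = start
    while block in dispatches:
        visited.append(block)
        d = dispatches[block]
        if thrown in d["catches"]:  # assumption: exact typeinfo match
            return visited, d["catches"][thrown]
        block = d["resume"]
    return visited, block


receiver = {"lpad.outer": {"resume": "unwind.outer",
                           "catches": {"_ZTId": "ch.double"}}}
inlined = {"lpad.inner": {"resume": "unwind.inner",
                          "catches": {"_ZTIi": "ch.int"}}}
merged = inline_dispatches(receiver, inlined, "lpad.outer", "unwind.inner")
```

After the rewrite, a double exception raised in the inlined code visits lpad.inner and then lpad.outer, in that order, before reaching ch.double, which is the ordering the rule is meant to keep intact.
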
> 
> The dispatch call can also be attached to the invoke instruction,
> though there were some problems with clean-ups (Bill) and it may
> clutter the IR by repeating the same dispatch for many invokes in one
> single try block.
> 
> I see that the %eh_ptr is not used by the dispatch. How does it know
> the type of the exception thrown?
> 
> 
> ** What was not covered
> 
> P1/F1/F2: Are these changes EH-style agnostic? Does it at least work
> for Dwarf AND SjLj using the same IR representation? Do we want that
> to happen?
> 
> F6: If a div instruction inside a basic block without EH unwind
> information throws an exception, how does the IR represent that? Do
> we create an invoke to a fake function for every instruction that
> could throw? Do we put the unwind information in the basic-block? In
> the dispatch instruction (like we do for region label)?
> 
> 
> ** Amount of work to do
> 
> I reckon that both changes can be done at the same time. Current work
> is being done in the ARM back-end to support EHABI, which should also
> be orthogonal to those changes (Anton?).
> 
> The inlining changes can be done at any time, no need to change the IR
> or anything and the changes can be reused by the second proposal later
> on.
> 
> The problem is that, to change the IR representation, we need to
> change all front-ends that deal with exception handling (clang,
> llvm-gcc, ada, python etc), and make the back-end iteratively more
> robust to accept the new format, but it'd be hard to quickly
> deactivate the old format.
> 
> I've seen this thread show up and die a few times, and I'm not sure we
> are under any pressure to do this at any given time. Are we?
> 
> cheers,
> --renato
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev