[LLVMdev] RFC: Exception Handling Proposal II

Wed Nov 24 04:58:07 PST 2010

On Nov 24, 2010, at 2:59 AM, Renato Golin wrote:
> If I got it right, the dispatch instruction will tell the
> instructions/calls to unwind to specific landing pads (cleanup areas,
> terminate), but the region number will encode try/catch areas, so that
> all those cleanup landing pads should ultimately end up in the catch
> area for that region.

Caveat: I'm speaking from what I remember of our discussions, which
is not necessarily what Bill is intending to propose;  that said, I'm
pretty confident that the design hasn't significantly changed.

A dispatch instruction is part of a landing pad, not part of the normal
instruction stream.  A dispatch is actually 1-1 with a specific landing
pad, and that pair of landing pad + dispatch instruction is basically
all a region is.  So the term "region" is a bit misleading because it
suggests that the landing pad is directly associated in the IR with a
range of instructions, whereas in fact the current design is orthogonal
to the question of how you actually reach a landing pad in the first place.

For now, that's still via explicit invokes;  the invoke names the region
it unwinds to.  Bill has it listing both the region number and the
landing pad block, which I think is redundant but harmless.

In my opinion, the most crucial property of the new design is that
it makes the chaining of regions explicit in the IR.  The "resume"
edge from a dispatch instruction always leads to either another
region or to a bit of code which re-enters the unwinder in some
opaque way.  When the inliner inlines a call in a protected region
(i.e. an invoke, for now), it just forwards the outermost resume
edges in the inlined function to the innermost region in the calling
function, potentially making the old resume code unreachable.  Frontends
are responsible for emitting regions and associated resume code
such that this forwarding preserves semantics.
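
As a rough sketch of that forwarding step, in C++ rather than IR:  the
Region struct and forwardResumeEdges are names made up for illustration,
not part of the proposal or of LLVM, and a null resume link stands in for
"re-enter the unwinder in some opaque way".

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a region (landing pad + dispatch); illustrative only.
struct Region {
  int id;          // made-up region number
  Region *resume;  // next region out, or nullptr for "re-enter the unwinder"
};

// Models inlining a callee at a call site protected by callerInnermost:
// every callee region whose resume edge used to leave the callee (nullptr
// in this model) is forwarded to the caller's innermost region instead.
// Returns the number of edges that were forwarded.
int forwardResumeEdges(const std::vector<Region *> &calleeRegions,
                       Region *callerInnermost) {
  assert(callerInnermost != nullptr && "call site must be in a protected region");
  int forwarded = 0;
  for (Region *r : calleeRegions) {
    if (r->resume == nullptr) {
      r->resume = callerInnermost;
      ++forwarded;
    }
  }
  return forwarded;
}
```

Edges between the callee's own regions are left alone;  only the ones
that used to leave the callee change target.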

So every landing pad actually has a stack of regions which
CodeGen has to examine to write out the unwind tables, but
it's easy to figure out that stack just by chasing links.
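
To make "chasing links" concrete, here is a minimal sketch in C++.  The
Region struct, its fields, and regionStack are hypothetical names for
illustration only, not anything in the proposal;  a null resume link
stands in for the code that re-enters the unwinder.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a region: one landing pad paired with one dispatch.
struct Region {
  int id;                // made-up region number
  bool cleanupOnly;      // true if the region only runs cleanups
  const Region *resume;  // next region out, or nullptr to re-enter the unwinder
};

// Collects the stack of regions for a landing pad, innermost first, by
// following resume links until the chain leaves the function.
std::vector<int> regionStack(const Region *innermost) {
  std::vector<int> stack;
  for (const Region *r = innermost; r != nullptr; r = r->resume)
    stack.push_back(r->id);
  return stack;
}

// Usage: a cleanup-only region nested inside a catching region.
inline void regionStackExample() {
  Region catching{1, false, nullptr};
  Region cleanup{2, true, &catching};
  assert((regionStack(&cleanup) == std::vector<int>{2, 1}));
}
```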

While I'm at it, there's another important property of dispatch:
it's undefined behavior to leave the function between landing
at a landing pad and reaching the dispatch.
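
One way to picture that rule is as a conservative check over a toy
control-flow graph.  This is only a sketch under my own assumptions:  the
Kind enum, the Block struct, and mayLeaveBeforeDispatch are hypothetical,
and calls that unwind out of the function are not modeled at all.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy CFG; these kinds and fields are illustrative only.
enum class Kind { Normal, Dispatch, Return };

struct Block {
  Kind kind;
  std::vector<int> succs;  // indices of successor blocks
};

// Returns true if some path from the landing pad reaches a Return block
// without first passing through a Dispatch block, i.e. the code might
// leave the function before dispatching.
bool mayLeaveBeforeDispatch(const std::vector<Block> &cfg, int landingPad) {
  std::vector<bool> seen(cfg.size(), false);
  std::vector<int> work{landingPad};
  while (!work.empty()) {
    int b = work.back();
    work.pop_back();
    assert(b >= 0 && b < static_cast<int>(cfg.size()));
    if (seen[b]) continue;
    seen[b] = true;
    if (cfg[b].kind == Kind::Dispatch) continue;   // rule satisfied on this path
    if (cfg[b].kind == Kind::Return) return true;  // left before dispatching
    for (int s : cfg[b].succs) work.push_back(s);
  }
  return false;
}
```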

> If that's so, how do you encode which landing pad is to be
> followed per region?
> 
> Consider the following code:
> 
> try {
>  Foo f();
>  f.run(); // can throw exception
>  Bar b();
>  b.run(); // can throw exception
>  Baz z();
>  z.run(); // can throw exception
> } catch (...) {
> }

I assume you don't mean these to be function declarations. :)

> The object 'f' is in a different cleanup area than 'b' which, in turn
> is in a different area than 'z'. These three regions should point to
> three different landing pads (or different offsets in the same landing
> pad), which (I believe) are encoded in IR by being declared after
> different dispatch instructions, all of which within the same region.

Nope.  Three regions, three landing pads, three dispatch instructions
(actually four if Foo::Foo() can throw).  The Baz-destructing region
chains to the Bar-destructing region which chains to the Foo-destructing
region which chains to the catching region;  the first three are
cleanup-only.
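
The order that chain has to produce can be observed in plain C++.  The
sketch below only shows the source-level behavior, not how it would be
lowered;  the class bodies, eventLog, and runExample are my own
illustration, and I've assumed only Baz::run() actually throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Records the order in which the cleanups and the handler run.
static std::vector<std::string> eventLog;

struct Foo { ~Foo() { eventLog.push_back("~Foo"); } void run() {} };
struct Bar { ~Bar() { eventLog.push_back("~Bar"); } void run() {} };
struct Baz {
  ~Baz() { eventLog.push_back("~Baz"); }
  void run() { throw std::runtime_error("boom"); }  // assumed to be the thrower
};

std::vector<std::string> runExample() {
  eventLog.clear();
  try {
    Foo f;    // "Foo f;" rather than "Foo f();", which declares a function
    f.run();
    Bar b;
    b.run();
    Baz z;
    z.run();  // throws: unwinds through the Baz, Bar, and Foo cleanups in turn
  } catch (...) {
    eventLog.push_back("catch");
  }
  assert(eventLog.size() == 4);  // three cleanups, then the handler
  return eventLog;
}
```

Each destructor call corresponds to one cleanup-only region, and the
handler to the catching region at the end of the chain.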

> If that's so, why do you still have the invoke call? Why should you
> treat call-exceptions any differently than instruction-exceptions?

One of my favorite things about this design is that it's totally
independent of what exactly is allowed to throw.  I'm really not sure
how best to represent other throwing instructions, except that I'm
pretty confident that we don't want anything as heavyweight as
invoke.  There's a pretty broad range of possibilities:  we could
make invoke-like instructions for all of them (ick...), or we could
tag individual instructions with regions, or we could mark basic
blocks as unwinding to particular places.  But we can wrestle
with that independently of deciding to adopt explicitly-chained
landing pads.

John.