[LLVMdev] LLVMdev Digest, Vol 85, Issue 50

Wed Jul 27 11:10:28 PDT 2011

On Jul 27, 2011, at 10:27 AM, Peter Lawrence wrote:
> 3.b)  I have been thinking about other possible control-flow-graph invariants of the
> landingpad blocks and the catch blocks that they lead to, but so far have not come up
> with very much.  I wonder if anyone else is thinking about this...?...
> 
> for example cleanups come before __cxa_begin_catch, but it isn't clear what is a cleanup
> and what isn't other than what comes before a __cxa_begin_catch and what comes after ?

The EH representation is independent of things like this.

> however, using that as the definition of cleanup, for C++ any InvokeInst that is so
> identified as cleanup then its only operand has to be terminate  (I think, someone
> please correct me if I've made an incorrect conclusion here).

In C++, any destructor call executed as an EH cleanup would need to be
an invoke whose unwind edge leads to a landing pad with a catch-all and
a call to std::terminate().  However, after inlining etc., I don't know that this
gives us any interesting invariants in the IR.
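
As a source-level illustration of the destructor rule above (a sketch, not
what any front end literally emits): a cleanup run during unwinding behaves
as if the destructor call were wrapped in a catch-all whose handler calls
std::terminate().  The helper name run_eh_cleanup and the injectable
on_escape callback are hypothetical; they exist only so the behaviour can
be observed without aborting the process.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical helper modelling "invoke the cleanup; the unwind edge leads
// to a catch-all landing pad that terminates".  on_escape stands in for
// std::terminate() so the sketch can be exercised safely.
template <typename Cleanup, typename OnEscape>
void run_eh_cleanup(Cleanup cleanup, OnEscape on_escape) {
    try {
        cleanup();      // plays the role of the invoked destructor
    } catch (...) {     // plays the role of the catch-all landing pad
        on_escape();    // real code would call std::terminate() here
    }
}

// Returns true if the cleanup let an exception escape.
inline bool cleanup_escapes(bool should_throw) {
    bool escaped = false;
    run_eh_cleanup(
        [&] { if (should_throw) throw std::runtime_error("dtor failed"); },
        [&] { escaped = true; });
    return escaped;
}
```

With std::terminate() in place of on_escape, the throwing case would end
the program instead of setting a flag.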

> 3.c)  I have been thinking about whether the original source code structure of try-catch
> statements can be reconstructed from the IR,  are two try-catches nested, either in the
> try or the catch part, or are they disjoint,  and can the cleanups be identified as such at
> the IR level or have things potentially  been mixmastered up too much after optimization.
> I wonder if anyone else is thinking about this also...?...

It would be difficult to reliably reconstruct try/catch statements from the IR
even before optimization.

> 4)  IIUC, llvm has inherited a bug from gcc where the debugger cannot let the user know an exception is
> going to be uncaught until after the stack has been unwound -- contrary to the design intentions of the 
> unwind library that most exception implementations are based on (with a two phase unwind algorithm) --
> which creates a problem for the debugger user.

I don't see this as a compiler bug.  I can't imagine any personality function
design which would let debuggers interrupt or control unwinding without
hooking libUnwind, short of requiring every single call to have an
associated landing pad which the personality always lands at, even if
there's nothing to do there.  That will never, ever be acceptable.

> and will there be a __llvm_personality_v0 that is designed to do the right thing for this case.
> 
> yes, I know this is a can-of-worms, it will break gcc compatibility, but then perhaps we can be the
> motivation for gnu folks to fix their implementation, be the leader rather than the follower!

Using our own personality function would not necessarily break GCC
compatibility;  we'd just need to provide it in compiler-rt or something.

> 4.b) it is not at all clear from your write up what the "cleanup" option for a landingpad is, and
> how this is used when both cleanups AND catches are necessary in a given try-catch source
> code statement, including if one of the user specified catches is a catch-all.

The 'cleanup' bit says that the personality function needs to land
there even if there's no handler.  And yes, it's technically redundant
with a catch-all handler.
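
To make the redundancy concrete, here is a small C++ sketch (the names
Tracer and cleanup_then_catch_all are made up for illustration).  The try
block needs both a cleanup and a catch-all.  Because the catch-all already
forces the personality to land there for every exception, the cleanup runs
regardless, and a 'cleanup' bit on such a landing pad would add nothing.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records its own destruction so the order of events can be inspected.
struct Tracer {
    std::vector<std::string>& log;
    explicit Tracer(std::vector<std::string>& l) : log(l) {}
    ~Tracer() { log.push_back("cleanup"); }
};

// A try block that needs both a cleanup (Tracer's destructor) and a
// catch-all handler.
inline std::vector<std::string> cleanup_then_catch_all() {
    std::vector<std::string> log;
    try {
        Tracer t(log);
        throw 42;                    // any exception type lands in catch (...)
    } catch (...) {
        log.push_back("catch-all");  // runs after the Tracer is destroyed
    }
    return log;
}
```

The recorded order is the cleanup first, then the catch-all handler.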

> 5) it's not clear from your email what is done with the result value of the landingpad instruction,
> but I presume that your intent is that this does not change from the current scheme where
> the "llvm.eh.typeid.for()" is called and its result is compared with the landingpad instruction's
> result...
> 
> ...and then a miracle happens in CodeGen, and most of the intrinsics are thrown away and the
> hard register contents at the resumption at a landingpad from an Unwind include the value that
> llvm.eh.typeid.for() would have returned...

The miracle is just that calls to llvm.eh.typeid.for are replaced with constant
values after all interprocedural optimizations are finished.  Unfortunately, since
the range of constants is global over the function, there is no other
reasonable way to do this while maintaining correctness across inlining
and dead code elimination.

> Also, what is going to happen for the case of cleanup AND catches, currently the result of not
> only the llvm.eh.select() result is cached, but in fact the complete decoding of it relative to
> all the llvm.eh.typeid.for() calls is cached, then the cleanup code executed, THEN finally the
> already decoded value is used to "switch" from the landing pad to the correct catch-block.
> 
> who is going to generate all that code, is it  still going to be explicit in the IR, or is CodeGen going
> to now be responsible creating it.

It will still be explicit in the IR.
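
A rough C++ model of what "explicit in the IR" means for a landing pad with
both cleanups and catches.  This is only a sketch under assumptions: the
Selector constants stand in for whatever values llvm.eh.typeid.for ends up
with, and landing_pad is a hypothetical function, not generated code.  It
assumes the usual convention that positive selector values pick a catch
clause and 0 means cleanup only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the constants that replace llvm.eh.typeid.for.
enum Selector : int { kCleanupOnly = 0, kCatchInt = 1, kCatchString = 2 };

// Models the code following a landing pad: keep the selector, run the
// cleanups, then use the saved selector to pick a handler or resume.
inline std::vector<std::string> landing_pad(int selector) {
    std::vector<std::string> trace;
    const int saved = selector;        // selector produced by the landing pad
    trace.push_back("cleanup");        // destructors for the try scope
    if (saved == kCatchInt) {
        trace.push_back("catch int");
    } else if (saved == kCatchString) {
        trace.push_back("catch string");
    } else {
        trace.push_back("resume");     // no handler here: keep unwinding
    }
    return trace;
}
```

Every step, including the comparison against the type-id constants, is
ordinary code that the front end emits; nothing is left for CodeGen to
invent.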

> 6) it would be nice if the existing UnwindInst could be retained.  I wince at naming an instruction
> "Resume" since in the English language it is so ambiguous (resume normal execution following
> the conclusion of handling an exception, versus resume throwing an exception).   I.e., cosmetics
> do matter.

I would be fine with still calling resume "unwind", but the new instruction
does need to carry extra information.

> 7) there are still lots of other intrinsics/routines involved:
> 	__cxa_allocate_exception
> 	__cxa_throw,   __cxa_rethrow
> 	__cxa_begin_catch(),    __cxa_end_catch
> although these particular ones seem to be the easiest to document as they do seem to be
> translated verbatim (no CodeGen miracles).

These are not intrinsics, and it's not our responsibility to document them.
If you're borrowing the Itanium C++ EH routines to implement exceptions
in your own language, then you need to understand how Itanium C++ EH
works, and you should read their documentation.

> 8)  I really like the idea of "terminate" being one of the options to the landingpad
> instruction, it makes identification of abnormal code more direct (otherwise control-
> flow analysis has to be done to see if __terminate() is reachable to conclude that
> something is abnormal code, and I really don't like that analysis, it seems too error-
> prone as __terminate() might be reachable for other reasons (not that I have come
> up with such a scenario yet, but I think I might be able to), and this conclusion would
> then be ambiguous).

__gxx_personality_v0 can only do its special-case terminate encoding
in the LSDA if that's the only possible handler.  That means that, for
correctness under inlining, front-ends targeting that personality will
still always need their landing pads to contain explicit calls to
std::terminate().

John.