[llvm-commits] [llvm] r129235 - in /llvm/trunk: include/llvm/BasicBlock.h include/llvm/Bitcode/LLVMBitCodes.h lib/AsmParser/LLLexer.cpp lib/AsmParser/LLParser.cpp lib/AsmParser/LLToken.h lib/Bitcode/Reader/BitcodeReader.cpp lib/Bitcode/Writer/BitcodeWriter.cpp lib/VMCore/AsmWriter.cpp lib/VMCore/BasicBlock.cpp test/Feature/bb_attrs.ll

Sun Apr 10 19:40:31 PDT 2011

On Apr 10, 2011, at 12:33 AM, Duncan Sands wrote:

> Hi Chris,
> 
>>> This sounds very interesting.  Do you have a description of what you're angling for?  I'd really like to understand and digest the model before you write too much code.
> 
> since Bill's proposal seemed way too complicated to me, I sent in an
> alternative proposal shortly afterwards, here:
> 
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-December/036731.html
> 
> In particular it has a fairly detailed analysis of some of the existing
> problems with LLVM exception handling.
> 
I didn't comment too much on your idea because I think of the problem differently. To me the problem is systemic and not cosmetic. Your solution is cosmetic in nature, and still has the major problems of the current implementation.

For instance, in order to generate good EH tables (this is not to say that the EH code has to be DWARF-specific, but it does need to be rich enough to handle DWARF EH), we need to know all of the information about the call that may throw at two points in the code: (1) at the point where the call is made, and (2) at the point where the decision to execute a catch or cleanup is made. Because of the cleanup code (and even more so inlining), there is a massive disconnect between the call and the decision code, and that connection is almost impossible to regain once it is lost.

This is evident in the DwarfEHPrepare pass, which attempts (in a way that I'm not 100% convinced works correctly for everything) to move the llvm.eh.exception and llvm.eh.selector calls back into the "landing pad", i.e., into the basic block coming off of the `unwind' edge of an invoke instruction. It has to do this because we have an invariant that these intrinsics must be called only from the landing pad and not a post pad. In fact, the presence of these intrinsics is what we use to determine that a landing pad is indeed a landing pad. :-)

This is bad enough, but take into account inlining into the cleanup of an invoke. The inlined code may be arbitrarily complex. Because of that, the point where we make the decision of which catch to execute is separated from the invoke, and it's hard if not impossible to get back the information about where the landing pad really was. If we use the llvm.eh.exception or llvm.eh.selector values, then it must be kept in mind that those values could be stored, loaded, merged in PHI nodes, etc. multiple times between the "landing pad" and the catch decision logic. Not only is this prone to error, but it generates rather ugly code that is hard to understand. In truth, there should never be a reason for the llvm.eh.exception value to be merged in a PHI node, but it routinely is.
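To make the separation concrete, here is a small source-level sketch in C++. It is only an illustration and not part of the proposal: the names Guard, may_throw, and trace are hypothetical. It shows a call that may throw, a cleanup (a destructor, which the inliner is free to expand to arbitrary code), and the catch decision that runs only after that cleanup.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: its destructor stands in for cleanup code that may
// be inlined between the throwing call and the catch decision.
struct Guard {
    std::vector<std::string>& log;
    explicit Guard(std::vector<std::string>& l) : log(l) {}
    ~Guard() { log.push_back("cleanup"); }
};

// Hypothetical callee: throws a different type depending on `kind`.
void may_throw(int kind) {
    if (kind == 1) throw std::runtime_error("boom");
    if (kind == 2) throw 42;
}

// Records the order of events for one call.
std::vector<std::string> trace(int kind) {
    std::vector<std::string> log;
    try {
        Guard g(log);
        log.push_back("call");
        may_throw(kind);            // (1) the call that may throw
        log.push_back("returned");
    } catch (const std::runtime_error&) {
        // (2) the catch decision, reached only after Guard's cleanup ran
        log.push_back("catch runtime_error");
    } catch (int) {
        log.push_back("catch int");
    }
    return log;
}
```

Calling trace(1) records "call", then "cleanup", then "catch runtime_error": the cleanup always sits between the throwing call and the code that picks a handler, which is exactly the gap in which the exception and selector values get stored, reloaded, and merged.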

The strengths of my proposal are the following:

1. The `landingpad' basic block attribute and the landing pad reference in the `dispatch' instruction make explicit the relationship between an invoke and the catch decision code. There is no need for the DwarfEHPrepare pass to move intrinsics around, and there is no need for the passes to guess at what is a landing pad, because it's explicit.

2. It doesn't rely as heavily upon intrinsics. Intrinsics, while very useful, are difficult for passes to reason about. Few of the passes have knowledge of the llvm.eh.exception and llvm.eh.selector intrinsics built into them, so those instructions are free to move about the function. This would be okay except that we rely upon the code "looking" a certain way because the EH metadata is effectively encoded in the AST.

3. It doesn't encode the EH metadata in the AST. Or, more precisely, it prevents passes from destroying that information for us. The invariants that are built into my proposal ensure that the code will be easy to understand when it comes time for code generation.

4. It doesn't change the `invoke' instruction; changing it would introduce incompatibility.

-bw