[PATCH] D11097: New EH representation for MSVC compatibility
listmail at philipreames.com
Wed Jul 22 13:40:21 PDT 2015
On 07/21/2015 05:45 PM, Reid Kleckner wrote:
> rnk added a comment.
> In http://reviews.llvm.org/D11097#209400, @reames wrote:
>> In http://reviews.llvm.org/D11097#209367, @majnemer wrote:
>>> @reames thinking about it a little more, would you be happier with `CatchPadInst` instead of `CatchBlockInst`, etc. and `isEHPad` instead of `isEHBlock`?
> I like the pad terminology. The block terminology came about because we're really imbuing some special properties on the whole BasicBlock. Currently the landingpad instruction is the only instruction with the power to make an edge unsplittable, for example. Anyway, `pad` sounds good to me.
>> That would help. Reusing block causes confusion with the widespread use of "block" to refer to a basic block in the code base.
>> More generally, consistency in the terminology would be a good thing. Let me lay out my current understanding:
>> - A X-"pad" is a place the exceptional control flow might resume.
>> - A catch is a specific type of "pad" which models a language/ABI level catch clause. A catch is exception type specific.
>> - A cleanup is a specific type of "pad" which models scope exit without an explicit exception type.
>> - Both catch and cleanup are statically scoped. There's a begin and end construct for each.
>> In this context, what is a terminateblock? It's clearly a "pad", but why is not just a cleanup?
> Currently, Clang expresses noexcept() as a catch-all clause that simply calls terminate. This is inefficient for both Itanium and MSVC because both LSDA tables can express noexcept with a single bit. We do it because it makes it possible to inline one noexcept function into another EH scope without teaching the inliner that it should transform some function attribute into LLVM IR that makes some C++ runtime function call.
> Terminatepad solves this all by keeping data as data, allowing late IR preparation passes to expand the IR into code (catch-all or filter) or leave it alone if it can be encoded in the table.
Just to be clear, there's no ABI requirement that a terminate block NOT
be a cleanup with a terminate call right? If so, why can't this be
expressed via a late pattern match on the MI level? If it's "just an
optimization", I'm really hesitant to complicate the IR for this corner
>> On the topic of motivation, I still feel like I'm missing a lot. In particular, it feels like the catchpad, cleanuppad, and terminatepad are really close to what's already described by landingpad(*). I get that you need to prevent merging of catch blocks, but why isn't a single "pad fence" at the end of each clause which is unmergeable solve that problem?
>> - We might end up changing how you describe a catch clause, but the idea is already there. You do seem to need different arguments then the existing catch clause bits.
> Yeah, they are the same set of clauses. :) However, with the new instructions, we won't need to do impossibly error-prone pattern matching.
I really don't get this statement. How is a landingpad with a catch
clause instruction fundamentally different then a catchpad? Is it the
fact that landingpads w/compatible clauses be merged? Or is there
something more here?
> How would a "pad fence" statically indicate that after running this cleanup, the next EH action is that catch block over there? Right now we try to figure this out by walking the CFG and trying to find the next conditional branch based on a comparison of the EH selector value. I think the primary failure mode of WinEHPrepare today is that we inline a destructor containing control flow, and then this analysis falls over. I'd rather not have to do dataflow analysis to rediscover this very basic nesting property of the C++ source code.
I was picturturing something like this:
br label %next
The control flow would be separate. We don't currently allow branches
to landingpads, but that seems like a smaller change than adding a bunch
of new instructions.
We'd also need to prevent splitting the padfence from the end of it's
block, but that seems workable. Doing this gives us one new instruction
rather than (catchpadend, catchret, cleanupret).
> Essentially, the new instructions are exactly that: a really strong "pad fence" that keeps all the data and control flow transfers that we want to know about grouped together.
I get that. I'm mostly just concerned that we need this many new
instructions for one new concept in the IR.
More information about the llvm-commits