[cfe-dev] Lambdas and the ABI?

Tue Mar 21 13:02:34 PDT 2017

> On Mar 21, 2017, at 9:03 AM, Hal Finkel <hfinkel at anl.gov> wrote:
> Hi Richard, Chandler, John, et al.,
> 
> "Quick" question: What aspects, if any, of a C++ lambda (e.g. size, layout, alignment) leak into the ABI or are (potentially) semantically visible?

C++ lambdas are anonymous local types that are constrained to appear in function bodies (including implicit "function" bodies, e.g. the initializers of global variables).  The main semantic rule for local types like lambdas is that they're supposed to be the "same type" in all translation units that see the same function body, meaning (among many other things) that the ABI properties of the type must be the same.  Now, most function bodies can only appear in only a single translation unit, so this rule is trivially satisfied and the compiler has total flexibility.  But the body of an inline or template function can usually appear in multiple translation units, and so the ABI has to be consistent.

> Context...
> 
> We're investigating late lowering/outlining schemes to improve code generation for OpenMP and other parallel programming models. One important advantage of outlining late is that the IR-level optimizer still has access to the pointer-aliasing and loop information from the containing function. There are natural benefits to C++ lambdas as well. Lambdas that are used "in place" (i.e. only have one call site) should always be inlined, so the issues don't come up there, but for lambdas that have multiple call sites, or worse, escape (via std::function or some other type-erasure mechanism), we can get suboptimal optimization results for the body of the lambda. It would seem sad to fix this for OpenMP but not for lambdas.
> 
> However, optimizing before outlining could mean changes in what variables need to be captured (not semantically, but at the IR level), and so I'd like to think through what would constrain the optimizer's freedom to act in this regard.

"Subfunction" representations are very useful for coroutine-esque situations where an inner function's control flow interactions with the outer function are essentially well-understood.  That is, to be useful, you really want the blocks of the inner function to fit into the CFG of the outer function at a unique and appropriate place; otherwise you're just contorting the representation of the outer function in a way that will probably massively hamper optimization.

I can definitely see how there might be many situations in OpenMP that have this kind of CFG knowledge.  Swift has some situations like this as well.  Unfortunately, C++ lambdas and ObjC blocks are not a case where we have this knowledge, because the closure can be passed around and invoked arbitrarily many times and at arbitrary later points, which is not something that can ever fit cleanly into the CFG; pre-emptive "outlining" is just a superior representation for such cases.

John.