Patch to fix IRGen crash using lambda expression as value in dictionary literal

Wed Sep 17 17:31:24 PDT 2014

On Sep 17, 2014, at 4:55 PM, jahanian <fjahanian at apple.com> wrote:
> Begin forwarded message:
>>> Hmm.  We’ve had a number of similar bugs in the past, and usually the fix is to enter a new expression evaluation context in the right place.  The right place here would be around the emission of the copy-initialization expression, not around the array/dictionary elements.  Doing that should be good enough to fix this in general, I think, but I’d like Doug and Richard to weigh in because there’s weirdness afoot.
>>> 
>>> So, the interesting code here is in BuildBlockForLambdaConversion, at the bottom of SemaLambda.cpp.  This function is used to implement the conversion of a lambda to a block pointer, which occurs (1) within the implementation of the non-standard conversion operator to block-pointer type that we add to all lambda types and (2) as a special case of applying that conversion when the source is a LiteralExpr.  Let’s call case (2) the Literal Hack.  The Literal Hack solves some problems with block memory management which would otherwise bite us, but it’s also the code path that’s causing this bug.
>>> 
>>> ExprResult Sema::BuildBlockForLambdaConversion(SourceLocation CurrentLocation,
>>>                                               SourceLocation ConvLocation,
>>>                                               CXXConversionDecl *Conv,
>>>                                               Expr *Src) {
>>>  …
>>>   ExprResult Init = PerformCopyInitialization(
>>>                      InitializedEntity::InitializeBlock(ConvLocation, 
>>>                                                         Src->getType(), 
>>>                                                         /*NRVO=*/false),
>>>                      CurrentLocation, Src);
>>>  if (!Init.isInvalid())
>>>    Init = ActOnFinishFullExpr(Init.get());
>>>  ...
>>> 
>>> This Init expression will become the copy-init expression in the block's capture of the lambda (its own capture), which is actually used in two different places.  The first is that it’s emitted during the IR-generation of the block literal expression in order to initialize the corresponding capture field; the second is that it's emitted within the block’s copy-helper function in order to initialize the destination capture field.  In this latter situation, IRGen first binds the address of all the captured local variables to the corresponding field in the block.  It’s because of this second use that we generally want to enter a new expression evaluation context around emitting the copy-init expression of a capture, because an arbitrary class’s copy constructor might require arbitrary stuff in default arguments, including the killer cases that absolutely require proper tracking of cleanups.  (Ordinary C++ temporaries themselves are not problematic because we can conservatively say that any expression might have C++ cleanups without causing any lasting harm.)
>>> 
>>> Note that the Src expression parameter to BuildBlockForLambdaConversion becomes a part of this Init expression, and it’s already been emitted;  in the normal case, we could just pass in a std::function to generate it, but in the Literal Hack case we can’t, because we don’t know that we’re doing the Literal Hack until after we’ve fully parsed that expression and the enclosing expression and decided that we need to use the conversion operator.  So if Src could be an arbitrary expression, we’d be in deep trouble, and I think we’d need to split the two use cases and maintain two separate copy-init expressions.
>>> 
>>> Fortunately, it can’t.  In the normal case it’ll be a DeclRefExpr, and in the Literal Hack case it’ll be a LambdaExpr, which has some arbitrarily complex sub-expressions to do capturing but which wraps them all up as full-expressions.  So I think we can just wrap the construction in an EnterExpressionEvaluationContext, flag that there are cleanups in the new context if the lambda type has a non-trivial destructor, and then bind that up as its own full-expression.
>>> 
>>> This is slightly wrong from a modeling perspective because it will put the LambdaExpr inside an ExprWithCleanups, which in principle limits the lifetime of that temporary too much, but the generated code has the correct semantics because we’ll elide the lambda temporary directly into the block’s capture field, and the block's lifetime is at least as long as the lambda temporary’s.
>>> 
>>> So I think the right fix is basically:
>>> 
>>> ExprResult Init;
>>> {
>>>  EnterExpressionEvaluationContext evalContext(…);
>>>  ExprHasCleanups = !Lambda->hasTrivialDestructor();
>>>  Init = PerformCopyInitialization(…);
>>>  if (!Init.isInvalid())
>>>    Init = ActOnFinishFullExpr(Init.get());
>>> }
>>> 
>>> But I’d like to know if I’m missing some way in which arbitrary subexpression structure can show up in Src.
>> 
>> Yeh. This change (with minor correction as attached) has the desired effect of postponing cleanup code for the Literal hack until the outer-most expression is 
>> seen. But, I wait if there are more comments.
>> 
>> 	
> <patch.txt>
> 
> If there is no additional comment or feedback, I would like to check this in.

Unfortunately, I was just talking about this with Doug, and there’s a C++14 feature which basically dooms this approach, sorry.  Block captures need to have separate “here’s how you capture the variable for the first time” and “here’s how you copy the capture” expressions.  The test case is something like:

void test() {
  int x = 0;
  int y = 1;
  void (^block)() = [z=x+y] {};
}

The copy expression cannot use the lambda expression because it cannot refer to x and y.

John.