Patch to fix IRGen crash using lambda expression as value in dictionary literal

John McCall rjmccall at apple.com
Tue Sep 16 19:20:26 PDT 2014


On Sep 16, 2014, at 3:56 PM, jahanian <fjahanian at apple.com> wrote:
> Hi John,
> 
> This test case:
> 
> id test_dict()
> {
>         return @{
>                 @"a": [](){},
>                 @"b": [](){}
>         };
> }
> 
> Crashes in IRGen because ExprWithCleanups for the dummy BlockDecl we use to facilitate lambda to block conversion
> is at the wrong place.
> This happens because a lambda expression is considered a full expression so it calls MaybeCreateExprWithCleanups
> (see Sema::BuildBlockForLambdaConversion).
> which ends up adding the clean up code for the BlockDecl of the first lambda to the expression for the 2nd lambda!
> 
> Ideally, MaybeCreateExprWithCleanups should be called when the entire dictionary literal is seen. But in this case,
> this does not happen. Attached patch fixes this by making sure that after ‘every’ parse of the ‘value’ expression
> MaybeCreateExprWithCleanups is called. This, in effect, assumes that the ‘value’ literal is a full expression.
> I don’t know the ramification of this change in the common case as couple of non-lambda literal tests broke.
> (it now generates the none-exc version of call instead of exc versions in couple of places).
> 
> Please review. Let me know if you need additional info.

Hmm.  We’ve had a number of similar bugs in the past, and usually the fix is to enter a new expression evaluation context in the right place.  The right place here would be around the emission of the copy-initialization expression, not around the array/dictionary elements.  Doing that should be good enough to fix this in general, I think, but I’d like Doug and Richard to weigh in because there’s weirdness afoot.

So, the interesting code here is in BuildBlockForLambdaConversion, at the bottom of SemaLambda.cpp.  This function is used to implement the conversion of a lambda to a block pointer, which occurs (1) within the implementation of the non-standard conversion operator to block-pointer type that we add to all lambda types and (2) as a special case of applying that conversion when the source is a LiteralExpr.  Let’s call case (2) the Literal Hack.  The Literal Hack solves some problems with block memory management which would otherwise bite us, but it’s also the code path that’s causing this bug.

  ExprResult Sema::BuildBlockForLambdaConversion(SourceLocation CurrentLocation,
                                                 SourceLocation ConvLocation,
                                                 CXXConversionDecl *Conv,
                                                 Expr *Src) {
    …
     ExprResult Init = PerformCopyInitialization(
                        InitializedEntity::InitializeBlock(ConvLocation, 
                                                           Src->getType(), 
                                                           /*NRVO=*/false),
                        CurrentLocation, Src);
    if (!Init.isInvalid())
      Init = ActOnFinishFullExpr(Init.get());
    ...

This Init expression will become the copy-init expression in the block's capture of the lambda (its own capture), which is actually used in two different places.  The first is that it’s emitted during the IR-generation of the block literal expression in order to initialize the corresponding capture field; the second is that it's emitted within the block’s copy-helper function in order to initialize the destination capture field.  In this latter situation, IRGen first binds the address of all the captured local variables to the corresponding field in the block.  It’s because of this second use that we generally want to enter a new expression evaluation context around emitting the copy-init expression of a capture, because an arbitrary class’s copy constructor might require arbitrary stuff in default arguments, including the killer cases that absolutely require proper tracking of cleanups.  (Ordinary C++ temporaries themselves are not problematic because we can conservatively say that any expression might have C++ cleanups without causing any lasting harm.)

Note that the Src expression parameter to BuildBlockForLambdaConversion becomes a part of this Init expression, and it’s already been emitted;  in the normal case, we could just pass in a std::function to generate it, but in the Literal Hack case we can’t, because we don’t know that we’re doing the Literal Hack until after we’ve fully parsed that expression and the enclosing expression and decided that we need to use the conversion operator.  So if Src could be an arbitrary expression, we’d be in deep trouble, and I think we’d need to split the two use cases and maintain two separate copy-init expressions.

Fortunately, it can’t.  In the normal case it’ll be a DeclRefExpr, and in the Literal Hack case it’ll be a LambdaExpr, which has some arbitrarily complex sub-expressions to do capturing but which wraps them all up as full-expressions.  So I think we can just wrap the construction in an EnterExpressionEvaluationContext, flag that there are cleanups in the new context if the lambda type has a non-trivial destructor, and then bind that up as its own full-expression.

This is slightly wrong from a modeling perspective because it will put the LambdaExpr inside an ExprWithCleanups, which in principle limits the lifetime of that temporary too much, but the generated code has the correct semantics because we’ll elide the lambda temporary directly into the block’s capture field, and the block's lifetime is at least as long as the lambda temporary’s.

So I think the right fix is basically:

  ExprResult Init;
  {
    EnterExpressionEvaluationContext evalContext(…);
    ExprHasCleanups = !Lambda->hasTrivialDestructor();
    Init = PerformCopyInitialization(…);
    if (!Init.isInvalid())
      Init = ActOnFinishFullExpr(Init.get());
  }

But I’d like to know if I’m missing some way in which arbitrary subexpression structure can show up in Src.

John.



More information about the cfe-commits mailing list