[cfe-dev] Lambda expr AST representation

Eli Friedman eli.friedman at gmail.com
Thu Oct 4 14:11:07 PDT 2012


On Thu, Oct 4, 2012 at 2:05 PM, Abramo Bagnara
<abramo.bagnara at bugseng.com> wrote:
> Il 04/10/2012 21:26, Eli Friedman ha scritto:
>> On Thu, Oct 4, 2012 at 11:51 AM, Abramo Bagnara
>> <abramo.bagnara at bugseng.com> wrote:
>>> Il 04/10/2012 20:23, Eli Friedman ha scritto:
>>>> On Thu, Oct 4, 2012 at 4:05 AM, Abramo Bagnara
>>>> <abramo.bagnara at bugseng.com> wrote:
>>>>>
>>>>> Despite what is written in C++11 5.1.2p7:
>>>>>
>>>>> The lambda-expression’s compound-statement yields the function-body
>>>>> (8.4) of the function call operator, but for purposes of name lookup
>>>>> (3.4), determining the type and value of this (9.3.2) and transforming
>>>>> id-expressions referring to non-static class members into class member
>>>>> access expressions using (*this) (9.3.1), the compound-statement is
>>>>> considered in the context of the lambda-expression.
>>>>>
>>>>> currently clang inserts a DeclRefExpr in its AST instead of the correct
>>>>> MemberExpr, as the following typescript shows:
>>>>>
>>>>> $ cat p.cc
>>>>> int f(int a) {
>>>>>   return [a]()->int { return a; }();
>>>>> }
>>>>> $ _clang++ -cc1 -ast-dump -std=c++0x p.cc
>>>>> typedef __int128 __int128_t;
>>>>> typedef unsigned __int128 __uint128_t;
>>>>> typedef __va_list_tag __builtin_va_list[1];
>>>>> int f(int a) (CompoundStmt 0x4629a50 <p.cc:1:14, line:3:1>
>>>>>   (ReturnStmt 0x4629a30 <line:2:3, col:35>
>>>>>     (CXXOperatorCallExpr 0x46299b0 <col:10, col:35> 'int'
>>>>>       (ImplicitCastExpr 0x4629998 <col:34, col:35> 'auto (*)(void) const -> int' <FunctionToPointerDecay>
>>>>>         (DeclRefExpr 0x4629910 <col:34, col:35> 'auto (void) const -> int' lvalue CXXMethod 0x4629580 'operator()' 'auto (void) const -> int'))
>>>>>       (ImplicitCastExpr 0x4629a18 <col:10, col:33> 'const class <lambda at p.cc:2:10>' <NoOp>
>>>>>         (LambdaExpr 0x4629748 <col:10, col:33> 'class <lambda at p.cc:2:10>'
>>>>>           (ImplicitCastExpr 0x46296b0 <col:11> 'int' <LValueToRValue>
>>>>>             (DeclRefExpr 0x4629688 <col:11> 'int' lvalue ParmVar 0x45fbf00 'a' 'int'))
>>>>>           (CompoundStmt 0x4629728 <col:21, col:33>
>>>>>             (ReturnStmt 0x4629708 <col:23, col:30>
>>>>>               (ImplicitCastExpr 0x46296f0 <col:30> 'int' <LValueToRValue>
>>>>>                 (DeclRefExpr 0x46296c8 <col:30> 'const int' lvalue ParmVar 0x45fbf00 'a' 'int')))))))))
>>>>>
>>>>> Although I'm aware that these DeclRefExprs are handled specially in
>>>>> CodeGen, I think that this behavior should be considered a defect of the AST.
>>>>
>>>> Despite the reference to "transforming id-expressions" in the
>>>> standard, the ASTs were intentionally designed the way they are
>>>> because an expression in a lambda acts more like a reference to the
>>>> original variable in terms of semantic analysis than some sort of
>>>> member reference expression.
>>>
>>> I'm amazed by this statement: my message is specifically about having a
>>> properly built AST from a semantic analysis point of view. AFAIK
>>> references to captured variables are *indeed* references to record fields
>>> and not to the original variable: e.g. if the original variable captured by
>>> value changes after the lambda class (closure type) instance is created and
>>> before operator() is called, the value that should be seen is the field of
>>> the lambda class instance and not the value of the captured variable.
>>>
>>> Am I missing something?
>>>
>>>> The alternative involves some entirely
>>>> new AST nodes to keep around the relevant semantic information, and
>>>> from my perspective that would just bloat the AST without any
>>>> substantial benefit.
>>>
>>> Can you explain which semantic information?
>>
>> Hmm... maybe the current implementation makes more sense to me because
>> I implemented large parts of it, but I don't think of references to
>> captured variables like normal member variables... I think of them
>> more like references to the original variable from the perspective of
>> inside the lambda.  The whole implementation was a bit colored by the
>> existing implementation of the Apple blocks extension, where it isn't
>> as clear-cut that there's actually an in-memory object containing the
>> relevant members.
>>
>> There are two reasons we'd need new AST nodes: one, we have to treat
>> the "implicit this" differently from the implicit this for normal
>> class members, and two, we would need a different kind of member
>> reference expression to track the original variable referred to.
>
> I'd suggest a slightly different path:
>
> 1) the closure type FieldDecl has the name (actually a pseudo-name) of
> the captured variable ("this" to refer to captured this)
>
> 2) the FieldDecl uses a bit to represent the fact that fields are the
> fields of the closure type (this means they are actually unnamed)
>
> In this way the source pretty printing is easily doable, the semantic
> info is accurate, no new AST node is needed, CodeGen is simpler (it does
> not need to map DeclRefExpr to MemberExpr).
>
> Have I forgotten something?

That could work... although it would be a bit tricky to find the
original captured variable given a MemberExpr of this sort.

-Eli
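
To make the by-value capture point from earlier in the thread concrete, here is a minimal sketch comparing the lambda from p.cc with a hand-written functor that uses an explicit member access. The names HandWrittenClosure, lambda_result and handwritten_result are hypothetical and exist only for this illustration; in particular, the fields of a real closure type are unnamed, so the member name 'a' is an assumption made for readability.

```cpp
#include <cassert>

// Hypothetical hand-written stand-in for the closure type of
//   [a]() -> int { return a; }
// Real closure fields are unnamed; the name 'a' is for illustration only.
struct HandWrittenClosure {
  int a;  // copy of the captured variable, made when the object is built

  // The body reads the field through an explicit member access.
  int operator()() const { return this->a; }
};

// Captures 'a' by value, then modifies the original before the call.
int lambda_result(int a) {
  auto closure = [a]() -> int { return a; };  // 'a' is copied here
  a += 100;                                   // changes only the original
  return closure();                           // reads the stored copy
}

// Same sequence of steps, using the hand-written functor.
int handwritten_result(int a) {
  HandWrittenClosure closure{a};  // 'a' is copied into the field
  a += 100;                       // changes only the original
  return closure();               // reads the field
}
```

Both functions return the value that 'a' had when the closure object was created, not the modified one. So at run time the use of 'a' inside the lambda body behaves like a read of the closure's field, whichever node the AST uses to represent it.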



