[cfe-dev] Lambda expr AST representation

Thu Oct 4 14:05:50 PDT 2012

On 04/10/2012 21:26, Eli Friedman wrote:
> On Thu, Oct 4, 2012 at 11:51 AM, Abramo Bagnara
> <abramo.bagnara at bugseng.com> wrote:
>> On 04/10/2012 20:23, Eli Friedman wrote:
>>> On Thu, Oct 4, 2012 at 4:05 AM, Abramo Bagnara
>>> <abramo.bagnara at bugseng.com> wrote:
>>>>
>>>> Despite what is written in C++11 5.1.2p7:
>>>>
>>>> The lambda-expression’s compound-statement yields the function-body
>>>> (8.4) of the function call operator, but for purposes of name lookup
>>>> (3.4), determining the type and value of this (9.3.2) and transforming
>>>> id-expressions referring to non-static class members into class member
>>>> access expressions using (*this) (9.3.1), the compound-statement is
>>>> considered in the context of the lambda-expression.
>>>>
>>>> clang currently inserts a DeclRefExpr in its AST instead of the correct
>>>> MemberExpr, as the following typescript shows:
>>>>
>>>> $ cat p.cc
>>>> int f(int a) {
>>>>   return [a]()->int { return a; }();
>>>> }
>>>> $ _clang++ -cc1 -ast-dump -std=c++0x p.cc
>>>> typedef __int128 __int128_t;
>>>> typedef unsigned __int128 __uint128_t;
>>>> typedef __va_list_tag __builtin_va_list[1];
>>>> int f(int a) (CompoundStmt 0x4629a50 <p.cc:1:14, line:3:1>
>>>>   (ReturnStmt 0x4629a30 <line:2:3, col:35>
>>>>     (CXXOperatorCallExpr 0x46299b0 <col:10, col:35> 'int'
>>>>       (ImplicitCastExpr 0x4629998 <col:34, col:35> 'auto (*)(void) const -> int' <FunctionToPointerDecay>
>>>>         (DeclRefExpr 0x4629910 <col:34, col:35> 'auto (void) const -> int' lvalue CXXMethod 0x4629580 'operator()' 'auto (void) const -> int'))
>>>>       (ImplicitCastExpr 0x4629a18 <col:10, col:33> 'const class <lambda at p.cc:2:10>' <NoOp>
>>>>         (LambdaExpr 0x4629748 <col:10, col:33> 'class <lambda at p.cc:2:10>'
>>>>           (ImplicitCastExpr 0x46296b0 <col:11> 'int' <LValueToRValue>
>>>>             (DeclRefExpr 0x4629688 <col:11> 'int' lvalue ParmVar 0x45fbf00 'a' 'int'))
>>>>           (CompoundStmt 0x4629728 <col:21, col:33>
>>>>             (ReturnStmt 0x4629708 <col:23, col:30>
>>>>               (ImplicitCastExpr 0x46296f0 <col:30> 'int' <LValueToRValue>
>>>>                 (DeclRefExpr 0x46296c8 <col:30> 'const int' lvalue ParmVar 0x45fbf00 'a' 'int')))))))))
>>>>
>>>> Although I'm aware that these DeclRefExprs are handled specially in
>>>> CodeGen, I think this behavior should be considered a defect of the AST.
>>>
>>> Despite the reference to "transforming id-expressions" in the
>>> standard, the ASTs were intentionally designed the way they are
>>> because an expression in a lambda acts more like a reference to the
>>> original variable in terms of semantic analysis than some sort of
>>> member reference expression.
>>
>> I'm surprised by this statement: my message is specifically about having
>> a properly built AST from a semantic analysis point of view. AFAIK
>> references to captured variables are *indeed* references to record fields
>> and not to the original variable: e.g. if the original variable, captured
>> by value, changes after the lambda class (closure type) instance is
>> created and before operator() is called, the value that should be seen is
>> that of the field of the lambda class instance and not the value of the
>> captured variable.
>>
>> Am I missing something?
>>
>>> The alternative involves some entirely
>>> new AST nodes to keep around the relevant semantic information, and
>>> from my perspective that would just bloat the AST without any
>>> substantial benefit.
>>
>> Can you explain which semantic information?
> 
> Hmm... maybe the current implementation makes more sense to me because
> I implemented large parts of it, but I don't think of references to
> captured variables like normal member variables... I think of them
> more like references to the original variable from the perspective of
> inside the lambda.  The whole implementation was a bit colored by the
> existing implementation of the Apple blocks extension, where it isn't
> as clear-cut that there's actually an in-memory object containing the
> relevant members.
> 
> There are two reasons we'd need new AST nodes: one, we have to treat
> the "implicit this" differently from the implicit this for normal
> class members, and two, we would need a different kind of member
> reference expression to track the original variable referred to.
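
To make the by-value point concrete, here is a minimal sketch in plain
C++11. The struct Closure is a hand-written stand-in for the closure type
(its name and field name are my own, purely illustrative, and not what
clang generates); both the real lambda and the stand-in should observe the
copy stored at capture time, not the later value of the original variable.

```cpp
#include <cassert>

// Hypothetical hand-written equivalent of the closure type for
//   [a]() -> int { return a; }
// The body reads a field of the closure object through (*this).
struct Closure {
    int a;  // copy of the captured variable, made when the closure is created
    int operator()() const { return this->a; }
};

// Creates the lambda, then changes the original variable before the call.
int call_lambda_after_change() {
    int a = 1;
    auto lambda = [a]() -> int { return a; };
    a = 2;            // modifies the original variable only
    return lambda();  // reads the closure's own copy
}

// Same sequence using the hand-written stand-in.
int call_manual_after_change() {
    int a = 1;
    Closure c{a};
    a = 2;
    return c();
}

// Both calls see the value captured at creation time.
void self_check() {
    assert(call_lambda_after_change() == 1);
    assert(call_manual_after_change() == 1);
}
```

This only illustrates the language semantics; it says nothing about which
AST node is the better representation.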

I'd suggest a slightly different path:

1) each FieldDecl of the closure type carries the name (actually a
pseudo-name) of the captured variable ("this" for a captured this)

2) the FieldDecl uses a bit to record that it is a field of a closure
type (which means it is actually unnamed)

In this way source pretty-printing is easily doable, the semantic
info is accurate, no new AST node is needed, and CodeGen is simpler (it
does not need to map DeclRefExpr to MemberExpr).
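
A rough sketch of the idea, using toy types only: ToyFieldDecl,
ToyMemberExpr and prettyPrint are hypothetical names invented for this
illustration and are not clang's classes or API. It assumes a member
reference node pointing at a field that carries the pseudo-name plus the
proposed bit, and shows how a printer could recover the source spelling.

```cpp
#include <string>

// Toy stand-in for a field declaration (NOT clang's FieldDecl).
struct ToyFieldDecl {
    std::string pseudoName;  // name of the captured variable, or "this"
    bool isClosureField;     // the proposed bit: the field is really unnamed
};

// Toy stand-in for a member reference (NOT clang's MemberExpr).
struct ToyMemberExpr {
    const ToyFieldDecl *member;
    bool implicitThis;       // true when the base object is an implicit this
};

// Prints a member reference the way it was spelled in the source.
std::string prettyPrint(const ToyMemberExpr &e) {
    if (e.member->isClosureField) {
        // Inside a lambda body the user wrote just the captured name.
        return e.member->pseudoName;
    }
    // Ordinary class member: spell out the base unless it was implicit.
    std::string prefix = e.implicitThis ? "" : "this->";
    return prefix + e.member->pseudoName;
}
```

With this shape the node that references the capture is an ordinary member
reference, and only the printer needs to look at the bit.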

Have I forgotten something?

-- 
Abramo Bagnara

BUGSENG srl - http://bugseng.com
mailto:abramo.bagnara at bugseng.com