[cfe-dev] Lambda expr AST representation

Tue Oct 9 10:33:08 PDT 2012

On Tue, Oct 9, 2012 at 9:16 AM, Abramo Bagnara
<abramo.bagnara at bugseng.com> wrote:

> On 09/10/2012 17:38, Douglas Gregor wrote:
> >
> > On Oct 9, 2012, at 8:11 AM, Abramo Bagnara
> > <abramo.bagnara at bugseng.com> wrote:
> >
> >> On 09/10/2012 15:29, Douglas Gregor wrote:
> >>>
> >>> On Oct 4, 2012, at 2:36 PM, Abramo Bagnara
> >>> <abramo.bagnara at bugseng.com> wrote:
> >>>
> >>>>>> I'd suggest a slightly different path:
> >>>>>>
> >>>>>> 1) the closure type FieldDecl has the name (actually a
> >>>>>> pseudo-name) of the captured variable ("this" to refer to
> >>>>>> captured this)
> >>>>>>
> >>>>>> 2) the FieldDecl uses a bit to represent the fact that
> >>>>>> fields are the fields of the closure type (this means they
> >>>>>> are actually unnamed)
> >>>>>>
> >>>>>> In this way the source pretty printing is easily doable,
> >>>>>> the semantic info is accurate, no new AST node is needed,
> >>>>>> CodeGen is simpler (it does not need to map DeclRefExpr to
> >>>>>> MemberExpr).
> >>>>>>
> >>>>>> Have I forgotten something?
> >>>>>
> >>>>> That could work... although it would be a bit tricky to find
> >>>>> the original captured variable given a MemberExpr of this
> >>>>> sort.
> >>>>
> >>>> I've thought about that, but I couldn't think of a case where
> >>>> this is needed.
> >>>
> >>> It matters a lot for features that care more about the results
> >>> of name lookup than the underlying semantics. For example,
> >>> libclang's clang_findReferencesInFile, which finds all of the
> >>> references to a given declaration, would need to introduce new
> >>> code to map the fields of implicitly-generated MemberExprs back
> >>> to references to a normal variable declaration. In general, these
> >>> tools expect (reasonably, IMO) that a local variable or static
> >>> data member will be referenced with DeclRefExpr, while a
> >>> non-static data member will be referenced with a MemberExpr.
> >>> That's actually a very nice invariant. Doing as you suggest would
> >>> complicate the invariants for these clients, forcing them to deal
> >>> specifically with lambda captures (which they otherwise wouldn't
> >>> have to consider). And if we have to have the complication
> >>> somewhere, I'd rather it be with the more intelligent clients
> >>> that care about semantics, rather than the clients that only care
> >>> about cross-referencing.
> >>
> >> I have a rather different perspective for libclang that IMHO is
> >> more accurate (and consistent with the semantics): the variable in
> >> the body references the capture list entry (implicit or explicit),
> >> while the capture list entry references the captured variable.
> >
> > You're saying it's "more accurate" because it matches more closely
> > with the as-if implementation written in the standard, and I don't
> > dispute that. What I dispute is that exactly modeling the as-if rule
> > in the standard is the right thing for Clang. We're not bound to
> > implement lambda classes via exactly that as-if rule, and we probably
> > shouldn't do so for [&] lambdas anyway (because it's silly to store
> > several references to local variables in the same stack frame). And,
> > as noted above, our ASTs are meant to describe the language (which
> > they certainly do, even if not strictly based on the as-if rule) and
> > are meant to be usable by clients. Your suggestion may make some
> > trivial improvement in the former (for those who want to think in
> > terms of the as-if rule), but complicate other clients. That's not a
> > good trade-off.
>
> I'm rather unconvinced this is a good choice: what about AST Matchers
> that want to find references to a captured variable? I'm sure there are
> also other examples of things that become rather complex if we permit
> such a large deviation of the AST representation from the correct
> semantics (and as far as I know this is almost unprecedented).

Most simple cases "just work" if captured variables are represented as
DeclRefExprs referring to the actual variable. For instance:
cross-referencing, renaming, a "replace with initializer" refactoring, etc.
These tools need no knowledge of lambdas or blocks to work correctly right
now. Essentially, we want to make it easy to find all uses of a
declaration, and requiring all consumers to check whether the variable is a
capture and then to map to the captured entity is not easy.
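
A minimal C++ sketch of that point follows. It is a toy model, not Clang's
API: ToyDecl, ToyRef, RefKind and countReferences are hypothetical names
invented for illustration. It models `int x; auto f = [x] { return x + x; };`
under both representations and runs the same lambda-unaware reference
counter over each.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins (assumptions, not Clang classes): just enough structure to
// model "variables are referenced by DeclRef, fields by Member".
struct ToyDecl { std::string name; };

enum class RefKind { DeclRef, Member };

struct ToyRef {
    RefKind kind;
    const ToyDecl *target;  // the variable for DeclRef, the field for Member
};

// A tool that knows nothing about lambdas: it counts DeclRef nodes that
// point at the given variable.
inline std::size_t countReferences(const std::vector<ToyRef> &body,
                                   const ToyDecl &var) {
    std::size_t n = 0;
    for (const ToyRef &r : body)
        if (r.kind == RefKind::DeclRef && r.target == &var)
            ++n;
    return n;
}

// Current design: both uses of x in the lambda body are DeclRefs to x.
inline std::size_t usesFoundWithDeclRefs() {
    ToyDecl x{"x"};
    std::vector<ToyRef> body = {{RefKind::DeclRef, &x},
                                {RefKind::DeclRef, &x}};
    return countReferences(body, x);
}

// Proposed alternative: both uses become Member refs to a closure field,
// so the same tool finds nothing unless it learns to map field -> variable.
inline std::size_t usesFoundWithMemberRefs() {
    ToyDecl x{"x"};
    ToyDecl field{"x"};  // closure-type field standing in for the capture
    std::vector<ToyRef> body = {{RefKind::Member, &field},
                                {RefKind::Member, &field}};
    return countReferences(body, x);
}
```

In this model the unchanged tool finds both uses in the first case and none
in the second; the extra field-to-variable mapping is the cost every such
consumer would have to pay.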

I don't find your argument about standards-conformance to be persuasive.
The capturing names used inside the lambda *do* refer to the variables
declared outside, right up until the transformation in
[expr.prim.lambda]p17 is applied. But Clang's AST is designed to represent
the program *before* desugaring transformations are applied -- we don't
prematurely lower things to make semantic analysis easier -- so there's
certainly an argument that the current representation is more faithful.
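
As a rough illustration of the before/after distinction, the sketch below
puts a lambda next to a hand-written class of the kind the as-if rule
describes for a by-copy capture. HandWrittenClosure and captured_x are
hypothetical names; the real closure member is unnamed, and this is only an
approximation of the transformation, not what Clang generates.

```cpp
// As written in the source: the name `x` in the body is found by ordinary
// lookup in the enclosing scope, i.e. it names the parameter below.
inline int viaLambda(int x) {
    auto f = [x](int y) { return x + y; };
    return f(1);
}

// Approximately the lowered form: a class holding a copy of x, with the
// body's use of x rewritten into a member access.
struct HandWrittenClosure {
    int captured_x;  // hypothetical name; the real member has no name
    int operator()(int y) const { return captured_x + y; }
};

inline int viaHandWrittenClosure(int x) {
    HandWrittenClosure f{x};
    return f(1);
}
```

Both functions compute the same value; the question in this thread is only
which of the two shapes the AST should expose for the lambda body.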

Finally, it would seem truly strange to rewrite odr-uses as MemberExprs but
leave non-odr-use mentions of variables as DeclRefExprs, and that is the
natural consequence of the approach you are suggesting. (And, as Doug
points out, rewriting references to entities captured by reference would
seem like a bad idea too.)
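
A small sketch of the odr-use split, with hypothetical function names. In
the first lambda `n` is mentioned but not odr-used, so nothing is captured
and there is no closure field a MemberExpr could refer to; in the second,
binding a reference odr-uses `n`, so it must be captured.

```cpp
inline int mentionWithoutCapture() {
    const int n = 5;
    // `n` is read as a constant expression here, which is not an odr-use:
    // the lambda compiles with an empty capture list.
    auto f = [] { return n + 1; };
    return f();
}

inline int odrUseWithCapture() {
    const int n = 5;
    // Binding a reference to `n` is an odr-use, so `n` has to be captured.
    auto g = [n]() -> int {
        const int &r = n;
        return r + 1;
    };
    return g();
}
```

Under the proposed scheme the `n` in the second body would become a
MemberExpr while the `n` in the first would have to stay a DeclRefExpr,
although both spell the same name for the same variable.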

> >>>> 1) the closure type FieldDecl has the name (actually a
> >>>> pseudo-name) of the captured variable ("this" to refer to
> >>>> captured this)
> >>>
> >>> As noted elsewhere, we can't do this exactly. The __name bit
> >>> could work, but I'd prefer that we simply keep these as anonymous
> >>> fields, because the __ names really don't help that much.
> >>
> >> I apologize, it seems I missed where this was clarified. Can you
> >> explain that?
> >
> > The __ names still require you to detect that you're in a lambda and
> > then remove the '__' from the name. You can't simply remove __ from
> > every name, or you'll get bogus results from various system classes
> > that actually do use __.
>
> I meant to name the fields exactly with the captured name (without the
> __ prefix) and "this" for the captured this. These names would be hidden
> from lookup.

How would the AST refer to that captured 'this'? (implicit
this)->"this"->whatever doesn't work, because you lose source fidelity if
the original reference to "this" was implicit. Again, this seems to add
more complexity.
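
For illustration, the two spellings in question look like this (Counter and
the helper functions are hypothetical names). Both lambdas capture `this`
and behave identically, but only the second writes `this` explicitly, and a
source-faithful AST has to keep the two apart.

```cpp
struct Counter {
    int value = 0;

    // `value` is reached through an implicit `this` inside the lambda.
    int bumpImplicit() {
        auto f = [this] { return ++value; };
        return f();
    }

    // Same effect, but `this` is spelled out in the source.
    int bumpExplicit() {
        auto f = [this] { return ++this->value; };
        return f();
    }
};

inline int implicitThisResult() {
    Counter c;
    return c.bumpImplicit();
}

inline int explicitThisResult() {
    Counter c;
    c.bumpImplicit();
    return c.bumpExplicit();
}
```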

> >> It would be a pity not to have such names in FieldDecl; they would
> >> make the pretty printer's job very easy.
> >
> > Adding a
> >
> > VarDecl *getCapturedVariable(FieldDecl *)
> >
> > for lambda classes would make the pretty printer's job trivial.
> > Besides, what's a pretty-printer doing printing the generated lambda
> > class?
>
> The body of the lambda class's operator() should be printed.

So you only need the fields to have names because you're proposing that we
build MemberExprs referring to them, right? That is a complexity not
present in the current design.