[cfe-dev] Getting DecompositionDecl from BindingDecl

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 3 11:02:42 PDT 2018



On 4/2/18 6:25 PM, Richard Smith wrote:
> On 2 April 2018 at 18:04, Artem Dergachev via cfe-dev 
> <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>
>     On 4/2/18 2:11 PM, Richard Smith via cfe-dev wrote:
>>     On 2 April 2018 at 13:52, George Karpenkov via cfe-dev
>>     <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>>
>>         Hi Richard,
>>
>>         Thanks for your reply!
>>
>>>         On Mar 30, 2018, at 8:49 PM, Richard Smith
>>>         <richard at metafoo.co.uk <mailto:richard at metafoo.co.uk>> wrote:
>>>
>>>
>>>
>>>             Right, sure, but I don’t have a convenient way to find
>>>             that DecompositionDecl from a given BindingDecl,
>>>             and sometimes I need to act based on the BindingDecl alone.
>>>
>>>
>>>         Can you give an example?
>>
>>         Let’s say a structured binding is written into, and then some
>>         function is called, and then we read from that binding again.
>>         If it was global, we have to assume any function call can
>>         invalidate it, but if it’s local and it wasn’t passed as a
>>         parameter,
>>         chances are it will remain the same.
>>
>>
>>     I think that will just work if you expand references to
>>     BindingDecls into their binding expressions: both before and
>>     after, you'll end up evaluating an lvalue denoting the same
>>     subobject of the DecompositionDecl.
>>
>>>         It was chosen because it matches the semantic model desired
>>>         by the designers of the feature. For example, you can't
>>>         handle bitfields if you model structured bindings as
>>>         reference variables.
>>>
>>>         This was a somewhat controversial decision, but it doesn't
>>>         look like it's going to be reversed.
>>>
>>>>             That's what the CFG for the above function should
>>>>             represent.
>>>
>>>             Sorry, I’m not sure what do you mean here: from my
>>>             understanding, CFG does not transform AST inside
>>>             statements (apart from maybe tiny syntactic things).
>>>
>>>
>>>         IIRC, the CFG expands CXXDefaultArgExprs and
>>>         CXXDefaultInitExprs; this case is analogous to those.
>>
>
>     Thank you for the explanation! I was also very confused.
>
>     We do expand CXXDefaultInitExprs for sema warnings CFG, but not
>     for the CFG analyzer. We don't expand CXXDefaultArgExprs yet.
>
>     I have a concern, which is probably invalid, about expanding
>     expressions that can appear more than once. So far in a lot of
>     places (in the analyzer in particular) we've been relying on every
>     expression appearing only once within a single CFG, which would no
>     longer be the case if we expand CXXDefaultArgExprs or
>     DeclRefExprs-to-BindingDecls.
>
>     In particular, in the analyzer it's common to keep track of "the
>     value of the expression". We are unable to handle the situation
>     when the same expression, within the same call stack frame, in the
>     same moment of time, has two different values. For BindingDecl
>     this shouldn't be a problem because its expression will always
>     have the same value.
>
>
> Yes, a BindingDecl should always evaluate to the same lvalue 
> throughout its lifetime (at least, per the current language rules; 
> there's been some discussion of changing the semantics so that 
> get<N>(e) is evaluated each time the binding is named, but so far that 
> looks unlikely to happen).
>
>     For CXXDefaultArgExpr it seems to also not be a problem because it
>     is always surrounded by its respective function call, which is
>     already the next moment of time - and we won't expand the same
>     CXXDefaultArgExpr twice in the same call. Even if
>     CXXDefaultArgExpr calls itself recursively, we're not on the same
>     stack frame anymore.
>
>
> Interesting, you evaluate default arguments in the callee's stack frame?

We evaluate the default arguments in the caller stack frame before 
jumping into the callee stack frame, and call destructors for them after 
we're back to the callee stack frame. Or rather plan to evaluate, 
because right now we aren't doing very sensible things anyway when the 
argument is not an ICE. I mean, we're not yet evaluating default 
arguments, but we're already having a very conservative evaluation of 
their respective destructors, which already suffers a bit from the 
expression agglutination problem because it relies on the 
CXXBindTemporaryExpr in their initializer expression.

In the analyzer we not only have "stack frames", but we have a 
generalized notion of "location context" which might represent a stack 
frame (which is necessary to avoid collisions of expression values 
during recursion) but it might as well represent an if-statement branch 
or a single loop iteration (it still helps us track local variable 
lifetime; this isn't in fact implemented yet, but it was the original 
design). And we evaluate all expressions keeping in mind that we're in a 
certain location that is represented by the stack of "location 
contexts", which helps us de-duplicate expressions. So the solution of 
having a special "location context" for every CXXDefaultArgExpr, so that 
actual initialization expressions were isolated in their own environment 
and didn't collide, looks quite reliable / flexible / future-proof.

> (We've historically had problems with this in the constant expressoin 
> evaluator, where we're now numbering "versions" of variables and 
> temporaries to distinguish them in situations like this -- see 
> https://reviews.llvm.org/D42776). It's not completely clear to me that 
> you can't need to have the same CXXDefaultArgExpr live twice at the 
> same time; a construct like:
>
> int f(int a = f(0), int b = g());
> f();

Hmm, right, this sounds really scary. We aren't defended from this sort 
of things.

> ... would trigger that if it were valid, and it's conceivable that 
> there are variants of that which are valid.
>
>     Am i understanding correctly that we're not ever going to be
>     required to maintain two different values for CXXDefaultArgExpr
>     sub-expressions or for binding sub expressions? Is there something
>     that protects us from such situations, apart from "that's where we
>     are at the moment"?
>
>
> Good question. I don't think so. And we already can't rely on that for 
> CXXDefaultInitExprs.

CXXDefaultInitExprs look safe to me because every member is initialized 
just once within a single constructor, and it also indeed stays within 
the constructor's stack frame together with their CXXCtorInitializers.

>     The other problem i'm immediately seeing in the analyzer is to
>     prevent it from thinking that we've found an infinite loop when we
>     visit the same expression twice without any change in the state of
>     the program (which is a known but presumably relatively harmless
>     bug in our current CXXDefaultArgExpr handling). But that should be
>     relatively easy to resolve.
>
>     There is also a certain amount of "unknown unknowns" here, because
>     i've no idea in how many other places we rely upon, or will need
>     to rely upon every expression appearing only once.
>
>     The alternative i've been considering for now was to provide "fake
>     stack frames" for evaluating such expandable expressions instead
>     of expanding them directly into the CFG. This would be an
>     analyzer-side fix, but it may be fine if other users of the CFG
>     are fine with expanding.
>
>     It might be also possible to deep-copy the expressions when
>     expanding. But in case of CXXDefaultArgExpr it would assume
>     deep-copying arbitrary expressions, which sounds hard.
>
>
> Yes. Another option might be to stop using the Expr* to identify the 
> CFG element / value, but I suspect that would be a very substantial 
> undertaking?

Yeah, that'd essentially involve deep-converting *every* statement into 
a whole new format of CFG elements and it'd make the CFG much more complex.

If we want to avoid manipulating expressions directly, we need to teach 
our CFG to answer questions like "here's the expression i'm trying to 
evaluate, what CFG elements correspond to its respective 
sub-expressions?", which constitutes a large chunk of the Stmt hierarchy 
complexity. And poor-man's solutions like CFGStmtMap wouldn't cut it as 
an answer because the whole point, to begin with, is to de-duplicate 
potentially repeating expressions, so we can't use those as keys anymore.

>>         Yes, you are right; To be honest, I wasn’t previously
>>         familiar with that part of the codebase.
>>         Yet still, those are very simple replacements, and IIRC those
>>         are the only two.
>>
>>         Supporting structured bindings would be indeed easier if we
>>         could have a simpler AST,
>>         could you give an advice on what would be an equivalent AST
>>         we should rewrite to?
>>
>>
>>     You should rewrite a DeclRefExpr denoting a BindingDecl into the
>>     binding expression (BindingDecl::getBinding()) of that binding.
>>
>>         Would producing a MemberExpr reading from the struct at *use*
>>         time be completely semantically equivalent?
>>
>>
>>     Yes.
>>
>>         My previous impression was that for structured bindings load
>>         from the struct happens when the binding occurs,
>>         not when the actual read is performed.
>>         [though of course such rewrites would impede producing good
>>         diagnostic messages for the user]
>>
>>
>>     The member access happens each time the binding is named, not
>>     up-front.
>>
>>     (However, for a tuple-like class type -- eg, std::tuple<...> or
>>     std::pair<...> -- the binding declarations have expressions that
>>     denote auxiliary variables (BindingDecl::getHoldingVar()) which
>>     are initialized up-front. In order to correctly handle those
>>     cases, when you build the CFG for a DecompositionDecl, you'll
>>     need to walk its bindings, check for holding variables, and also
>>     build CFG for those variables. You can search for
>>     "getHoldingVar()" through the clang codebase to see the other
>>     places this is already done -- in CodeGen and the constant
>>     expression evaluator.)
>>
>>>>             The BindingDecls should not even show up, except as
>>>>             sugar so that clients who care can know that the 'e.x'
>>>>             expression was /written as/ 'v'.
>>>>
>>>>>                 I'd imagine this is best modeled by generating CFG
>>>>>                 nodes for the subexpression on each occurrence of
>>>>>                 such a DeclRefExpr, much like (IIRC) is done for
>>>>>                 CXXDefaultArgExpr and CXXDefaultInitExpr. That
>>>>>                 seems like it would not require too many special
>>>>>                 cases elsewhere.
>>>>>
>>>>>                 (Perhaps the static analyzer needs more than that
>>>>>                 in order to produce more user-friendly diagnostics?)
>>>>
>>>>                 We are not even there yet for structured bindings,
>>>>                 just trying to make the analyzer understand them.
>>>>
>>>>
>>>>
>>>>
>>>>                 _______________________________________________
>>>>                 cfe-dev mailing list
>>>>                 cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>>>                 http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>>                 <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>>>
>>>
>>>             _______________________________________________
>>>             cfe-dev mailing list
>>>             cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>>             http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>>             <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>>
>>
>>
>>         _______________________________________________
>>         cfe-dev mailing list
>>         cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>>         http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>>         <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>>
>>
>>
>>     _______________________________________________
>>     cfe-dev mailing list
>>     cfe-dev at lists.llvm.org  <mailto:cfe-dev at lists.llvm.org>
>>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev  <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>
>
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>     <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180403/26c8ce2f/attachment.html>


More information about the cfe-dev mailing list