[cfe-dev] [analyzer] Temporaries.

Thu Apr 12 13:50:23 PDT 2018

On 4/12/18 1:10 PM, Gábor Horváth wrote:
>
>
> On 14 February 2018 at 22:20, Artem Dergachev via cfe-dev 
> <cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>> wrote:
>
>     More explanations on how the analyzer keeps making its way around
>     the C++ AST.
>
>     == Lifetime extension ==
>
>     This is a brain dump of how (and how much) lifetime extension of
>     temporary objects is currently broken in the static analyzert.
>     Spoilers: not too much, it seems, but annoying nevertheless.
>
>     Consider an example:
>
>          1    class C {
>          2    public:
>          3      C() {}
>          4      ~C() {}
>          5    };
>          6
>          7    void foo() {
>          8      const C &c = C();
>          9    }
>
>     With the AST for the variable declaration:
>
>           DeclStmt 0x7fa5ac85cba0 <line:8:3, col:19>
>           `-VarDecl 0x7fa5ac85c878 <col:3, col:18> col:12 c 'const C
>     &' cinit
>             `-ExprWithCleanups 0x7fa5ac85cb30 <col:16, col:18> 'const
>     C' lvalue
>               `-MaterializeTemporaryExpr 0x7fa5ac85cb00 <col:16,
>     col:18> 'const C' lvalue extended by Var 0x7fa5ac85c878 'c' 'const
>     C &'
>                 `-ImplicitCastExpr 0x7fa5ac85cae8 <col:16, col:18>
>     'const C' <NoOp>
>                   `-CXXBindTemporaryExpr 0x7fa5ac85cac8 <col:16,
>     col:18> 'C' (CXXTemporary 0x7fa5ac85cac0)
>                     `-CXXTemporaryObjectExpr 0x7fa5ac85ca88 <col:16,
>     col:18> 'C' 'void ()'
>
>     *here goes a periodic reminder that CXXTemporaryObjectExpr is a
>     sub-class of CXXConstructExpr*
>
>     Notice how MaterializeTemporaryExpr is the innermost expression
>     (the first one in the order of execution) that is an lvalue.
>     Essentially, you can think of it as the mythical
>     "rvalue-to-lvalue" cast that takes in a C++ object rvalue and
>     returns the this-value for that object. Because all C++ objects
>     have a this-value that never changes throughout their lifetime, it
>     is essentially their identity. Otherwise you can't call methods on
>     them.
>
>     MaterializeTemporaryExpr also contains information about the
>     lifetime extension process: we needed the this-value in order to
>     bind it to variable 'c'. You see that in the AST.
>
>     In the analyzer, however, MaterializeTemporaryExpr does a
>     different thing, as a temporary solution (no pun intended). It
>     constructs a new temporary region out of thin air and binds the
>     rvalue object to that temporary in the Store. The respective
>     function in our code is called "createTemporaryRegionIfNeeded". It
>     also has a separate purpose of converting rvalue sub-object
>     adjustments into lvalue sub-object adjustments, which we wouldn't
>     discuss this time.
>
>     Now that we learned how to inline temporary constructors and
>     destructors, it essentially means that the this-value in the
>     constructor and in the destructor would be different. Because
>     first we construct the object into temporary region R1, then we
>     take lazyCompoundVal{R1} to represent the value of
>     CXXTemporaryObjectExpr, then we materialize lazyCompoundVal{R1} to
>     R2, then we bind R2 to variable 'c', then we call the automatic(!)
>     destructor for 'c' which contains R2. To be clear, the region
>     store at the time of destruction would be:
>
>       c: R2,
>       R2: lazyCompoundVal{R1}.
>
>     It means that fields of the object would contain the correct
>     values, there would be the correct number of destructors called
>     (no temporary destructors, just one automatic destructor), but the
>     identity of the object (this-value) would change in the process.
>     Unless the object actually makes any decisions or does any
>     manipulations that involve its this-value, the modeling should be
>     correct. When the object starts to actively use its this-value in
>     its inlined methods, the analysis would do wrong stuff.
>     Additionally, it causes a mess in the checkers when they try to
>     track objects by their this-values - i.e. IteratorChecker has a
>     lot of additional code to work around the fact that the region for
>     the object constantly changes.
>
>     From the above it is clear that MaterializeTemporaryExpr should
>     not construct any new regions, at least not in this case. We
>     already have the correct region, namely R1, which should be re-used.
>
>
> Hi Artem!
>
> We found a strange false positive that might be related to what you 
> describe above but not sure though. Could you take a look?
> It looks like we are seeing a null pointer dereference error, and the 
> null value comes from the destructor which was invoked on a temporary.
> If this is not the case, the path diagnostic might be misleading.
>
> Here is the finding: 
> http://cc.elte.hu:15010/Default/#run=Xerces_Xerces-C_3_2_1_unexplored_first&checker-name=unix.MismatchedDeallocator&checker-name=cplusplus.NewDeleteLeaks&checker-name=core.uninitialized.UndefReturn&checker-name=core.UndefinedBinaryOperatorResult&checker-name=core.NullDereference&checker-name=core.NonNullParamChecker&checker-name=core.DivideZero&checker-name=core.CallAndMessage&tab=allReports&reportHash=2f8548f84329734e674f15d844caeb38&report=6764&subtab=2f8548f84329734e674f15d844caeb38
>
> Regards,
> Gábor

Hmm, i don't fully understand it yet. It seems to me that 
RefHash3KeysIdPoolEnumerator drains the whole pool upon destruction 
(fIdPtrs is not a member of the temporary, it's an external entity), so 
it indeed can't really be safely passed around or copied (i.e. elidable 
copies are not moves).

So i'd easily believe that this code only works because all copies are 
elided, i.e. the code is not portable (at least not until C++17) but 
usually works because most compilers are clever enough. We'll probably 
get rid of such positives when we implement copy elision.

In case i got it all wrong, do you accidentally have a reproducer for me 
to have a closer look? If CTU stuff is hard to provide a reproducer for, 
you should be able to write a small code snippet directly into 
RefHash3KeysIdPool.c to reproduce the problem. I'll also accept a 
trimmed (as in -trim-egraph) exploded graph dot file even if it's huge.

P.S. The "Calling..." icon is hilarious :D
>
>
>     It is tempting to take R1 directly from lazyCompoundVal{R1} - it
>     already has memories about once being a value of R1. I'm not sure
>     it's not going to work - it may work, at least i'm not ready to
>     come up with a counterexample. But the meaning of
>     LazyCompoundVal's parent region is different and coincides with
>     what we want just accidentally. Namely, lazyCompoundVal{R1} is a
>     value of an object that was bound to R1 in some particular moment
>     of time in the past, without any explanation of when this moment
>     of time was - but there's no indication if R1 is the region of the
>     temporary we've just constructed, or a region of an unrelated
>     object that used to have the same value back then. As we'd see
>     later, MaterializeTemporaryExpr doesn't always contain a
>     constructor within it - there are a lot of cases to cover, and if
>     the trick doesn't work even once, it's going to be hard, so i'd
>     probably not going to commit into maintaining this invariant.
>     Though it might be plausible to modify add an SVal kind that does
>     exactly what we mean here - i.e. a value that does indeed
>     correspond to a specific C++ object identified by region. It might
>     be a beautiful solution, but it may also fail miserably if tricky
>     cornercases show up - so i'm not ready to commit to that. Also the
>     correct way of dealing with such values (i.e. for how long are
>     they relevant?) would be extremely difficult to explain to checker
>     developers.
>
>     The more straightforward approach here is to include
>     MaterializeTemporaryExpr (hereinafter MTE) into the construction
>     context. It means, if a temporary that we're about to construct
>     would be lifetime-extended later, we'd rather know about that
>     during construction, and maintain a map in the program state from
>     MTE to their respective temporary regions that were used for
>     representing the respective construction targets. Upon
>     encountering the respective MTE we'd simply retrieve the implicit
>     temporary storage for the value from the program state and declare
>     that temporary region to be the value of the MTE. This would mimic
>     the approach we have with CXXBindTemporaryExprs (hereinafter BTE)
>     and their respective regions that allows temporary destructors to
>     work - but this time it's going to be about
>     MaterializeTemporaryExprs and automatic destructors. I imagine
>     that on the checker side this can potentially be exposed via some
>     sort of State->getTemporaryStorage(Expr) call, but i believe that
>     generally this process should be as transparent to the checkers as
>     possible.
>
>     It sounds as if both of these maps could be eliminated by always
>     making sure that the target temporary is constructed "with" the
>     MTE (for lifetime-extended temproraries) or BTE (for temporaries
>     that require destruction at the end of full-expression). In this
>     case, with the help of construction context-assisted lookahead, we
>     declare that the target of the construction is
>     CXXTempObjectRegion(MTE, LC) or CXXTempObjectRegion(BTE, LC)
>     respectively, rather than CXXTempObjectRegion(CXXConstructExpr).
>     Then during evaluation of MTE or BTE we'd simply construct the
>     same region with the expression we're currently evaluating, and
>     it's automagically going to be the correct region. This approach,
>     however, becomes confusing when we start dealing with elidable
>     constructors (explained below). So for now i believe that it is
>     quite irrelevant which expression is identifying the temporary region.
>
>     == Elidable constructors ==
>
>     While it doesn't sound like an immediately important task to
>     implement copy elision in the analyzer, it may help with making
>     some things easier. And it'd also make some reports fancier, as
>     mentioned in https://reviews.llvm.org/D43144
>     <https://reviews.llvm.org/D43144>.
>
>     Elidable copy-constructors can be explained as a form of lifetime
>     extension. Instead of copying the temporary, they keep using the
>     original value of the temporary, which in some pretty twisted
>     sense means that they are lifetime-extending it to be able to use
>     it. For example, if we modify our example by replacing the
>     lifetime-extending reference variable with a value-type variable:
>
>          1    class C {
>          2    public:
>          3      C() {}
>          4      ~C() {}
>          5    };
>          6
>          7    void foo() {
>          8      C c = C();
>          9    }
>
>     ...then we'd still have an MTE, even though lifetime extension
>     would seem to be gone:
>
>           DeclStmt 0x7fb8f005afb8 <line:8:3, col:12>
>           `-VarDecl 0x7fb8f005ac50 <col:3, col:11> col:5 c 'C' cinit
>             `-ExprWithCleanups 0x7fb8f005afa0 <col:9, col:11> 'C'
>               `-CXXConstructExpr 0x7fb8f005af68 <col:9, col:11> 'C'
>     'void (const C &) noexcept' elidable
>                 `-MaterializeTemporaryExpr 0x7fb8f005af00 <col:9,
>     col:11> 'const C' lvalue
>                   `-ImplicitCastExpr 0x7fb8f005aee8 <col:9, col:11>
>     'const C' <NoOp>
>                     `-CXXBindTemporaryExpr 0x7fb8f005aec8 <col:9,
>     col:11> 'C' (CXXTemporary 0x7fb8f005aec0)
>                       `-CXXTemporaryObjectExpr 0x7fb8f005ae88 <col:9,
>     col:11> 'C' 'void ()'
>
>     In this case the MTE is expressing the fact that the temporary
>     constructed via CXXTemporaryObjectExpr can be "lifetime-extended"
>     (by actually merging it with the stack variable) rather than
>     copied, if the CXXConstructExpr surrounding it would be chosen to
>     be elided. The AST doesn't make the elision choice for us - but is
>     compatible with both choices. The MTE essentially overrides the
>     necessity of immediate destruction provided by the BTE, and lets
>     the surrounding AST decide upon the lifetime of the object.
>
>     Because the analyzer currently does not do copy elision, it will
>     use the MTE only to compute the argument for the elidable
>     copy-constructor, and then perform the copy-construction, and then
>     destroy the original temporary at the end of the full-expression.
>     Note, however, that in this case we need to properly support both
>     the BTE (for the temporary destructor to work) and the MTE (for
>     computing its value). We need to implement the MTE's ability to
>     perform "rvalue-to-lvalue-cast" even if the temporary destruction
>     is still necessary. For this reason, if we rely on constructing
>     temporary regions with the correct BTEs or MTEs, at least one of
>     these tasks becomes impossible to perform.
>
>     If we were to support copy elision, then the
>     CXXTemporaryObjectExpr constructor would go directly into the
>     variable region. For the purposes of modeling, it'd mean that only
>     CXXTemporaryObjectExpr would actually need to be modeled. But this
>     would require additional coding in the construction context to be
>     able to realize that the target is the variable while modeling the
>     CXXTemporaryObjectExpr.
>
>     For the sake of completeness, let's consider the ternary operator
>     example:
>
>          1    class C {
>          2    public:
>          3      C(int) {}
>          4      ~C() {}
>          5    };
>          6
>          7    void foo(int coin) {
>          8      const C &c = coin ? C(1) : C(2);
>          9    }
>
>     The respective AST would be:
>
>           DeclStmt 0x7fc1e20023e0 <line:8:3, col:34>
>           `-VarDecl 0x7fc1e2001dc8 <col:3, col:33> col:12 c 'const C
>     &' cinit
>             `-ExprWithCleanups 0x7fc1e2002370 <col:16, col:33> 'const
>     C' lvalue
>               `-MaterializeTemporaryExpr 0x7fc1e2002340 <col:16,
>     col:33> 'const C' lvalue extended by Var 0x7fc1e2001dc8 'c' 'const
>     C &'
>                 `-ImplicitCastExpr 0x7fc1e2002328 <col:16, col:33>
>     'const C' <NoOp>
>                   `-ConditionalOperator 0x7fc1e20022f8 <col:16,
>     col:33> 'C'
>                     |-ImplicitCastExpr 0x7fc1e2002170 <col:16> 'bool'
>     <IntegralToBoolean>
>                     | `-ImplicitCastExpr 0x7fc1e2002158 <col:16> 'int'
>     <LValueToRValue>
>                     |   `-DeclRefExpr 0x7fc1e2001e28 <col:16> 'int'
>     lvalue ParmVar 0x7fc1e2001c18 'coin' 'int'
>                     |-CXXBindTemporaryExpr 0x7fc1e2002248 <col:23,
>     col:26> 'C' (CXXTemporary 0x7fc1e2002240)
>                     | `-CXXConstructExpr 0x7fc1e2002208 <col:23,
>     col:26> 'C' 'void (const C &) noexcept' elidable
>                     |   `-MaterializeTemporaryExpr 0x7fc1e20021a0
>     <col:23, col:26> 'const C' lvalue
>                     |     `-ImplicitCastExpr 0x7fc1e2002188 <col:23,
>     col:26> 'const C' <NoOp>
>                     |       `-CXXFunctionalCastExpr 0x7fc1e2002078
>     <col:23, col:26> 'C' functional cast to class C
>     <ConstructorConversion>
>                     |         `-CXXBindTemporaryExpr 0x7fc1e2002058
>     <col:23, col:26> 'C' (CXXTemporary 0x7fc1e2002050)
>                     |           `-CXXConstructExpr 0x7fc1e2002018
>     <col:23, col:26> 'C' 'void (int)'
>                     |             `-IntegerLiteral 0x7fc1e2001e60
>     <col:25> 'int' 1
>                     `-CXXBindTemporaryExpr 0x7fc1e20022d8 <col:30,
>     col:33> 'C' (CXXTemporary 0x7fc1e20022d0)
>                       `-CXXConstructExpr 0x7fc1e2002298 <col:30,
>     col:33> 'C' 'void (const C &) noexcept' elidable
>                         `-MaterializeTemporaryExpr 0x7fc1e2002280
>     <col:30, col:33> 'const C' lvalue
>                           `-ImplicitCastExpr 0x7fc1e2002268 <col:30,
>     col:33> 'const C' <NoOp>
>                             `-CXXFunctionalCastExpr 0x7fc1e2002130
>     <col:30, col:33> 'C' functional cast to class C
>     <ConstructorConversion>
>                               `-CXXBindTemporaryExpr 0x7fc1e2002110
>     <col:30, col:33> 'C' (CXXTemporary 0x7fc1e2002108)
>                                 `-CXXConstructExpr 0x7fc1e20020d0
>     <col:30, col:33> 'C' 'void (int)'
>                                   `-IntegerLiteral 0x7fc1e20020b0
>     <col:32> 'int' 2
>
>     Each branch contains two constructors: the temporary and the
>     elidable copy. The temporaries are surrounded with their
>     respective BTEs and copy-elision-kind MTEs, which indicates that
>     they need to be either destroyed as temporaries, or, if copy
>     elision is chosen, have their lifetime decided upon by the
>     surrounding AST. The elidable copy constructors also, being
>     temporaries, have their respective BTEs. Note, however, that there
>     is only one MTE for both BTEs for the elidable constructors.
>
>     So after the conditional operator is resolved (which is the first
>     thing we need to do, according to the CFG), we'd go ahead and
>     perform the constructors, and their trigger would be their
>     respective BTE in the non-elide case, and the single top-level MTE
>     in the elide case. In the non-elide case, copy constructors would
>     be triggered by the top-level MTE.
>
>     It means that, once again, copy elision would prevent us from
>     handling both the BTE and the copy-elision-kind MTE in the single
>     construction, allowing the "predictable target region" trick to
>     work: when we need the temporary destructor, we construct directly
>     into CXXTempObjectRegion of the BTE and it gets automatically
>     picked up during destruction, and when we need the automatic
>     destructor, we construct directly into CXXTempObjectRegion of the
>     MTE and we can easily compute the value of the MTE. But when we
>     don't do copy elision, we'd have to keep at least one of those in
>     the program state map.
>
>     == Return by value ==
>
>     Returning C++ objects by value is actually very similar to
>     constructing it. Consider:
>
>          1    class C {
>          2    public:
>          3      C() {}
>          4      ~C() {}
>          5    };
>          6
>          7    C bar() {
>          8      C c;
>          9      return c;
>         10    }
>         11
>         12    void foo() {
>         13      const C &c = bar();
>         14    }
>
>     With the respective AST for DeclStmt in foo():
>
>           DeclStmt 0x7fe62c84f8e8 <line:13:3, col:21>
>           `-VarDecl 0x7fe62c84f6b8 <col:3, col:20> col:12 c 'const C
>     &' cinit
>             `-ExprWithCleanups 0x7fe62c84f878 <col:16, col:20> 'const
>     C' lvalue
>               `-MaterializeTemporaryExpr 0x7fe62c84f848 <col:16,
>     col:20> 'const C' lvalue extended by Var 0x7fe62c84f6b8 'c' 'const
>     C &'
>                 `-ImplicitCastExpr 0x7fe62c84f830 <col:16, col:20>
>     'const C' <NoOp>
>                   `-CXXBindTemporaryExpr 0x7fe62c84f810 <col:16,
>     col:20> 'C' (CXXTemporary 0x7fe62c84f808)
>                     `-CallExpr 0x7fe62c84f7e0 <col:16, col:20> 'C'
>                       `-ImplicitCastExpr 0x7fe62c84f7c8 <col:16> 'C
>     (*)()' <FunctionToPointerDecay>
>                         `-DeclRefExpr 0x7fe62c84f770 <col:16> 'C ()'
>     lvalue Function 0x7fe62c84f190 'bar' 'C ()'
>
>     And for the ReturnStmt in bar():
>
>           ReturnStmt 0x7fe62c84f5b0 <line:9:3, col:10>
>           `-CXXConstructExpr 0x7fe62c84f578 <col:10> 'C' 'void (const
>     C &) noexcept' elidable
>             `-ImplicitCastExpr 0x7fe62c84f518 <col:10> 'const C'
>     lvalue <NoOp>
>               `-DeclRefExpr 0x7fe62c84f4f0 <col:10> 'C' lvalue Var
>     0x7fe62c84f280 'c' 'C'
>
>     Since https://reviews.llvm.org/D42875
>     <https://reviews.llvm.org/D42875> we can already realize that the
>     constructor in bar(), assuming that we're inlining bar() during
>     analysis, would be constructed into something that is a return
>     value of bar(). This allows us, by looking that the
>     StackFrameContext's call site, to figure out that it is being
>     constructed into the CallExpr in foo(). Now if only we knew that
>     that the call site is a lifetime-extended temporary, i.e. if only
>     we had a pointer to the foo()'s MTE at the CallExpr's CFG element,
>     we'd be able to find the correct target region for construction:
>     the CXXTempObjectRegion for the MTE in the StackFrameContext of
>     foo(). So i'm proposing to add some sort of construction context
>     to not only constructors, but also to functions that return
>     objects, and then during construction perform the lookup in three
>     easy steps:
>
>       1. in the callee's CFG from constructor to return statement,
>       2. through the location from the return statement to the call site,
>       3. then through the caller's CFG from the call site to the MTE.
>
>     If the function is not inlined, we can still make use of the
>     construction context to represent the return value as a
>     LazyCompoundValue of the MTE's temporary. It would eliminate the
>     need to replace the return value with another value while
>     evaluating the MTE, and of course the need to re-bind the object
>     to a different this-region.
>
>     So i believe that this is a good way to eliminate the need for the
>     "createTemporaryRegionIfNeeded" thing in the function calls as well.
>
>
>
>     On 06/02/2018 1:41 PM, Artem Dergachev wrote:
>
>         A bit of an update.
>
>         == Temporary destructors ==
>
>         Adding some initial support for temporary destructors seems
>         pretty easy and straightforward at this point, given the
>         awesome work done by Manuel Klimek on our CFG a few years ago.
>
>         1. We already have the awesome CFGTemporaryDtor elements,
>         which have the backreference to the CXXBindTemporaryExpr for
>         their temporaries.
>
>         2. In simple cases CXXBindTemporaryExprs have an immediate
>         constructor within them, and therefore we can provide the
>         CXXBindTemporaryExprs as the construction context's trigger
>         statements, and therefore have a reliable CXXTempObjectRegion
>         for constructors.
>
>         3. Then we already track the CXXBindTemporaryExprs for the
>         active temporaries in the program state. We can attach any
>         constructor information to them, such as the target region, if
>         we need to (probably we can reconstruct it by looking at the
>         expression and the location context, not sure if we want to).
>
>         4. So when we reach the CFGTemporaryDtor element, we can just
>         lookup all the info we need, and perform the destruction properly.
>
>         5. There's a bit of a liveness problem, because it seems that
>         our liveness analysis tends to throw away the temporary too
>         early. I can easily hack this problem away by marking all
>         regions that correspond to active temporaries as live. I'll
>         see if there's a better solution.
>
>         == CXXDefaultArgExpr problems ==
>
>         There's a known problem with those. Consider:
>
>           void foo(const C &c = C()) {
>           }
>
>           void bar() {
>             foo();
>             foo();
>           }
>
>         Each call of foo() contains a CXXDefaultArgExpr for c. The
>         default argument value C() is constructed before we enter
>         foo() and destroyed after we leave foo(). However, c's
>         initializer, "= C()", is *not part of the AST of bar()*. In
>         particular, when foo() is called twice, the initializer for
>         the two calls is the same, only CXXDefaultArgExprs are
>         different. This screws a lot of invariants in the analyzer:
>         program points start coinciding (causing the analysis to loop
>         and cache out), Environment doesn't expect the same expression
>         in the same location context have two different values
>         (suppose calls are nested into each other), analysis taking
>         wrong branches, and so on.
>
>         Luckily, default-arg expressions that aren't zero integers or
>         null pointers are pretty rare. Still, we'd need to eventually
>         think how to avoid any really bad practical problems with them.
>
>         On 25/01/2018 9:08 AM, Artem Dergachev wrote:
>
>             Handling C++ temporary object construction and destruction
>             seems to be the biggest cause of false positives on C++
>             code at the moment. I'd be looking into this, even though
>             for now i don't see the whole scale of problem.
>
>             == CFG, destructors, and ProgramPoints ==
>
>             We should probably enable `-analyzer-config
>             cfg-temporary-dtors=true` by default soon. It is a fairly
>             low-impact change because it only alters the CFG but the
>             analyzer rarely actually uses the new nodes. Destructors
>             for the most part are still evaluated conservatively, with
>             improper object regions. So it causes almost no changes in
>             the analyzer's positives for now, but it definitely opens
>             up room for further improvements.
>
>             I'm observing a couple of problems with this mode at the
>             moment, namely the rich CFG destructor element hierarchy
>             is not currently complemented by an equally rich
>             ProgramPoint hierarchy. This causes the analysis to merge
>             nodes which are not equivalent, for example two implicit
>             destructors of the same type (with the same function
>             declaration) may sometimes cause the ExplodedGraph to coil
>             up and stop analysis (lost coverage) because of having the
>             same program state at the erroneously-same program point.
>             Because situations when we can guarantee a change in the
>             program state are pretty rare, we'd have to produce more
>             program point kinds to handle this correctly.
>
>             CallEvent hierarchy is also very much affected in a
>             similar manner - because apparently we're constructing
>             program points by looking at CallEvents, so they'd need to
>             carry all the information that's needed to construct the
>             pre-call/post-call program point.
>
>             == Construction contexts ==
>
>             We are correctly modeling "this" object region during
>             construction/destruction of variables with automatic
>             storage duration, fields and base classes, and on operator
>             new() since recently, as long as these aren't arrays of
>             objects. It was not yet implemented for other cases such
>             as temporaries, initializer lists, fields or C++17 base
>             classes within aggregates, and pass-by-value from/to
>             functions (the latter being a slightly different problem
>             than temporaries).
>
>             First of all, by "not yet implemented" i mean that instead
>             of constructing into (destroying) the correct object (in
>             the correct memory region), we come up with a "temporary
>             region", which looks exactly like a region of a valid C++
>             temporary but is only used for communicating that it is
>             not the right region. Then we look at the region, see that
>             it is a temporary, and avoid inlining constructors,
>             because it would make little sense when the region is not
>             correct. However, if we are to model temporaries, we need
>             another way of communicating our failure to find the
>             correct region, which is being addressed by
>             https://reviews.llvm.org/D42457
>             <https://reviews.llvm.org/D42457>
>
>             Also in the cases when the correct region is used, it is
>             being computed in a funky way: in order to figure out
>             where do we construct the object, we walk forward in the
>             CFG (or from child to parent in the AST) to find the
>             next/parent statement that would accomodate the newly
>             constructed object. This means that the CFG, while
>             perfectly telling us what to do in what order (which we,
>             unlike CodeGen, cannot easily take from AST because we
>             cannot afford recursive evaluation of statements, mainly
>             because of inlining), discards too much context to be
>             helpful in understanding how to do it.
>
>             I tried to add the information about such "trigger
>             statements" for constructors into CFG and it is extremely
>             helpful and easy to both use and extend. This assumes
>             adding a new sub-class of CFGElement for constructors,
>             which maintain a "construction context" - for now it's
>             just a trigger statement. For instance, in
>
>               class C { ... };
>               void foo() {
>                 C c;
>               }
>
>             ...the trigger for constructor C() would be DeclStmt `C
>             c`, and once we know this we can immediately figure out
>             that the construction target is the region of variable
>             `c`. Construction trigger is not necessarily a statement -
>             it may be a CXXCtorInitializer, which is an AST node kind
>             of its own. Additionally, when constructing aggregates via
>             initializer lists, we may have the same statement trigger
>             multiple constructors, eg. in
>
>               class C { public: C(int); ~C(); };
>               struct S { C c1, c2, c3; };
>               void foo() {
>                 S s { 1, 2, 3 };
>               }
>
>             ... we have three different constructors (actually six
>             different constructors if we include elidable copy
>             constructors) for c1, c2, c3 (and lack of constructor for
>             `s` because of the aggregate thing). It would be more
>             natural to declare that the specific field or index would
>             be a part of the CFG's construction context, as well as
>             the intermediate InitListExpr, so even in these simple
>             cases the construction context may get quite bulky. And
>             this example is quite popular on actual code - probably
>             the second worst cause of false positives after temporaries.
>
>             For now i have no specific plan on what would construction
>             context for temporaries contain in the general case. I
>             might not be able to get the data structures right from
>             the start. In any case, it might be necessary to perform
>             additional memory allocations for these CFG elements (for
>             analyzer's CFG only, so it wouldn't affect compilation
>             time or warnings).
>
>             I believe that getting the correct target region in as
>             many cases as possible would be the main part of my work
>             for the nearest future. And i want to move most this work
>             to CFG, while letting the analyzer pick up the trigger
>             statement from it and trust it as blindly as possible.
>
>             == Workflow ==
>
>             When working on operator new, i tried hard to maintain a
>             reasonable history, making around 15 local rebases. It was
>             not awesome because it was hard for the reviewers to
>             understand the context of the new changes, and changes
>             could have easily kicked in during rebases. A few lessons
>             learned here would be to commit more aggressively, i.e.
>             avoiding stockpiling a large history of patches
>             (essentially a large branch), which in turn would be
>             possible through trying harder to avoid unrelated
>             hard-to-test core changes (as long as it doesn't require
>             weird workarounds) that aren't covered by a flag (such as
>             those evalCast fixes), in order to make sure reviewing
>             take less responsibility. It's fine if some parts would
>             later be fixed (assuming they would indeed be fixed), if
>             it means making the turnaround faster and the tail of
>             patches shorter - that's also the attitude i'd try to
>             maintain when reviewing stuff myself.
>
>
>
>     _______________________________________________
>     cfe-dev mailing list
>     cfe-dev at lists.llvm.org <mailto:cfe-dev at lists.llvm.org>
>     http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>     <http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20180412/da21f2d4/attachment.html>