[cfe-dev] [analyzer] Temporaries.

Mon Feb 19 01:51:57 PST 2018

Hello Artem,

Thank you for working on this and sharing the status.
Do we expect coverage changes after allowing destructors to be inlined?

25.01.2018 20:08, Artem Dergachev via cfe-dev пишет:
> Handling C++ temporary object construction and destruction seems to be 
> the biggest cause of false positives on C++ code at the moment. I'd be 
> looking into this, even though for now i don't see the whole scale of 
> problem.
>
> == CFG, destructors, and ProgramPoints ==
>
> We should probably enable `-analyzer-config cfg-temporary-dtors=true` 
> by default soon. It is a fairly low-impact change because it only 
> alters the CFG but the analyzer rarely actually uses the new nodes. 
> Destructors for the most part are still evaluated conservatively, with 
> improper object regions. So it causes almost no changes in the 
> analyzer's positives for now, but it definitely opens up room for 
> further improvements.
>
> I'm observing a couple of problems with this mode at the moment, 
> namely the rich CFG destructor element hierarchy is not currently 
> complemented by an equally rich ProgramPoint hierarchy. This causes 
> the analysis to merge nodes which are not equivalent, for example two 
> implicit destructors of the same type (with the same function 
> declaration) may sometimes cause the ExplodedGraph to coil up and stop 
> analysis (lost coverage) because of having the same program state at 
> the erroneously-same program point. Because situations when we can 
> guarantee a change in the program state are pretty rare, we'd have to 
> produce more program point kinds to handle this correctly.
>
> CallEvent hierarchy is also very much affected in a similar manner - 
> because apparently we're constructing program points by looking at 
> CallEvents, so they'd need to carry all the information that's needed 
> to construct the pre-call/post-call program point.
>
> == Construction contexts ==
>
> We are correctly modeling "this" object region during 
> construction/destruction of variables with automatic storage duration, 
> fields and base classes, and on operator new() since recently, as long 
> as these aren't arrays of objects. It was not yet implemented for 
> other cases such as temporaries, initializer lists, fields or C++17 
> base classes within aggregates, and pass-by-value from/to functions 
> (the latter being a slightly different problem than temporaries).
>
> First of all, by "not yet implemented" i mean that instead of 
> constructing into (destroying) the correct object (in the correct 
> memory region), we come up with a "temporary region", which looks 
> exactly like a region of a valid C++ temporary but is only used for 
> communicating that it is not the right region. Then we look at the 
> region, see that it is a temporary, and avoid inlining constructors, 
> because it would make little sense when the region is not correct. 
> However, if we are to model temporaries, we need another way of 
> communicating our failure to find the correct region, which is being 
> addressed by https://reviews.llvm.org/D42457
>
> Also in the cases when the correct region is used, it is being 
> computed in a funky way: in order to figure out where do we construct 
> the object, we walk forward in the CFG (or from child to parent in the 
> AST) to find the next/parent statement that would accomodate the newly 
> constructed object. This means that the CFG, while perfectly telling 
> us what to do in what order (which we, unlike CodeGen, cannot easily 
> take from AST because we cannot afford recursive evaluation of 
> statements, mainly because of inlining), discards too much context to 
> be helpful in understanding how to do it.
>
> I tried to add the information about such "trigger statements" for 
> constructors into CFG and it is extremely helpful and easy to both use 
> and extend. This assumes adding a new sub-class of CFGElement for 
> constructors, which maintain a "construction context" - for now it's 
> just a trigger statement. For instance, in
>
>   class C { ... };
>   void foo() {
>     C c;
>   }
>
> ...the trigger for constructor C() would be DeclStmt `C c`, and once 
> we know this we can immediately figure out that the construction 
> target is the region of variable `c`. Construction trigger is not 
> necessarily a statement - it may be a CXXCtorInitializer, which is an 
> AST node kind of its own. Additionally, when constructing aggregates 
> via initializer lists, we may have the same statement trigger multiple 
> constructors, eg. in
>
>   class C { public: C(int); ~C(); };
>   struct S { C c1, c2, c3; };
>   void foo() {
>     S s { 1, 2, 3 };
>   }
>
> ... we have three different constructors (actually six different 
> constructors if we include elidable copy constructors) for c1, c2, c3 
> (and lack of constructor for `s` because of the aggregate thing). It 
> would be more natural to declare that the specific field or index 
> would be a part of the CFG's construction context, as well as the 
> intermediate InitListExpr, so even in these simple cases the 
> construction context may get quite bulky. And this example is quite 
> popular on actual code - probably the second worst cause of false 
> positives after temporaries.
>
> For now i have no specific plan on what would construction context for 
> temporaries contain in the general case. I might not be able to get 
> the data structures right from the start. In any case, it might be 
> necessary to perform additional memory allocations for these CFG 
> elements (for analyzer's CFG only, so it wouldn't affect compilation 
> time or warnings).
>
> I believe that getting the correct target region in as many cases as 
> possible would be the main part of my work for the nearest future. And 
> i want to move most this work to CFG, while letting the analyzer pick 
> up the trigger statement from it and trust it as blindly as possible.
>
> == Workflow ==
>
> When working on operator new, i tried hard to maintain a reasonable 
> history, making around 15 local rebases. It was not awesome because it 
> was hard for the reviewers to understand the context of the new 
> changes, and changes could have easily kicked in during rebases. A few 
> lessons learned here would be to commit more aggressively, i.e. 
> avoiding stockpiling a large history of patches (essentially a large 
> branch), which in turn would be possible through trying harder to 
> avoid unrelated hard-to-test core changes (as long as it doesn't 
> require weird workarounds) that aren't covered by a flag (such as 
> those evalCast fixes), in order to make sure reviewing take less 
> responsibility. It's fine if some parts would later be fixed (assuming 
> they would indeed be fixed), if it means making the turnaround faster 
> and the tail of patches shorter - that's also the attitude i'd try to 
> maintain when reviewing stuff myself.
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev

-- 
Best regards,
Aleksei Sidorin,
SRR, Samsung Electronics