[cfe-dev] [analyzer] Temporaries.
Aleksei Sidorin via cfe-dev
cfe-dev at lists.llvm.org
Mon Feb 19 01:51:57 PST 2018
Hello Artem,
Thank you for working on this and sharing the status.
Do we expect coverage changes after allowing destructors to be inlined?
25.01.2018 20:08, Artem Dergachev via cfe-dev пишет:
> Handling C++ temporary object construction and destruction seems to be
> the biggest cause of false positives on C++ code at the moment. I'd be
> looking into this, even though for now i don't see the whole scale of
> problem.
>
> == CFG, destructors, and ProgramPoints ==
>
> We should probably enable `-analyzer-config cfg-temporary-dtors=true`
> by default soon. It is a fairly low-impact change because it only
> alters the CFG but the analyzer rarely actually uses the new nodes.
> Destructors for the most part are still evaluated conservatively, with
> improper object regions. So it causes almost no changes in the
> analyzer's positives for now, but it definitely opens up room for
> further improvements.
>
> I'm observing a couple of problems with this mode at the moment,
> namely the rich CFG destructor element hierarchy is not currently
> complemented by an equally rich ProgramPoint hierarchy. This causes
> the analysis to merge nodes which are not equivalent, for example two
> implicit destructors of the same type (with the same function
> declaration) may sometimes cause the ExplodedGraph to coil up and stop
> analysis (lost coverage) because of having the same program state at
> the erroneously-same program point. Because situations when we can
> guarantee a change in the program state are pretty rare, we'd have to
> produce more program point kinds to handle this correctly.
>
> CallEvent hierarchy is also very much affected in a similar manner -
> because apparently we're constructing program points by looking at
> CallEvents, so they'd need to carry all the information that's needed
> to construct the pre-call/post-call program point.
>
> == Construction contexts ==
>
> We are correctly modeling "this" object region during
> construction/destruction of variables with automatic storage duration,
> fields and base classes, and on operator new() since recently, as long
> as these aren't arrays of objects. It was not yet implemented for
> other cases such as temporaries, initializer lists, fields or C++17
> base classes within aggregates, and pass-by-value from/to functions
> (the latter being a slightly different problem than temporaries).
>
> First of all, by "not yet implemented" i mean that instead of
> constructing into (destroying) the correct object (in the correct
> memory region), we come up with a "temporary region", which looks
> exactly like a region of a valid C++ temporary but is only used for
> communicating that it is not the right region. Then we look at the
> region, see that it is a temporary, and avoid inlining constructors,
> because it would make little sense when the region is not correct.
> However, if we are to model temporaries, we need another way of
> communicating our failure to find the correct region, which is being
> addressed by https://reviews.llvm.org/D42457
>
> Also in the cases when the correct region is used, it is being
> computed in a funky way: in order to figure out where do we construct
> the object, we walk forward in the CFG (or from child to parent in the
> AST) to find the next/parent statement that would accomodate the newly
> constructed object. This means that the CFG, while perfectly telling
> us what to do in what order (which we, unlike CodeGen, cannot easily
> take from AST because we cannot afford recursive evaluation of
> statements, mainly because of inlining), discards too much context to
> be helpful in understanding how to do it.
>
> I tried to add the information about such "trigger statements" for
> constructors into CFG and it is extremely helpful and easy to both use
> and extend. This assumes adding a new sub-class of CFGElement for
> constructors, which maintain a "construction context" - for now it's
> just a trigger statement. For instance, in
>
> class C { ... };
> void foo() {
> C c;
> }
>
> ...the trigger for constructor C() would be DeclStmt `C c`, and once
> we know this we can immediately figure out that the construction
> target is the region of variable `c`. Construction trigger is not
> necessarily a statement - it may be a CXXCtorInitializer, which is an
> AST node kind of its own. Additionally, when constructing aggregates
> via initializer lists, we may have the same statement trigger multiple
> constructors, eg. in
>
> class C { public: C(int); ~C(); };
> struct S { C c1, c2, c3; };
> void foo() {
> S s { 1, 2, 3 };
> }
>
> ... we have three different constructors (actually six different
> constructors if we include elidable copy constructors) for c1, c2, c3
> (and lack of constructor for `s` because of the aggregate thing). It
> would be more natural to declare that the specific field or index
> would be a part of the CFG's construction context, as well as the
> intermediate InitListExpr, so even in these simple cases the
> construction context may get quite bulky. And this example is quite
> popular on actual code - probably the second worst cause of false
> positives after temporaries.
>
> For now i have no specific plan on what would construction context for
> temporaries contain in the general case. I might not be able to get
> the data structures right from the start. In any case, it might be
> necessary to perform additional memory allocations for these CFG
> elements (for analyzer's CFG only, so it wouldn't affect compilation
> time or warnings).
>
> I believe that getting the correct target region in as many cases as
> possible would be the main part of my work for the nearest future. And
> i want to move most this work to CFG, while letting the analyzer pick
> up the trigger statement from it and trust it as blindly as possible.
>
> == Workflow ==
>
> When working on operator new, i tried hard to maintain a reasonable
> history, making around 15 local rebases. It was not awesome because it
> was hard for the reviewers to understand the context of the new
> changes, and changes could have easily kicked in during rebases. A few
> lessons learned here would be to commit more aggressively, i.e.
> avoiding stockpiling a large history of patches (essentially a large
> branch), which in turn would be possible through trying harder to
> avoid unrelated hard-to-test core changes (as long as it doesn't
> require weird workarounds) that aren't covered by a flag (such as
> those evalCast fixes), in order to make sure reviewing take less
> responsibility. It's fine if some parts would later be fixed (assuming
> they would indeed be fixed), if it means making the turnaround faster
> and the tail of patches shorter - that's also the attitude i'd try to
> maintain when reviewing stuff myself.
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
--
Best regards,
Aleksei Sidorin,
SRR, Samsung Electronics
More information about the cfe-dev
mailing list