[cfe-dev] [analyzer] Inlining, operator new(), checker callbacks.

Fri Dec 8 15:46:18 PST 2017

All right, so how do we want to equip C++ operator new() with checker 
callbacks?

~~~

First of all, a quick note about how do we evaluate call-expressions. 
Typical code to do that in the analyzer (eg. ExprEngine::VisitCallExpr) 
kinda looks like this:

```
   // Take node `Pred` from the CoreEngine's worklist.
   NodeSet1 = { Pred };

   NodeSet2;
   for (N in NodeSet1) {
     // Checkers evaluate check::PreStmt<CallExpr> event, put their 
transitions to NodeSet2.
     runCheckersForPreStmt(N, &NodeSet2);
   }

   NodeSet3;
   for (N in NodeSet2) {
     // Checkers evaluate check::PreCall event, put their transitions to 
NodeSet3.
     runCheckersForPreCall(N, &NodeSet3);
   }

   NodeSet4;
   for (N in NodeSet3) {
     // Either eval::Call event in Checkers, or inlineCall, or 
conservativeEvalCall.
     evalCall(N, &NodeSet4);
   }

   NodeSet5;
   for (N in NodeSet4) {
     // Checkers evaluate check::PostCall event, put their transitions 
to NodeSet5.
     runCheckersForPostCall(N, &NodeSet5);
   }

   NodeSet6;
   for (N in NodeSet5) {
     // Checkers evaluate check::PostStmt<CallExpr> event, put their 
transitions to NodeSet6.
     runCheckersForPreStmt(N, &NodeSet6);
   }

   // Put nodes from NodeSet6 back to the worklist.
```

During evalCall(), if any checker's eval::Call succeeds, that checker 
simply puts their transitions to NodeSet4. During conservativeEvalCall, 
the core puts exactly one new node to NodeSet4.

The interesting part here is that in case of inlineCall(), *the call is 
not actually evaluated* (!). The only thing that happens here is that 
inlineCall enters the stack frame and puts the node with the new stack 
frame *directly to the worklist* (!!), completely bypassing NodeSet4. 
Because of that, in inlineCall case, NodeSet{4,5,6} are all empty, and 
the remaining code does nothing, so no PostCall callbacks get called 
here until the call is actually evaluated. However, when the call is 
fully inlined, at the CallExit program point 
(ExprEngine::processCallExit()), PostCall and PostStmt callbacks are 
called manually. So we get the correct sequence of callbacks regardless.

~~~

Now, suppose we have operator new:

   new (args1...) C(args2...)

The semantics of this code can be expressed in the following 
"statement-expression" pseudo-code:

```
   01  C *_this = (C *) operator new(sizeof(C), args1...);
   02  if (_this) _this->C(args2...);
   03  return _this;
```

Note that the constructor is simply not called when the operator returns 
nullptr.
Note the cast on the first line - needs to be modeled, because operator 
new returns `void *`.
Note that i did ask George if he thinks that expressing this piece of 
code as a body-farm-thing is a good idea, and he didn't think so :)

By looking at this construct, it should be obvious that the relationship 
between CXXNewExpr and its respective CXXConstructExpr is more 
complicated than that of a "normal" expression and its sub-expression. 
They are kinda computed in the counter-intuitive order, and the easiest 
way to explain that would be to announce that there are actually three 
"expressions" involved: operator new call fake expression, constructor 
call expression, and the "big new-expression" of which both of these are 
children, in that order.

This is how our CFG currently sees it, in case of `-analyzer-config 
c++-allocator-inlining=true`:

   1. CFGNewAllocator  // evaluate `operator new(sizeof(C), args1...)`
   2. CXXConstructExpr  // evaluate `_this->C(args2...)`
   3. CXXNewExpr  // bind `_this` to the expression

So i guess it's so far so good.

At 1., we do defaultEvalCall for operator new. FIXME: do a regular 
evalCall, i.e. allow checkers to evaluate operator new. Also in 
https://reviews.llvm.org/D40560 we add a new program state trait to hold 
the artificial variable "_this", since there's no room for it in the 
Environment, since it's not an expression but something we made up. Also 
we perform the cast from `void *` to `C *`.

At 2., we evalCall for the constructor with the help of our fake 
variable `_this`. FIXME: we should also do the if() part of it, i.e. 
don't evaluate the constructor when the operator returns null. It 
shouldn't be a state split though, i guess we should just suppress the 
branch on which the return value is null, unless we inlined the operator 
new and sure it's null.

At 3., we take `_this` and declare that the value stored in it would 
from now on be the value of the "big new-expression" that unites them 
all, regardless of whether it's null or not.

~~~

Now when it comes to checker callbacks, it's a bit messy right now. On 
CFGNewAllocator call evaluation, the call site is CXXNewExpr. It means 
that if the operator is inlined, and we hit processCallExit(), as 
explained above, we'd be triggering PostStmt<CXXNewExpr>, even though we 
didn't ever trigger PreStmt<CXXNewExpr>. Then at 3., we'd have 
PreStmt<CXXNewExpr> and then another(!!!) PostStmt<CXXNewExpr>, which 
drives MallocChecker crazy. This, of course, can be trivially fixed. But 
i wanted to write this long explanation to specifically highlight this 
bug, with the hope that in the future it would be less likely to get 
reintroduced.

And then, now that we realize that we have three kinda-statements here, 
the question is, do we want all three equipped with checker callbacks? 
We could plan to make a new callback that'd be surrounding 
CFGNewAllocator similarly to how PreStmt/PostStmt surrounds 
CXXConstructExpr and CXXNewExpr.

It may look like this:

(A)

   -> check::PreCXXAllocator  // new callback
     -> check::PreCall
       1. CFGNewAllocator
     <- check::PostCall
   <- check::PostCXXAllocator  // new callback

   -> check::PreStmt<CXXConstructExpr>
     -> check::PreCall
       2. CXXConstructExpr
     <- check::PostCall
   <- check::PostStmt<CXXConstructExpr>

   -> check::PreStmt<CXXNewExpr>
     3. CXXNewExpr
   <- check::PostStmt<CXXNewExpr>

Or like this:

(B)

   -> check::PreStmt<CXXNewExpr>

     -> check::PreCXXAllocator  // new callback
       -> check::PreCall
         1. CFGNewAllocator
       <- check::PostCall
     <- check::PostCXXAllocator  // new callback

     -> check::PreStmt<CXXConstructExpr>
       -> check::PreCall
         2. CXXConstructExpr
       <- check::PostCall
     <- check::PostStmt<CXXConstructExpr>

     3. CXXNewExpr
   <- check::PostStmt<CXXNewExpr>

Or like this:

(C)

   -> check::PreStmt<CXXNewExpr>
     -> check::PreCall
       1. CFGNewAllocator
     <- check::PostCall
   <- check::PostStmt<CXXNewExpr>

   -> check::PreStmt<CXXConstructExpr>
     -> check::PreCall
       2. CXXConstructExpr
     <- check::PostCall
   <- check::PostStmt<CXXConstructExpr>

   -> check::PreBigNewExpr  // new callback (needs better name)
     3. CXXNewExpr
   <- check::PostBigNewExpr  // new callback

Or even like this:

(D)

   -> check::PreWholeNewThing  // new callback

     -> check::PreStmt<CXXNewExpr>
       -> check::PreCall
         1. CFGNewAllocator
       <- check::PostCall
     <- check::PostStmt<CXXNewExpr>

     -> check::PreStmt<CXXConstructExpr>
       -> check::PreCall
         2. CXXConstructExpr
       <- check::PostCall
     <- check::PostStmt<CXXConstructExpr>

     3. CXXNewExpr
   <- check::PostWholeNewThing  // new callback

Or maybe even like this if we want:

(E)

   -> check::PreWholeNewThing  // new callback (needs better name)

     -> check::PreCXXAllocator  // another new callback
       -> check::PreCall
         1. CFGNewAllocator
       <- check::PostCall
     <- check::PostCXXAllocator  // new callback

     -> check::PreStmt<CXXConstructExpr>
       -> check::PreCall
         2. CXXConstructExpr
       <- check::PostCall
     <- check::PostStmt<CXXConstructExpr>

     -> check::PreStmt<CXXNewExpr>
       3. CXXNewExpr
     <- check::PostStmt<CXXNewExpr>

   -> check::PostWholeNewThing  // new callback

Variant (A) is what we commonly do for all other expressions. For 
instance, if we have an expression `a + b`, we only do 
PreStmt<BinaryOperator> *after* evaluation of both `a` and `b`. It kind 
of represents the semantics exactly. However, it would be completely 
counter-intuitive for checker authors to subscribe to 
PreStmt<CXXNewExpr> and find out that by the time their callback fires, 
both operator new() and the constructor(!) have already been called. So 
i expect confusion between CXXNewExpr in our sense (the big 
new-expression) and CXXNewExpr in a common person's sense (the allocator 
call). In particular, it would be nice to move MallocChecker to the new 
callback - eg. we're already done with region extent when we're 
constructing.

Variant (B) tries to go around this confusion while keeping the 
"user-facing meaning" of CXXNewExpr as the big new-expression, but it 
goes against what we normally do, which would anyway be confusing.

In variants (C) and (D) we have the "user-facing meaning" of CXXNewExpr 
changed into "here's where the allocation occurs". This is how 
MallocChecker currently understands things. Like (A), variant (C) tries 
to do what we usually do by only doing Pre-thingy after sub-thingies 
were evaluated (except evaluating CXXNewExpr before its child 
CXXConstructExpr, but we kinda agreed that they're not actually 
parent-child-related). However, because this time Pre-thingy is not a 
PreStmt-thingy, i find variant (D) fancier: the new callback is saying 
"we begin unleashing the operator new hell" and "we're done with the 
operator new hell nice and clean". Still, both (C) and (D) have an 
unpleasant downside of being unable to reliably read the value of 
CXXNewExpr at PostStmt<CXXNewExpr>, as the expression is not yet 
evaluated. This can probably be fixed, but it brings us back to the 
problem of whether we want to keep the value of the fake `_this` 
variable in the Environment as the value of CXXNewExpr this whole time. 
We probably do.

Variant (E) is kinda a combination of all approaches, as it provides the 
"big" callback like (B) and (D), while still saying that CXXNewExpr is 
the big new-expression, while not messing up the usual PreStmt/PostStmt 
order like (B) did. It still has the problem of 
check::PreStmt<CXXNewExpr> being confusingly late, but otherwise seems 
usable.

We can delay the choices between (A) and (E) and also between (C) and 
(D) until we actually want to introduce the new callbacks, but i guess 
it's reasonable to try to agree on what do we want to mean by CXXNewExpr.

I personally feel that (D) is the friendliest approach. Checkers 
subscribe for new - they get operator new. Checkers subscribe for 
constructor - they get their constructor. Checkers want the whole thing 
- sure, here's our fancy custom callback. They may probably never even 
think about this whole hassle.

My second favorite is (E), which has the strength-slash-weakness of 
consistency with the AST's CXXNewExpr being a parent of 
CXXConstructExpr. I believe that the reorder in the analysis order 
compared to the AST, even if it's not an actual reorder, would be quite 
intuitively acceptable. While the positioning of the new operator after 
the constructor wouldn't be.

Then (A) is simply a trimmed-down variant of (E), (B) is just weird, and 
(C) doesn't seem to be anyhow better than (D).

We can also combine (C) and (D), which is even better:

(F)

   -> check::PreWholeNewThing  // new callback (needs better name)

     -> check::PreStmt<CXXNewExpr>
       -> check::PreCall
         1. CFGNewAllocator
       <- check::PostCall
     <- check::PostStmt<CXXNewExpr>

     -> check::PreStmt<CXXConstructExpr>
       -> check::PreCall
         2. CXXConstructExpr
       <- check::PostCall
     <- check::PostStmt<CXXConstructExpr>

     -> check::PreBigNewExpr  // another new callback (needs better name)
       3. CXXNewExpr
     <- check::PostBigNewExpr  // new callback

   <- check::PostWholeNewThing  // new callback

Which by far seems to be the best (not necessarily good) idea i could 
have come up with on this subject.

Thoughts are welcome!~