[PATCH] D123773: [clang][analyzer][ctu] Make CTU a two phase analysis

Gábor Horváth via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue May 17 09:00:21 PDT 2022


xazax.hun accepted this revision.
xazax.hun added inline comments.


================
Comment at: clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp:446
+    }
+    const bool BState = State->contains<CTUDispatchBifurcationSet>(D);
+    if (!BState) { // This is the first time we see this foreign function.
----------------
martong wrote:
> xazax.hun wrote:
> > martong wrote:
> > > xazax.hun wrote:
> > > > xazax.hun wrote:
> > > > > martong wrote:
> > > > > > xazax.hun wrote:
> > > > > > > So if we see the same foreign function called in multiple contexts, we will only queue one of the contexts for the CTU. Is this the intended design? 
> > > > > > > 
> > > > > > > So if I see:
> > > > > > > ```
> > > > > > > foreign(true);
> > > > > > > foreign(false);
> > > > > > > ```
> > > > > > > 
> > > > > > > The new CTU will only evaluate `foreign(true)` but not `foreign(false)`. 
> > > > > > This is intentional.
> > > > > > ```
> > > > > > foreign(true);
> > > > > > foreign(false);
> > > > > > ```
> > > > > > The new CTU will evaluate the following paths in the exploded graph:
> > > > > > ```
> > > > > > foreign(true); // conservative evaluated
> > > > > > foreign(false); // conservative evaluated
> > > > > > foreign(true); // inlined
> > > > > > foreign(false); // inlined
> > > > > > ```
> > > > > > The point is to keep bifurcation to a minimum and avoid the exponential blow up.
> > > > > > So, we will never have a path like this:
> > > > > > ```
> > > > > > foreign(true); // conservative evaluated
> > > > > > foreign(false); // inlined
> > > > > > ```
> > > > > > 
> > > > > > Actually, this is the same strategy that we use during the dynamic dispatch of C++ virtual calls. See `DynamicDispatchBifurcationMap`.
> > > > > > 
> > > > > > The conservative evaluation happens in the first phase, the inlining in the second phase (assuming the phase1 inlining option is set to none).
> > > > > > The new CTU will evaluate the following paths in the exploded graph:
> > > > > > ```
> > > > > > foreign(true); // conservative evaluated
> > > > > > foreign(false); // conservative evaluated
> > > > > > foreign(true); // inlined
> > > > > > foreign(false); // inlined
> > > > > > ```
> > > > > 
> > > > > When we encounter `foreign(true)`, we would add the decl to `CTUDispatchBifurcationSet`. So the second time we see a call to the function `foreign(false);`, we will just do the conservative evaluation and will not add the call to the CTU worklist. So how will `foreign(false);` be inlined in the second pass? Do I miss something? 
> > > > > 
> > > > Oh, I think I now understand. Do you expect `foreign(false);` to be inlined after we return from `foreign(true);` in the second pass? Sorry for the confusion, in that case it looks good to me.
> > > > Oh, I think I now understand. Do you expect `foreign(false);` to be inlined after we return from `foreign(true);` in the second pass?
> > > 
> > > Yes, that's right.
> > Actually, in this case I wonder whether we really need a set of declarations. Wouldn't a boolean be sufficient for each path?
> > 
> > So in case of:
> > ```
> > foreign1(true);
> > foreign2(false);
> > ```
> > 
> > Why would we want to add `foreign2` to the CTU worklist? More specifically, why is it important whether a foreign call is actually the same declaration that we saw earlier on the path? Isn't being foreign the only important information in this case?
> Good point. 
> 
> By using the set of declarations we might have a path where `foreign1` is conservatively evaluated and then `foreign2` is inlined. I was having a hard time to find any use-cases where this would be useful, but could not find one. 
> 
> On the other hand, by using a `bool`, as you suggest, no such superfluous path would exist. Plus, the dependent child patch (D123784) is becoming unnecessary because the CTUWorklist will have at most one element during the first phase.
> 
> Besides, I've made measurements comparing the `bool` and the `set` implementation. The results are quite similar. Both have lost the same amount of bugreports compared to the single-tu analysis and the average length of the bugpaths are also similar. {F23090628}
Great news! :) Looks like this was a nice simplification, thanks for looking into it!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123773/new/

https://reviews.llvm.org/D123773



More information about the cfe-commits mailing list