[PATCH] D130306: [clang][dataflow] Analyze calls to in-TU functions

Thu Jul 21 16:15:29 PDT 2022

xazax.hun added a comment.

There are many ways to introduce context sensitivity into the framework, this patch seems to take the "inline substitution" approach, the same approach the Clang Static Analyzer is taking. While this approach is relatively easy to implement and has great precision, it also has some scalability problems. Did you also consider a summary-based approach? In general, I believe the inline substitution approach results in an easier to use interface for the users of the framework, but I am a bit concerned about the scalability problems.

Some other related questions:

- Why call noop analysis? As far as I understand, this would only update the environment but not the lattice of the current analysis, i.e., if the analysis is computing some information like liveness, that information would not be context sensitive. Do I miss something?
- Why limit the call depth to 1? The patch mentions recursive functions. In case of the Clang Static Analyzer, the call depth is 4. I think if we go with the inline substitution approach, we want this parameter to be tunable, because different analyses might have different sweet spots for the call stack depth.
- The CSA also has other tunables, e.g., small functions are always inlined and large functions are never inlined.
- Currently, it looks like the framework assumes functions that cannot be inlined are not doing anything. This is an unsound assumption, and I wonder if we should change that before we try to settle on the best values for the tunable parameters.
- The current code might do a bit too much work. E.g. consider:

  while (...) {
    inlinableCall();
  }

As far as I understand, the current approach would start the analysis of `inlinableCall` from scratch in each iteration. I wonder if we actually want to preserve the state between the iterations, so we do not always reevaluate the call from scratch. Currently, it might not be a big deal as the fixed-point iteration part is disabled. But this could be a perf problem in the future, unless I miss something.

- I think the environment currently is not really up to context-sensitive evaluation. Consider a recursive function:

  void f(int a) {
    ++a;
    if (a < 10)
        f(a);
  }

Here, when the recursive call is evaluated, it is ambiguous what `a` refers to. Is it the argument of the caller or the callee? To solve this ambiguity, the calling context needs to be the part of the values in the environment.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130306/new/

https://reviews.llvm.org/D130306