[PATCH] D121796: [clang][dataflow] Add an API for dataflow "models" -- reusable analysis components.

Wed Mar 16 08:45:53 PDT 2022

xazax.hun added inline comments.

================
Comment at: clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h:150-151
+///    should relate to each other -- that is, how they should compose. Open
+///    questions include: Do we want to enable composition of models that have
+///    different lattice types? Do we want to support models with no lattices
+///    that only use the Environment?
----------------
I think supporting different lattice types is a must for a good composability model. E.g., in an ideal world, when we have a lattice for constant propagation of integers and a model for `std::optional`, it would be great if constants could propagate into and out of non-empty optionals of integers.
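
To make the composition idea concrete, here is a rough sketch of a product lattice that pairs two different lattice types. Everything in it (`JoinEffect`, `ConstIntLattice`, `OptionalLattice`, `ProductLattice`, `knownValue`) is a hypothetical stand-in written for this comment, not an API from the patch or from Clang.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a join result; not the Clang type.
enum class JoinEffect { Unchanged, Changed };

// Toy constant-propagation lattice for one integer: Bottom < Const(v) < Top.
struct ConstIntLattice {
  enum Kind { Bottom, Const, Top } K = Bottom;
  int Value = 0;

  static ConstIntLattice constant(int V) {
    ConstIntLattice L;
    L.K = Const;
    L.Value = V;
    return L;
  }

  JoinEffect join(const ConstIntLattice &Other) {
    if (Other.K == Bottom || K == Top)
      return JoinEffect::Unchanged;
    if (K == Bottom) {
      *this = Other;
      return JoinEffect::Changed;
    }
    // Here K == Const and Other is Const or Top.
    if (Other.K == Const && Other.Value == Value)
      return JoinEffect::Unchanged;
    K = Top;
    return JoinEffect::Changed;
  }
};

// Toy lattice for the state of an optional: Bottom < {Empty, Engaged} < Top.
struct OptionalLattice {
  enum Kind { Bottom, Empty, Engaged, Top } K = Bottom;

  JoinEffect join(const OptionalLattice &Other) {
    if (Other.K == Bottom || K == Top || K == Other.K)
      return JoinEffect::Unchanged;
    K = (K == Bottom) ? Other.K : Top;
    return JoinEffect::Changed;
  }
};

// Composes two lattices of different types by joining them component-wise.
template <typename A, typename B> struct ProductLattice {
  A First;
  B Second;

  JoinEffect join(const ProductLattice &Other) {
    JoinEffect E1 = First.join(Other.First);
    JoinEffect E2 = Second.join(Other.Second);
    return (E1 == JoinEffect::Changed || E2 == JoinEffect::Changed)
               ? JoinEffect::Changed
               : JoinEffect::Unchanged;
  }
};

using OptionalIntLattice = ProductLattice<OptionalLattice, ConstIntLattice>;

// Constant propagation "out of" an optional: the payload is known only when
// the optional is definitely engaged and the integer is a known constant.
inline std::optional<int> knownValue(const OptionalIntLattice &L) {
  if (L.First.K == OptionalLattice::Engaged &&
      L.Second.K == ConstIntLattice::Const)
    return L.Second.Value;
  return std::nullopt;
}
```

A component-wise product is only the simplest form of composition; it does not let one model refine the other's state, which is probably where the interesting design questions are.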

================
Comment at: clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h:152
+///    different lattice types? Do we want to support models with no lattices
+///    that only use the Environment?
+///
----------------
The Clang Static Analyzer is full of modeling that has no dedicated state; it just updates the equivalent of an environment (e.g., adding a range to the returned values): https://github.com/llvm/llvm-project/blob/main/clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp#L1208

Once we have a framework that can reason about integers, I think having something similar (or even better, somehow being able to reuse these summaries across CSA and dataflow) would be awesome.
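
As a rough illustration of what a lattice-free, summary-driven model could look like, here is a toy sketch. `ToyEnvironment`, `Range`, `summaries`, `applyReturnRangeSummary`, and the function name `toy_percent` are all made up for this comment; none of them mirror the actual CSA summary format or the dataflow `Environment` API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical inclusive integer range attached to a call's return value.
struct Range {
  long Min;
  long Max;
};

// Hypothetical stand-in for the environment: maps a call-site id to the
// range known for the value returned at that call.
struct ToyEnvironment {
  std::map<std::string, Range> ReturnRanges;
};

// Hypothetical summary table keyed by callee name. "toy_percent" is an
// invented function assumed to return a value in [0, 100].
inline const std::map<std::string, Range> &summaries() {
  static const std::map<std::string, Range> Table = {
      {"toy_percent", Range{0, 100}},
  };
  return Table;
}

// A model with no lattice of its own: it only records facts in the
// environment. Returns true if a summary existed for the callee.
inline bool applyReturnRangeSummary(const std::string &Callee,
                                    const std::string &CallSiteId,
                                    ToyEnvironment &Env) {
  auto It = summaries().find(Callee);
  if (It == summaries().end())
    return false;
  Env.ReturnRanges[CallSiteId] = It->second;
  return true;
}
```

The summary table is plain data with no dependency on either engine, which is the property that would make sharing it between CSA and dataflow plausible.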

================
Comment at: clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h:159
+public:
+  virtual void transfer(const Stmt *Stmt, LatticeT &L, Environment &Env) = 0;
+};
----------------
In the Clang Static Analyzer, the process of modeling is more conversational between the "modeling checks" and the framework. There, a return value indicates whether a particular function call was modeled, and the first check that models a function call short-circuits the whole process. This ensures that only one model is applied.

I'm not 100% sure whether this would be the best model here. But I'd strongly suggest adding a `bool` return value here, because it might be useful for the framework to know in the future whether any modeling was done at all (potentially skipping some default modeling based on the return value).
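
A minimal sketch of what I have in mind, with toy types so it stands alone. `ToyStmt`, `ToyEnvironment`, `ModelSketch`, `HasValueModel`, `NoLattice`, and `applyModels` are hypothetical names for illustration only; the real interface would keep using `Stmt` and `Environment`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for clang::Stmt and the dataflow Environment.
struct ToyStmt {
  std::string Callee;
};
struct ToyEnvironment {
  std::vector<std::string> Facts;
};

// Same shape as the proposed interface, but transfer reports whether the
// statement was modeled.
template <typename LatticeT> class ModelSketch {
public:
  virtual ~ModelSketch() = default;
  virtual bool transfer(const ToyStmt *S, LatticeT &L, ToyEnvironment &Env) = 0;
};

// Placeholder lattice for models that keep no state of their own.
struct NoLattice {};

// Example model that only claims calls to one specific function.
class HasValueModel : public ModelSketch<NoLattice> {
public:
  bool transfer(const ToyStmt *S, NoLattice &, ToyEnvironment &Env) override {
    if (S->Callee != "std::optional::has_value")
      return false; // Not ours; let another model or the framework handle it.
    Env.Facts.push_back("modeled:" + S->Callee);
    return true;
  }
};

// CSA-style dispatch: the first model that claims the statement wins.
// A false result tells the caller to fall back to default modeling.
template <typename LatticeT>
bool applyModels(
    const std::vector<std::unique_ptr<ModelSketch<LatticeT>>> &Models,
    const ToyStmt *S, LatticeT &L, ToyEnvironment &Env) {
  for (const auto &M : Models)
    if (M->transfer(S, L, Env))
      return true;
  return false;
}
```

Even if the framework ends up letting every model run instead of short-circuiting, the return value is still cheap to carry and leaves both options open.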

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121796/new/

https://reviews.llvm.org/D121796