[PATCH] D137948: [clang][dataflow] Add widening API and implement it for built-in boolean model.

Tue Nov 15 11:50:28 PST 2022

ymandel marked an inline comment as done.
ymandel added inline comments.

================
Comment at: clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h:114
   bool isEqualTypeErased(const TypeErasedLattice &E1,
                          const TypeErasedLattice &E2) final {
     const Lattice &L1 = llvm::any_cast<const Lattice &>(E1.Value);
----------------
xazax.hun wrote:
> Maybe this is another case where we want to mark the whole class `final` instead of the individual methods? It is fine to address this in a separate PR.
Unfortunately not, since this class is used for CRTP.

================
Comment at: clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h:135
+  // between overloads.
+  struct Rank1 {};
+  struct Rank0 : Rank1 {};
----------------
xazax.hun wrote:
> Is there ever a reason to instantiate `Rank1`? If no, should we do something (like make its ctor private and friends with `Rank0`?). Although possibly not important enough to worth the hassle.
No reason to instantiate it that I can see, but since it's a already private to this class, I think it's good enough. But, I just borrowed this from Eric, so I'm not an expert in this pattern. Happy to take suggested improvements.

================
Comment at: clang/include/clang/Analysis/FlowSensitive/DataflowAnalysis.h:108

+  LatticeJoinEffect widenTypeErased(TypeErasedLattice &Current,
+                                    const TypeErasedLattice &Previous) final {
----------------
xazax.hun wrote:
> ymandel wrote:
> > xazax.hun wrote:
> > > I wonder if `LatticeJoinEffect` is the right name if we also use this for widening. (Also, are we in the process of removing these?)
> > I had the same thought. How about just `LatticeEffect`? Open to other suggestions as well. Either way, though, I should update in a separate patch in case that breaks something unexpected.
> > 
> > As for removing -- yes, we should remove from join, because we never use it. But, it actually makes sense here.
> `LatticeEffect` sounds good to me.
Added FIXME

================
Comment at: clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:424-425
+  }
+  if (WidenedEnv.DeclToLoc.size() != PrevEnv.DeclToLoc.size() ||
+      WidenedEnv.ExprToLoc.size() != PrevEnv.ExprToLoc.size() ||
+      WidenedEnv.LocToVal.size() != PrevEnv.LocToVal.size() ||
----------------
xazax.hun wrote:
> Isn't some of these tautologically false, as we only read from them?
Well, `Model`'s `widen` function could change them, as much as that seems wrong. I suppose we could specify in the API that it shouldn't, or we could further restructure the environment so that models don't have access to these fields. Either way, if we want these to be assertable, we'll need an API change.

But, more to the point, since I've changed it to update-in-place, we'll now also capture any changes that may have occured between Prev and Current. Those should converge pretty quickly, because they related to static features of the code, so I think this is a reasonable check now.

================
Comment at: clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:374-378
+  Environment WidenedEnv(*DACtx);
+
+  WidenedEnv.CallStack = CallStack;
+  WidenedEnv.ReturnLoc = ReturnLoc;
+  WidenedEnv.ThisPointeeLoc = ThisPointeeLoc;
----------------
xazax.hun wrote:
> ymandel wrote:
> > xazax.hun wrote:
> > > Shouldn't we have a simple copy ctor from PrefEnv that could populate all these fields?
> > > 
> > > Also, this makes me wonder if we actually want to move some of these out from environment, since these presumably would not change between basic blocks.
> > > Shouldn't we have a simple copy ctor from PrefEnv that could populate all these fields?
> > We *do* have such a copy constructor. The problem is that it's overkill because we want to build LocToVal ourselves by point-wise widening of the elements. For that matter, I wish we could share, rather than copy, `DeclToLoc` and `ExprToLoc` but I don't know of any way to do that. Alternatively: is it possible that we can update in place and don't need a separate `WidenedEnv`?
> > 
> > > Also, this makes me wonder if we actually want to move some of these out from environment, since these presumably would not change between basic blocks.
> > Agreed. Added a fixme with an associated issue to track.
> > 
> > 
> > For that matter, I wish we could share, rather than copy, DeclToLoc and ExprToLoc.
> 
> One way to maximize sharing (and making copies free) is to use the immutable datastructures in LLVM: https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/ADT/ImmutableMap.h
> 
> Unfortunately, these have their own costs so it might not be easy to decide when it is worth to use them. The Clang Static Analyzer was designed from the ground up with immutable data structures in mind. 
> 
> On the other hand, I see that you want a more ad-hoc/opportunistic sharing. And I actually wonder if we need any sharing at all. If we are 100% sure we will do the widening, do we actually need to preserve the previous environment? Couldn't we "consume" it and move certain fields out instead of doing the more expensive copy operation when it makes sense?
We considered immutable data structures early on and decided against them, based on guesses about costs. I'm sure this is worth revisiting, but probably not outside of a comprehensive performance analysis of the framework.

Regarding this API in particular, I was basing off the needs of join, which very much requires a distinct new environment. But, I think that's the wrong model, as I've better understood what we're doing here.  That catch22 for join is that you need to create new flow condition which joins the 2 incoming FCs, but you also need to (potentially) read the old FC and update the new FC during the pointwise merge of the store. So, if you update in place, neither update order (store and FC) works. If you update the FC first, then the old FC is inaccesible, but if you update the store first, then the additions to the FC end up mishandled when you create the new FC from the old ones.

AFAICT, we don't have that issue here, since we're not applying any widening to the FC. So, we can modify the FC during store widening without ill affect.  I went ahead and made this change, but let me know if you see a flaw in my reasoning.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D137948/new/

https://reviews.llvm.org/D137948