[PATCH] D125931: [clang][dataflow] Add support for correlated branches to optional model

Mon Jun 13 16:12:01 PDT 2022

sgatev marked 3 inline comments as done.
sgatev added inline comments.

================
Comment at: clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp:596-598
+      MergedEnv.makeOr(
+          MergedEnv.makeAnd(Env1.getFlowConditionToken(), *HasValueVal1),
+          MergedEnv.makeAnd(Env2.getFlowConditionToken(), *HasValueVal2)));
----------------
ymandel wrote:
> xazax.hun wrote:
> > ymandel wrote:
> > > What's the plan for loops? This will increase the size of the constraint on every iteration, preventing termination.
> > I have nothing to add, just a brain dump. In this particular case, the analysis domain has infinite chains, the size of the Boolean formulas can increase indefinitely (at least syntactically). The canonical solution to deal with infinite chains is to do a widening operation such that the widened state over approximates both the state of the old and the new iteration. A very simple way of doing this would be to simply erase all knowledge we accumulated about the Boolean value. On the other hand, I wonder if it is often possible to do something smarter. There might be cases when the value is syntactically larger in the new iteration but actually it has the exact same truth table. The solver might be able to show this. On the other hand, if the loop generates new Boolean symbols in each iteration, we have little hope fore that.
> > I have nothing to add, just a brain dump. In this particular case, the analysis domain has infinite chains, the size of the Boolean formulas can increase indefinitely (at least syntactically). The canonical solution to deal with infinite chains is to do a widening operation such that the widened state over approximates both the state of the old and the new iteration. A very simple way of doing this would be to simply erase all knowledge we accumulated about the Boolean value. On the other hand, I wonder if it is often possible to do something smarter. There might be cases when the value is syntactically larger in the new iteration but actually it has the exact same truth table. The solver might be able to show this. On the other hand, if the loop generates new Boolean symbols in each iteration, we have little hope fore that.
> 
> Agreed. In our internal implementation, we have something like this -- it consults the solver to see if the two are logically equivalent.  That loses the (subtle) support for correlated conditions. However, until we have more precise support for widening (that is, separate from `merge`), I think this is the only way to go to ensure that loop analysis terminates.
After discussing this, we decided to go for the internal implementation which is tested on a large codebase and commit to smarter merging that preserves flow condition information only after testing it extensively. I have updated the patch accordingly, please take a look.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125931/new/

https://reviews.llvm.org/D125931