[PATCH] D122838: [clang][dataflow] Add support for correlation of boolean (tracked) values

Gábor Horváth via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 1 08:00:06 PDT 2022


xazax.hun accepted this revision.
xazax.hun added inline comments.


================
Comment at: clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp:79
+    for (BoolValue *Constraint : Env1.getFlowConditionConstraints()) {
+      Expr1 = &Env1.makeAnd(*Expr1, *Constraint);
+    }
----------------
ymandel wrote:
> sgatev wrote:
> > xazax.hun wrote:
> > > Hmm, interesting.
> > > I think we view every boolean formula at a certain program point implicitly as `FlowConditionAtThatPoint && Formula`. And the flow condition at a program point should already be a disjunction of its predecessors.
> > > 
> > > So it would be interpreted as: `(FlowConditionPred1 || FlowConditionPred2) && (FormulaAtPred1 || FormulaAtPred2)`.
> > > While this is great, this is not the strongest condition we could derive. 
> > > `(FlowConditionPred1 && FormulaAtPred1)  || (FormulaAtPred2 && FlowConditionPred2)` created by this code snippet is stronger which is great.
> > > 
> > > My main concern is whether we would end up seeing an exponential explosion in the size of these formulas in the number of branches following each other in a sequence.
> > > 
> > Yeap, I agree this is suboptimal and I believe I'm the one to blame for introducing it downstream.
> > 
> > I wonder if we can represent the flow condition of each environment using a bool atom and have a mapping of bi-conditionals between flow condition atoms and flow condition constraints. Something like:
> > 
> > ```
> > FC1 <=> C1 ^ C2
> > FC2 <=> C2 ^ C3 ^ C4
> > FC3 <=> (FC1 v FC2) ^ C5
> > ... 
> > ```
> > 
> > We can use that to simplify the formulas here and in `joinConstraints`. The mapping can be stored in `DataflowAnalysisContext`. We can track dependencies between flow conditions (e.g. above `FC3` depends on `FC1` and `FC2`) and modify `flowConditionImplies` to construct a formula that includes the bi-conditionals for all flow condition atoms in the transitive set before invoking the solver.
> > 
> > I suggest putting the optimization in its own patch. I'd love to look into it right after this patch is submitted if both of you think it makes sense on a high level.
> This sounds good to me. That said, I'm not sure how often we'd expect this to be an issue in practice, since, IIUC, this specialized merge only occurs when the value is handled differently in the two branches. So, a series of branches alone won't trigger the exponential behavior.
Fair point. It might never cause a problem in practice. Since we already have one proposal how to try to fix this if a problem ever surfaces, I'd love to have a comment about this in the code. But I think this is something that probably does not need to be addressed in the near future. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122838/new/

https://reviews.llvm.org/D122838



More information about the cfe-commits mailing list