[PATCH] D45517: [analyzer] False positive refutation with Z3

Wed May 30 10:32:50 PDT 2018

george.karpenkov added inline comments.

================
Comment at: lib/StaticAnalyzer/Core/BugReporterVisitors.cpp:2382
+    // Reset the solver
+    RefutationMgr.reset();
+  }
----------------
xazax.hun wrote:
> george.karpenkov wrote:
> > george.karpenkov wrote:
> > > (apologies in advance for nitpicking not on your code).
> > > 
> > > Currently, this is written in a stateful way: we have a solver, at each iteration we add constraints, and at the end we reset it. To me it would make considerably more sense to write the code in a functional style: as we go, generate a vector of formulas, then once we reach the path end, create the solver object, check satisfiability, and then destroy the entire solver.
> > Elaborating more: we are already forced to have visitor object state, let's use that. `RefutationMgr` is essentially a wrapper around a Z3 solver object, let's just create one when visitor is constructed (directly or in unique_ptr) and then rely on the destructor to destroy it.
> > Then no `reset` is necessary.
> Note that while constructing the constraint solver here might make perfect sense now, it also inhibits incremental solving.  If we do not plan to experiment with incremental solvers anytime soon I am fine with this direction as well.
@xazax.hun Right, I see.
However, we should not optimize prematurely --- IF we decide to have incremental solving, then we would change our design to support it.

Now I don't think incremental solving would help, and I don't think that having a global solver object would be helpful for it.

================
Comment at: lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp:1249
+bool Z3ConstraintManager::isModelFeasible() {
+  return Solver.check() != Z3_L_FALSE;
+}
----------------
rnkovacs wrote:
> george.karpenkov wrote:
> > solver can also return "unknown", what happens then?
> If it returns `Z3_L_UNDEF`, e.g. in case of a timeout, this assumes that the state was feasible because we couldn't prove the opposite. In that case the report won't be invalidated.
@rnkovacs that's possibly valid (though the exact behavior might need to be behind an option), but the current implementation is wrong and the decision should be made at a different stack level.
This method is responsible for "whether the model is valid", and it should not say "yes" when the solver returns "unknown". We could return an `Optional<bool>` here, or a tri-value logic type (IIRC LLVM had one, which could represent true/false/unknown)

================
Comment at: lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp:1259
+    return;
+  ConstraintRangeTy CR = OnlyPurged ? PrevCR : SuccCR;
+
----------------
rnkovacs wrote:
> george.karpenkov wrote:
> > TBH I'm really confused here. Why does the method take two constraint ranges? What's `OnlyPurged`? From reading the code it seems it's set by seeing whether the program point only purges dead symbols, but at least a comment should be added as to why this affects behavior.
> The logic was:
>   - add every constraint from the last node (first visited),
>   - for other nodes on the path, only add those that disappear in the next step.
> 
> So `OnlyPurged` is meant to signal that we only want to add those symbols to the solver that are getting purged from the program state.
@rnkovacs right, so that's an optimization not to add extra constraints?

1. I would not do it at all, all formulas in Z3 are hash-consed, so a new formula would not even be *constructed* if it's already asserted in the solver.

2. Even if we do do it (which we shouldn't), this logic does not belong to Z3ConstraintManager. The method should have simple semantics: take a state, add all constraints to solver.

================
Comment at: lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp:1264
+
+    if (OnlyPurged && SuccCR.contains(Sym))
+      continue;
----------------
xazax.hun wrote:
> george.karpenkov wrote:
> > I would guess that this is an optimization done in order not to re-add the constraints we already have.
> > I think we should really not bother doing that, as Z3 will do a much better job here then we can.
> Note that we are using lots of domain knowledge here like we have the most info about a symbol just before it dies. Also This optimization is a single lookup on the symbol level. I am not sure if Z3 could deal with this on the symbol level. It might need to do this on the constraint level.
> My point is, I am perfectly fine removing this optimization but I would like to see some performance numbers first either on a project that exercises refutation quite a bit or on some synthetic test cases.
> 
> This optimization is a single lookup on the symbol level. I am not sure if Z3 could deal with this on the symbol level

What do you mean? I'm positive adding redundant constraints to Z3 would not slow solving down.

> I would like to see some performance numbers first either on a project that exercises refutation quite a bit or on some synthetic test cases.

"Premature optimization is a root of (most) evil".
It should totally be the other way around -- a simple correct solution should be implemented first, and then a benchmark could be used to justify adding an optimization.

================
Comment at: lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp:1278
+      // a valid APSIntType.
+      if (RangeTy.isNull())
+        continue;
----------------
rnkovacs wrote:
> george.karpenkov wrote:
> > I'm really curious where does it happen and why.
> I encountered some 1-bit `APSInt`s that wouldn't work together with any other integer-handling logic. As @ddcc mentioned, he has a fix for that in D35450 (`Z3ConstraintManager::fixAPSInt()`).
@rnkovacs right, so this is a workaround for an existing bug? In that case a FIXME with a link to the revision you have mentioned should be added.
Ideally, a test with a FIXME should be added as well, but I understand if that's too complicated.

https://reviews.llvm.org/D45517