[PATCH] D45517: [analyzer] False positive refutation with Z3

Wed May 30 11:56:59 PDT 2018

george.karpenkov added inline comments.

================
Comment at: lib/StaticAnalyzer/Core/BugReporterVisitors.cpp:2382
+    // Reset the solver
+    RefutationMgr.reset();
+  }
----------------
xazax.hun wrote:
> george.karpenkov wrote:
> > xazax.hun wrote:
> > > george.karpenkov wrote:
> > > > george.karpenkov wrote:
> > > > > (apologies in advance for nitpicking not on your code).
> > > > > 
> > > > > Currently, this is written in a stateful way: we have a solver, at each iteration we add constraints, and at the end we reset it. To me it would make considerably more sense to write the code in a functional style: as we go, generate a vector of formulas, then once we reach the path end, create the solver object, check satisfiability, and then destroy the entire solver.
> > > > Elaborating more: we are already forced to have visitor object state, let's use that. `RefutationMgr` is essentially a wrapper around a Z3 solver object, let's just create one when visitor is constructed (directly or in unique_ptr) and then rely on the destructor to destroy it.
> > > > Then no `reset` is necessary.
> > > Note that while constructing the constraint solver here might make perfect sense now, it also inhibits incremental solving.  If we do not plan to experiment with incremental solvers anytime soon I am fine with this direction as well.
> > @xazax.hun Right, I see.
> > However, we should not optimize prematurely --- IF we decide to have incremental solving, then we would change our design to support it.
> > 
> > Now I don't think incremental solving would help, and I don't think that having a global solver object would be helpful for it.
> Just a bit of context and to have some expectation management regarding this patch. The main purpose of this implementation was to back a thesis. It was made under a very serious time pressure and the main goal was to be able to measure on real world projects as soon as possible and in the meantime to be flexible so we can measure multiple configurations (like incremental solving).
> 
> So the goal was a flexible proof of concept that is sensible to measure in the shortest possible time. After the thesis was done, Reka started to work an another GSoC project, so she had no time to review the code with the requirements of upstreaming in mind. Nevertheless we found that sharing the proof of concept could be useful for the community.  So it is perfectly reasonable if you disagree with some design decisions behind this patch, because the requirements for the thesis (in the short time frame) was very different from the requirements of upstreaming this work. In a different context these decisions made perfect sense. 
@xazax.hun of course. My comments are for @mikhail.ramalho who is now working on this patch.

================
Comment at: lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp:1259
+    return;
+  ConstraintRangeTy CR = OnlyPurged ? PrevCR : SuccCR;
+
----------------
xazax.hun wrote:
> george.karpenkov wrote:
> > rnkovacs wrote:
> > > george.karpenkov wrote:
> > > > TBH I'm really confused here. Why does the method take two constraint ranges? What's `OnlyPurged`? From reading the code it seems it's set by seeing whether the program point only purges dead symbols, but at least a comment should be added as to why this affects behavior.
> > > The logic was:
> > >   - add every constraint from the last node (first visited),
> > >   - for other nodes on the path, only add those that disappear in the next step.
> > > 
> > > So `OnlyPurged` is meant to signal that we only want to add those symbols to the solver that are getting purged from the program state.
> > @rnkovacs right, so that's an optimization not to add extra constraints?
> > 
> > 1. I would not do it at all, all formulas in Z3 are hash-consed, so a new formula would not even be *constructed* if it's already asserted in the solver.
> > 
> > 2. Even if we do do it (which we shouldn't), this logic does not belong to Z3ConstraintManager. The method should have simple semantics: take a state, add all constraints to solver.
> We would not just add the very same constraints over and over again. We would first add the strongest possible constraint that the analyzer could infer first and later on add weaker and weaker ones. Does Z3 do some special handling for that as well?
@xazax.hun Since all subformulas are hash-consed, I would be extremely surprised if this heuristic affected performance in a large way.

However, your approach could be still useful in order to get readable dumps from the solver.
I would still argue it should be done in a visitor, so something along the lines of:

```
// a comment explaining the logic
if (State.succ_size() == 0 || State.getLocation().isPurged())
   solver.addConstraints(state)
```

================
Comment at: lib/StaticAnalyzer/Core/Z3ConstraintManager.cpp:1264
+
+    if (OnlyPurged && SuccCR.contains(Sym))
+      continue;
----------------
xazax.hun wrote:
> george.karpenkov wrote:
> > xazax.hun wrote:
> > > george.karpenkov wrote:
> > > > I would guess that this is an optimization done in order not to re-add the constraints we already have.
> > > > I think we should really not bother doing that, as Z3 will do a much better job here then we can.
> > > Note that we are using lots of domain knowledge here like we have the most info about a symbol just before it dies. Also This optimization is a single lookup on the symbol level. I am not sure if Z3 could deal with this on the symbol level. It might need to do this on the constraint level.
> > > My point is, I am perfectly fine removing this optimization but I would like to see some performance numbers first either on a project that exercises refutation quite a bit or on some synthetic test cases.
> > > 
> > > This optimization is a single lookup on the symbol level. I am not sure if Z3 could deal with this on the symbol level
> > 
> > What do you mean? I'm positive adding redundant constraints to Z3 would not slow solving down.
> > 
> > > I would like to see some performance numbers first either on a project that exercises refutation quite a bit or on some synthetic test cases.
> > 
> > "Premature optimization is a root of (most) evil".
> > It should totally be the other way around -- a simple correct solution should be implemented first, and then a benchmark could be used to justify adding an optimization.
> While I do agree that we should not optimize prematurely this is not just an optimization. Having a minimal set of constraints is also useful for debugging when dumping the set of constraints that Z3 tries to solve. 
> 
> Also, I think this optimization is quite simple, so I do not see removing it making the code much simpler. This is the reason why I would live to see some performance numbers first.
OK your second point is valid, I think I've replied to it in my comment above.

https://reviews.llvm.org/D45517