[PATCH] D28952: [analyzer] Add new Z3 constraint manager backend

Sun Jan 22 20:23:08 PST 2017

ddcc added a comment.

I'd like to note that one of the main challenges with this implementation was needing to perform type reinference and implicit casting within the constraint manager, since the constraints are sometimes implicit. (e.g. `(int) x` instead of `x == 0`). It also didn't help that the Sema methods for handling type conversion aren't directly usable.

> Can you talk a bit about your motivation for working on this project and what your goals are?

The original goal was to add support for symbolic execution on floating-point types, and analyze some real-world programs to look for interesting bugs. Unfortunately, it turns out that most floating-point bugs in the past tend to originate lower in the compiler toolchain, within the register allocator (e.g. non-determinism involving x87 FPU 80-bit registers), intermediate optimization passes (see Alive-FP), string conversions involving floating-point types (which may be too complex to model easily), or unexpected interactions between various 1st/3rd-party code when modifying the C99 thread-local floating-point environment (from fpenv.h). But the latter isn't really well modeled by Clang/LLVM, resulting in false-positives when real-world program use floating-point exception handlers to catch underflow/overflow/etc conditions. There's also the floating-point rounding mode that can be changed at runtime, but again, it's not something that's currently modeled by Clang/LLVM. I have seen a set of patches on the mailing list (by Sergey, I believe) that improve support for this, but to my knowledge they haven't been merged yet.

> Have you compared the performance when using Z3 vs the current builtin solver? I saw that you mention performance issues on large codebases, but could you give some specific numbers of the tradeoffs between performance, code coverage, and the quality of reported bugs. (One very rough idea to use a stronger solver while still keeping the analyzer performant would be to only use it for false positive refutation.)

On a virtualized i5-2500k system, I get the following performance numbers on the testsuite, with a Clang/LLVM BUILD_SHARED_LIBS release build and Z3 https://github.com/z3prover/z3/commits/1bfd09e16bd9aaeae8a6b2841a2e8da615fdcda4 (post-4.5.0):
RangeConstraintManager: 13.46 sec
Z3ConstraintManager (plugin): 416.5 sec
Z3ConstraintManager (built-in): 414.6 sec
Z3ConstraintManager (built-in, with below changes): 303.92 sec
Z3ConstraintManager (built-in, with below changes and no simplify): 250.71 sec

In addition to the caveats above, the other more practical problem is that the performance is really bad on medium to large-sized codebases. I did some testing on gmp, mpfr, libvpx, daala, libsdl, and opus, and for all of these projects, with static analysis enabled (via scan-build, and a release mode Clang/LLVM), the builds (in parallel) wouldn't finish even after a week on a Xeon E5-2620v3 workstation, and there were a lot of false positives that were hard to dig through. I did end up finding and reporting one bug in mpfr (https://sympa.inria.fr/sympa/arc/mpfr/2017-01/msg00012.html), but it's not floating-point related, and I didn't check if it could also be found by vanilla Clang/LLVM. With regards to the performance, I believe that this was due to multiple factors, and correct me if I'm wrong:

1. The Z3 solver timeout is set to 15 sec, which is probably too large (for an example that fails, see https://github.com/Z3Prover/z3/issues/834). But when timeouts do occur, they can cause runtime failures in the static analyzer: e.g. the assertion to avoid overconstrained states (in ConstraintManager.h), or underconstrained states (in BugReporterVisitors.cpp).
2. The symbolic expression complexity parameter (`MaxComp`) is set to 10000, which is probably also too large. I'm guessing that overly-complex constraints are being generated, which can't be meaningfully solved and just bog down the analyzer.
3. Within the analyzer, states are removed and iterated one-by-one off of the internal worklist, in a single-thread in a single-process. I'm guessing that this means the analyzer ends up waiting on Z3 most of the time. This is also why I started running analyses in parallel on different projects, because the analysis for each project would occupy just one core.
4. The same queries end up being sent to the constraint manager despite no meaningful change in the program state, through either `assume()` or `getSymVal()`. But for expensive queries, this means that a lot of time/compute is just wasted. I think this could be improved by implementing a SMT query cache, and potentially by only adding relevant constraints to the solver state. I actually have a draft attempt at a simple query cache implementation (https://github.com/ddcc/clang/commit/e0236ff8f7b9c16dd29e8529420ab14b4309c3e6), but it's very hacky, isn't entirely memory-safe, caused some testsuite failures, and I ran out of time to dig further into it.

> How different the results are? Is it the case that you get more warnings with Z3 in most cases? Would it be possible to divide the tests into the ones that work similarly with both solvers and the ones that are only expected to report warnings with Z3? I know this won't work in general, but might be much cleaner than polluting every test with a ton of #ifdefs.

In general, there are more warnings that appear because the number of false negatives are reduced. Ignoring changes to the testcases caused by infrastructure-related issues, this entire series of patches modifies around 15 testcases, so there won't be a lot that need the conditional handling.

> I'd also like to second Ryan's suggestion to look at his patch to add support for STP. It was very close to landing other than some build system integration issues.

I actually looked through his patch while writing this one, I think the general recursive approach is pretty similar, as well as the plugin-stub implementation.

> I would definitely split the Z3Expr into expr/solver/model classes. Several advantages: plugging in new solvers becomes easier and reference counting can be handled more safely and automatically. Right now your solver+model related code handled ref-counting "by hand", which can easily lead to bugs.

The base ref-counting implementation for expr/Z3Expr is essentially the same as the one in the Z3 C++ API, except that there is no non-move/copy/named constructor. It would be good to have some abstraction for the solver and model as well, but since they are used in only two or three functions, I originally used manually ref-counting to save time.

> I would add an rvalue constructor to the expr class to avoid ref-counting when unneeded.

Done (see new revision).

> Function z3Expr::fromData() seems to be able to create arbitrarily-sized bit-vectors. I would limit to e.g. 512 bits or something like that. Z3 is super slow with huge bit-vectors. If you feel fancy (for a following version of this feature), you could encode this information lazily (based on the model).

That's true, but in practice I don't think Clang has any integer types larger than 128-bit. The created bitvector size should be equivalent to the underlying type size for the target platform.

> Z3Expr::fromInt() -- why take a string as input and not a uin64_t?

To avoid any limitations on the type size of the input value, I ended up using strings. Practically, 128-bit integers may be the only reason not to use uint64_t, although I admit I haven't tested with 128-bit integers.

> Why call simplify before asserting a constraint? That's usually a pretty slow thing to do (in my experience).

Done. This has been my first time using Z3, my original thought was to simplify constraints as early as possible, because they are stored in the program state and propagate as symbolic execution continues, but the numbers show (at least for small programs), that this is less efficient.

> You should replace Z3_mk_solver() with something fancier. If you don't feel fancy, just use Z3_mk_simple_solver() (simple is short for simplifying btw). If you know the theory of the constraints (e.g., without floats, it would be QF_BV), you should use Z3_mk_solver_for_theory() (or whatever the name was). If you feel really fancy, you could roll your own tactic (in my tools I often have a tactic class to build my own tactics; Z3 even allows you to case split on the theory after simplifications)

Done. My earlier understanding from https://github.com/Z3Prover/z3/issues/546 was to stick with `Z3_mk_solver()`, since I'm not familiar with tactics.

> Instead of asserting multiple constraints to the solver, I would and() them and assert just once (plays nicer with some preprocessors of Z3)

Done.

> Timeout: SMT solvers sometimes go crazy and take forever. I would advise to add a timeout to the solver to avoid getting caught on a trap. This timeout can be different for the two use cases: check if a path is feasible (lower timeout), or check if you actually have a bug (higher timeout)

There's currently a 15 sec timeout in the Z3ConstraintManager constructor, but that'd be a useful improvement to implement.

> Also, I didn't see anything regarding memory (I might have missed yet). Are you using the array theory? If so, I would probably advise benchmarking it carefully since Z3's support for arrays is not great (even though it has improved a bit in the last year or so).

No, all the memory modeling is handled through Clang's StoreManager (the default implementation is RegionStore). I'm not familiar with how that part works, but there are definitely some false positives there; I recall making a mailing list post about it a few months ago. As future work, it'd be interesting to compare that with Z3's array theory.

> BTW, adding small amount of constant folding in the expression builder is usually good, since you save time in Z3 preprocessor. Especially if you want to go for smallish timeouts (likely). If you want to be fancy, you can also canonicalize expressions like Z3 likes (e.g., only ule/sle, no slt, sgt, etc).

That'd definitely be useful. It seems that using Z3_simplify() is expensive, so my code should do the simpler things like constant folding and canonicalization.

https://reviews.llvm.org/D28952