[PATCH] D152732: [clang][dataflow] Support limits on the SAT solver to force timeouts.

Mon Jun 12 13:06:24 PDT 2023

mboehme added a comment.

In D152732#4414707 <https://reviews.llvm.org/D152732#4414707>, @ymandel wrote:

> In D152732#4414661 <https://reviews.llvm.org/D152732#4414661>, @xazax.hun wrote:
>
>> Huge +1, I think most solvers need to have some resource limits in place as the runtime can explode. I am just not 100% what is the best approach here, putting a global limit on the solver instance vs having a limit per query. Any thoughts on that?
>
> Excellent question. Ultimately what matters for a user is the global limit. So, in that sense, a global limit makes sense. But, it also makes it harder (in principle) to pinpoint the problem, because you could have it timeout on a small query right after a large query that was actually responsible for consuming the resources. That said, I'm guessing in practice it's binary, because of the exponential: either a single call exhausts all resources or it barely uses them. I suspect we'll ~never hit the case I described. So, I'm inclined to start global (to keep it simple) and then refine if necessary. As you probably noticed, this patch actually has both -- the user specifies the global limit, but the implementation is local. So, changing this would be easy.

I'm a bit late to this discussion but still wanted to chime in.

I would actually argue that a local limit more accurately reflects what we want to limit. The functions we analyze will be distributed across a fairly broad range of size and complexity. It seems reasonable to allow more resources to be used to analyze a large and complex function than a small and simple function, and I think this is aligned with users' expectations. So I think it would be reasonable to allow the analysis to use an amount of resources that's proportional to the number of `solve()` invocations; we just want to limit the amount of resources consumed in a given `solve()` invocation.

I do follow your argument that "local versus global" likely won't make much of a difference in practice -- the number of `solve()` invocations is polynomial in the size of the function (I believe?), and that pales against the exponential blowup that can potentially occur inside `solve()`.

However, I don't follow the argument that you want to "start global (to keep it simple)". I think a "local" limit would be simpler: `WatchedLiteralsSolverImpl::solve()` wouldn't need to return the final value of its parameter `MaxIterations`, and `WatchedLiteralsSolver::solve()` wouldn't need to write that back into its member variable `MaxIterations` (which would instead be `const`).

I don't think, in any case, that we should have a global _and_ a local limit -- that would really be overcomplicating things.

================
Comment at: clang/unittests/Analysis/FlowSensitive/SolverTest.cpp:378
+  // A ^ B
+  EXPECT_EQ(solver.solve({A, B}).getStatus(), Solver::Result::Status::TimedOut);
+}
----------------
Do we really need such a complex formula to test this? Couldn't we make the formula simpler (can we get as simple as "a && !a"?) and reduce the maximum number of iterations accordingly so we still get a timeout?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152732/new/

https://reviews.llvm.org/D152732