[PATCH] D103317: [Analyzer][engine][solver] Simplify complex constraints

Fri Jun 4 03:07:29 PDT 2021

vsavchenko added a comment.

In D103317#2794099 <https://reviews.llvm.org/D103317#2794099>, @ASDenysPetrov wrote:

>> ! In D103317#2793797 <https://reviews.llvm.org/D103317#2793797>, @vsavchenko wrote:
>
>
>
>> I replied to you earlier that assignments are not producing constraints.  The analyzer has some sort of SSA (not really, but anyways), so every time we reassign the variable it actually becomes a different symbol. So, from this perspective `DeclRefExpr` `x` and symbol `x` are two different things:
>>
>>   int x = foo(); // after this DeclRefExpr and VarDecl x are associated with conjured symbol #1 (aka conj#1)
>>   int c = x + 10; // c is associated with `$conj#1 + 10`
>>   x = a + 1; // DeclRefExpr x is now associated with `$a + 1`
>>   x = c + x; // and now it is `conj#1 + 11`
>>
>> There is no symbolic assignment in the static analyzer.  Symbols are values, values don't change.  We can only obtain more information of what particular symbolic value represents.
>>
>> I hope it makes it more clear.
>>
>> As for equality, we already track equivalence classes of symbols and they can have more than 2 symbols in them.  Actually, this whole data structure mostly makes sense for cases when the class has more than 2 elements in it.
>
> Thanks for the clarifications.
>
> I'm trying to say that we need to keep our attention on assignments/initializations as well if we want to solve this problem efficiently. I didn't say that assignments produce constraints. Of course they do not :-). I want we produce more //bindings(associations)// on //assignments// to use them as mapping for further substitutions and traversing to get the right range.
>
> As for the equivalence classes. I think we should go away from special cases. I mean what about lower-then classes or greater-or-equal classes. I think you got what I mean. I'm trying to find a general approach which can cover almost all individual improvements around CSA codebase(about ranges).
>
> I started working on this when saw your discussion. Just what you also consider the direction I'm moving. A lot of corner-examples, like you gave me, are welcome. Thank you for the feedbacks!

Assignments and initializations don't affect anything, we shouldn't care about them at all because they don't affect symbolic values in any way.  You keep bringing up assignments/initializations when I repeatedly tell you that these are orthogonal to symbolic values, they simply associate certain symbolic values/expressions with memory regions (aka bind them).

I want to make sure you understand that there is no silver bullet here.  There is no one magic method that will resolve everything in polynomial let alone linear time (unless someone proves that NP = P).  This problem is inherently exponential.  We try to avoid it at all costs (and, thus, suck sometimes).
In general, when SMT-solvers need to solve a set of bit-vector equations, they use a technique called **bit blasting** - just turning each n-bit integer into n-separate boolean variables, and using SAT solver after that.  More advanced solvers use special theories before to reduce dimensionality of the problem, one of them is equality theory.  These theories are not limitations or "special cases", they are pieces of fundamental knowledge about the domain.  Those are as generic as you can go.  One symbolic expression can be equal to multiple symbolic expressions at the same time, it is the fact, not a special case.

I'm sorry for being too harsh here, but this problem IS harder than you think.  If you want to approach it correctly, and not waste your time, you should get these fundamentals straight.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103317/new/

https://reviews.llvm.org/D103317