[clang] [analyzer][solver] On SymSym RelOps, check EQClass members for contradictions (PR #71284)

Mon Nov 6 04:13:28 PST 2023

steakhal wrote:

> @DonatNagyE  The most straightforward issue that I see is that (if I understand the code correctly) the intersected constraint (the value of the variable `Constraint` at the end of `handleEquivalentAlternativeSymOperands()`) is just discarded after checking that it's not empty. It's plausible that this discarded information will be re-calculated by this process in the follow-up range calculations and don't have a concrete testcase where it's relevant that we "leave food on the table" (I can try to find one if you'd find it helpful) but I think that it may be useful to think about "preserving" this `Constraint` that we calculated. (This is not a blocking issue: I can support merging the current logic if you don't see a reasonably simple improvement in this area.)

I tried adding this hunk at the end of the function, but the assertion never fired:
```c++
  // If we learned something, save it.
  if (Constraint != OriginalConstraint) {
    assert(false);
    State = assign(EquivalenceClass::find(State, SymSym), Constraint);
    return static_cast<bool>(State);
  }
```

> There is also the choice that you limited the runtime to `|eqclass(X)| + |eqclass(Y)|` by only checking the alternative constraints where one of the two representatives is the original one. I think that here it's a good decision to ensure a reasonable runtime by potentially discarding some information, but I feel that it'd be good to document this decision with a testcase. For example, I think
> 
> ```
> void gh_62215_left_and_right(int x, int y, int z, int w) {
>   if (x != y) return; // x == y
>   if (z != w) return; // z == w
>   if (z <= x) return; // z > x
>   if (w >= y) return; // w < y
> }
> ```

Fair point. Actually, originally I wanted a full cross-product of the alternatives, which I refined to a half-cross-product by following Gábor's comments. We shall come back here and do something more refined if there is a practical need.
I added a FIXME comment explaining our options, along with the test case you proposed. Thanks.

> Finally there is the "why do we need this trickery at all?" question that was also raised by @Xazax-hun. I agree with him that on a longer term, it'd be good to "strengthen" the equality classes and ensure that we assign the same RangeSet to each member of an EQClass (or, in other words, we assign the RangeSet to an EQClass instead of an individual symbol). This would potentially improve the runtime (although I'm not sure that performance is especially relevant here) and handle the left-and-right testcase adequately.

I think we haven't discussed yet the approach of applying the constraint to every eqclass member. That would feel like defeating the purpose of eqclasses at all.
To me, our best option so far is to keep the representative symbols alive and canonicalize (replace) all sub-symbols in constraints with representatives whenever we merge eqclasses or add constraints.

Thanks for the reviews folks!

https://github.com/llvm/llvm-project/pull/71284