[PATCH] D48283: [SCEV] Properly solve quadratic equations

Wed Jul 25 11:49:04 PDT 2018

kparzysz marked 2 inline comments as done.
kparzysz added inline comments.

================
Comment at: lib/Analysis/ScalarEvolution.cpp:8513
+  unsigned BitWidth = LC->getAPInt().getBitWidth();
+  unsigned NewWidth = BitWidth + 2;
+  LLVM_DEBUG(dbgs() << __func__ << ": addrec coeff bw: "
----------------
efriedma wrote:
> I think `BitWidth + 1` should be sufficient?  At least, that should be enough to convert the AddRec to an equation without losing any information.  (Like you note in the comments later, the equation is `L + nM + n(n-1)/2 N = 0 mod BitWidth`, or `2L + 2M n + n(n-1) N = 0 mod BitWidth+1`.)
I wanted to avoid any overflows, the motivation was `2M-N`. I suppose it cannot overflow `BitWidth+1` with `M` and `N` being `BitWidth` wide.

================
Comment at: lib/Analysis/ScalarEvolution.cpp:8518
+  M = M.sext(NewWidth);
+  L = L.sext(NewWidth);

----------------
efriedma wrote:
> Do we prefer to sign-extend just so abs(A) is small in common cases?  I think the choice of sign-extension vs zero-extension is arbitrary otherwise. Probably worth explicitly noting with a comment.
Are you talking only about this?  There is another extension in `SolveQuadraticEquation` (at line 8325). Originally there was only one sign-extension, but after reorganizing this code, it has been split into two. I kept them identical to the original one (i.e. both as sign-extension).

In general the choice of sign-extension was actually to avoid "false positives" when checking for overflow.  For example, take q(x) = (x+1)(x-2) = x^2 - x - 2, in i3.  The zeros are -1 and 2, q(0) = q(1) = -2 (i.e. nothing exceeding the signed range of i3), so we can expect the calculated solution to be 2. With zero-extended values, the function would be r(x) = x^2 + 7x + 6 with both solutions negative: -6 and -1. The equation r(x) = 0 would then become r(x) = 8 to solve for an unsigned overflow, i.e. x^2 + 7x - 2 = 0. The root of that is approx. 0.27, so the returned solution would be 1.

Repository:
  rL LLVM

https://reviews.llvm.org/D48283