[PATCH] D89692: [SCEV] SCEVPtrToIntExpr simplifications

Mon Oct 26 05:27:46 PDT 2020

lebedev.ri added inline comments.

================
Comment at: llvm/test/Analysis/ScalarEvolution/ptrtoint.ll:501
 ; X32-NEXT:    %i9 = ptrtoint i32* %i7 to i64
-; X32-NEXT:    --> (zext i32 (ptrtoint i32* {%arg,+,4}<nuw><%bb6> to i32) to i64) U: [0,4294967296) S: [0,4294967296) Exits: (zext i32 (ptrtoint i32* ((4 * ((-4 + (-1 * %arg) + %arg1) /u 4))<nuw> + %arg) to i32) to i64) LoopDispositions: { %bb6: Computable }
+; X32-NEXT:    --> {(zext i32 (ptrtoint i32* %arg to i32) to i64),+,4}<nuw><%bb6> U: [0,8589934588) S: [0,8589934588) Exits: ((zext i32 (ptrtoint i32* %arg to i32) to i64) + (4 * ((zext i32* (-4 + (-1 * %arg) + %arg1) to i64) /u 4))<nuw><nsw>) LoopDispositions: { %bb6: Computable }
 ; X32-NEXT:    %i10 = sub i64 %i9, %i4
----------------
lebedev.ri wrote:
> mkazantsev wrote:
> > Regression in range estimates.
> Originally we had pointer-typed AddRec there: `{%arg,+,4}<nuw><%bb6>`, which is:
> ```
> ; X32-NEXT:    %i7 = phi i32* [ %arg, %bb3 ], [ %i15, %bb6 ]
> ; X32-NEXT:    --> {%arg,+,4}<nuw><%bb6> U: full-set S: full-set Exits: ((4 * ((-4 + (-1 * %arg) + %arg1) /u 4))<nuw> + %arg) LoopDispositions: { %bb6: Computable }
> ```
> It's constant range is 32-bit fullset, because we don't have any range knowledge for `%arg`.
> 
> Then we had a cast of it to pointer: `(zext i32 (ptrtoint i32* {%arg,+,4}<nuw><%bb6> to i32) to i64)`,
> and zero-extended it to 64-bit.
> So is obvious that for the old expression, constant range `[0,4294967296)` is correct.
> 
> But now we start with `zext i32 (ptrtoint i32* %arg to i32) to i64` (which is `[0,4294967296)`)
> and only then we have an AddRec, and in 64 bits now.
> For the AddRec, the range is `[Base + Step*max backedge-taken count, End + Step*max backedge-taken count)`.
> For 32-bit AddRec, that computed to full-set that we then zero-extended,
> but here it obviously computes to `[0, 4294967296 + 4*1073741823]`.
> 
> So that's it, i believe. ptrtoint is essentially a red herring here,
> we should have this problem every time we sink z/s-ext into an AddRec.
Edit: what i forgot to add: `SCEVPtrToIntCast` is intentionally not bitwidth-changing itself,
and sinking it does not affect the computations. The important bit is that `getZeroExtendExpr()`
decided that it was okay to sink the zero-extension into SCEVAddRecExpr,
and that is not something i changed in this patch.

It doesn't do that indiscriminately:
```
  // If the input value is a chrec scev, and we can prove that the value
  // did not overflow the old, smaller, value, we can zero extend all of the
  // operands (often constants).  This allows analysis of something like
  // this:  for (unsigned char X = 0; X < 100; ++X) { int Y = X; }
  if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Op))
```
Clearly, there was `NUW` on that SCEVAddRecExpr, so the heuristic passed.

But obviously i32 SCEVAddRecExpr w/NUW and i64 SCEVAddRecExpr w/NUW result in
different possible ranges, and we lost the knowledge that NUW was for i32 bitwidth.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89692/new/

https://reviews.llvm.org/D89692