[PATCH] D89692: [SCEV] SCEVPtrToIntExpr simplifications

Tue Oct 27 03:24:37 PDT 2020

mkazantsev added inline comments.

================
Comment at: llvm/test/Analysis/ScalarEvolution/ptrtoint.ll:501
 ; X32-NEXT:    %i9 = ptrtoint i32* %i7 to i64
-; X32-NEXT:    --> (zext i32 (ptrtoint i32* {%arg,+,4}<nuw><%bb6> to i32) to i64) U: [0,4294967296) S: [0,4294967296) Exits: (zext i32 (ptrtoint i32* ((4 * ((-4 + (-1 * %arg) + %arg1) /u 4))<nuw> + %arg) to i32) to i64) LoopDispositions: { %bb6: Computable }
+; X32-NEXT:    --> {(zext i32 (ptrtoint i32* %arg to i32) to i64),+,4}<nuw><%bb6> U: [0,8589934588) S: [0,8589934588) Exits: ((zext i32 (ptrtoint i32* %arg to i32) to i64) + (4 * ((zext i32* (-4 + (-1 * %arg) + %arg1) to i64) /u 4))<nuw><nsw>) LoopDispositions: { %bb6: Computable }
 ; X32-NEXT:    %i10 = sub i64 %i9, %i4
----------------
lebedev.ri wrote:
> lebedev.ri wrote:
> > mkazantsev wrote:
> > > Regression in range estimates.
> > Originally we had pointer-typed AddRec there: `{%arg,+,4}<nuw><%bb6>`, which is:
> > ```
> > ; X32-NEXT:    %i7 = phi i32* [ %arg, %bb3 ], [ %i15, %bb6 ]
> > ; X32-NEXT:    --> {%arg,+,4}<nuw><%bb6> U: full-set S: full-set Exits: ((4 * ((-4 + (-1 * %arg) + %arg1) /u 4))<nuw> + %arg) LoopDispositions: { %bb6: Computable }
> > ```
> > It's constant range is 32-bit fullset, because we don't have any range knowledge for `%arg`.
> > 
> > Then we had a cast of it to pointer: `(zext i32 (ptrtoint i32* {%arg,+,4}<nuw><%bb6> to i32) to i64)`,
> > and zero-extended it to 64-bit.
> > So is obvious that for the old expression, constant range `[0,4294967296)` is correct.
> > 
> > But now we start with `zext i32 (ptrtoint i32* %arg to i32) to i64` (which is `[0,4294967296)`)
> > and only then we have an AddRec, and in 64 bits now.
> > For the AddRec, the range is `[Base + Step*max backedge-taken count, End + Step*max backedge-taken count)`.
> > For 32-bit AddRec, that computed to full-set that we then zero-extended,
> > but here it obviously computes to `[0, 4294967296 + 4*1073741823]`.
> > 
> > So that's it, i believe. ptrtoint is essentially a red herring here,
> > we should have this problem every time we sink z/s-ext into an AddRec.
> Edit: what i forgot to add: `SCEVPtrToIntCast` is intentionally not bitwidth-changing itself,
> and sinking it does not affect the computations. The important bit is that `getZeroExtendExpr()`
> decided that it was okay to sink the zero-extension into SCEVAddRecExpr,
> and that is not something i changed in this patch.
> 
> It doesn't do that indiscriminately:
> ```
>   // If the input value is a chrec scev, and we can prove that the value
>   // did not overflow the old, smaller, value, we can zero extend all of the
>   // operands (often constants).  This allows analysis of something like
>   // this:  for (unsigned char X = 0; X < 100; ++X) { int Y = X; }
>   if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Op))
> ```
> Clearly, there was `NUW` on that SCEVAddRecExpr, so the heuristic passed.
> 
> But obviously i32 SCEVAddRecExpr w/NUW and i64 SCEVAddRecExpr w/NUW result in
> different possible ranges, and we lost the knowledge that NUW was for i32 bitwidth.
> 
Thanks for clarifying this. Actually that looks familiar. Just curious, could you please check if the problem goes away with https://reviews.llvm.org/D89381 and flag `scalar-evolution-use-expensive-range-sharpening` turned on?

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89692/new/

https://reviews.llvm.org/D89692