[PATCH] D89692: [SCEV] SCEVPtrToIntExpr simplifications

Tue Oct 27 06:06:41 PDT 2020

lebedev.ri added inline comments.

================
Comment at: llvm/lib/Analysis/ScalarEvolution.cpp:1058
+  if (const SCEVNAryExpr *NaryExpr = dyn_cast<SCEVNAryExpr>(Op)) {
+    SmallVector<const SCEV *, 2> NewOps;
+    NewOps.reserve(NaryExpr->getNumOperands());
----------------
mkazantsev wrote:
> nit: could go with `SmallVector<const SCEV *, 2> NewOps(NaryExpr->getNumOperands())`
That doesn't reserve thought, but creates a vector with that many `nullptr`s.
And afterwards we'd have to do something like
```
for(int I = 0; I != NaryExpr->getNumOperands(); ++I) {
  const auto& Op = NaryExpr->getOperand(I);
  NewOps[I] = Op->getType()->isPointerTy()
                           ? getPtrToIntExpr(Op, IntPtrTy, Depth + 1)
                           : Op;
}
```
which seems strictly worse to me. So i'll keep this as-is.

================
Comment at: llvm/lib/Analysis/ScalarEvolution.cpp:1085
+    case scUDivExpr:
+    // NewNaryExpr = getUDivExpr(NewOps[0], NewOps[1]);
+    // break;
----------------
mkazantsev wrote:
> Is it commented on purpose?
Mainly because i was unsuccessful with acquiring a test case.
But as i have already noted in https://reviews.llvm.org/D89692#2352566,
this hand-written logic will hopefully go away afterwards.

But for now, let's just err on the safe side, and have dead/untested code.

================
Comment at: llvm/test/Analysis/ScalarEvolution/ptrtoint.ll:501
 ; X32-NEXT:    %i9 = ptrtoint i32* %i7 to i64
-; X32-NEXT:    --> (zext i32 (ptrtoint i32* {%arg,+,4}<nuw><%bb6> to i32) to i64) U: [0,4294967296) S: [0,4294967296) Exits: (zext i32 (ptrtoint i32* ((4 * ((-4 + (-1 * %arg) + %arg1) /u 4))<nuw> + %arg) to i32) to i64) LoopDispositions: { %bb6: Computable }
+; X32-NEXT:    --> {(zext i32 (ptrtoint i32* %arg to i32) to i64),+,4}<nuw><%bb6> U: [0,8589934588) S: [0,8589934588) Exits: ((zext i32 (ptrtoint i32* %arg to i32) to i64) + (4 * ((zext i32* (-4 + (-1 * %arg) + %arg1) to i64) /u 4))<nuw><nsw>) LoopDispositions: { %bb6: Computable }
 ; X32-NEXT:    %i10 = sub i64 %i9, %i4
----------------
mkazantsev wrote:
> lebedev.ri wrote:
> > lebedev.ri wrote:
> > > mkazantsev wrote:
> > > > Regression in range estimates.
> > > Originally we had pointer-typed AddRec there: `{%arg,+,4}<nuw><%bb6>`, which is:
> > > ```
> > > ; X32-NEXT:    %i7 = phi i32* [ %arg, %bb3 ], [ %i15, %bb6 ]
> > > ; X32-NEXT:    --> {%arg,+,4}<nuw><%bb6> U: full-set S: full-set Exits: ((4 * ((-4 + (-1 * %arg) + %arg1) /u 4))<nuw> + %arg) LoopDispositions: { %bb6: Computable }
> > > ```
> > > It's constant range is 32-bit fullset, because we don't have any range knowledge for `%arg`.
> > > 
> > > Then we had a cast of it to pointer: `(zext i32 (ptrtoint i32* {%arg,+,4}<nuw><%bb6> to i32) to i64)`,
> > > and zero-extended it to 64-bit.
> > > So is obvious that for the old expression, constant range `[0,4294967296)` is correct.
> > > 
> > > But now we start with `zext i32 (ptrtoint i32* %arg to i32) to i64` (which is `[0,4294967296)`)
> > > and only then we have an AddRec, and in 64 bits now.
> > > For the AddRec, the range is `[Base + Step*max backedge-taken count, End + Step*max backedge-taken count)`.
> > > For 32-bit AddRec, that computed to full-set that we then zero-extended,
> > > but here it obviously computes to `[0, 4294967296 + 4*1073741823]`.
> > > 
> > > So that's it, i believe. ptrtoint is essentially a red herring here,
> > > we should have this problem every time we sink z/s-ext into an AddRec.
> > Edit: what i forgot to add: `SCEVPtrToIntCast` is intentionally not bitwidth-changing itself,
> > and sinking it does not affect the computations. The important bit is that `getZeroExtendExpr()`
> > decided that it was okay to sink the zero-extension into SCEVAddRecExpr,
> > and that is not something i changed in this patch.
> > 
> > It doesn't do that indiscriminately:
> > ```
> >   // If the input value is a chrec scev, and we can prove that the value
> >   // did not overflow the old, smaller, value, we can zero extend all of the
> >   // operands (often constants).  This allows analysis of something like
> >   // this:  for (unsigned char X = 0; X < 100; ++X) { int Y = X; }
> >   if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(Op))
> > ```
> > Clearly, there was `NUW` on that SCEVAddRecExpr, so the heuristic passed.
> > 
> > But obviously i32 SCEVAddRecExpr w/NUW and i64 SCEVAddRecExpr w/NUW result in
> > different possible ranges, and we lost the knowledge that NUW was for i32 bitwidth.
> > 
> Thanks for clarifying this. Actually that looks familiar. Just curious, could you please check if the problem goes away with https://reviews.llvm.org/D89381 and flag `scalar-evolution-use-expensive-range-sharpening` turned on?
I just tried, and that (the patch + flag) does not help here.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89692/new/

https://reviews.llvm.org/D89692