[PATCH] D109821: [ScalarEvolution] Infer loop max trip count from array accesses

Mon Sep 27 15:47:07 PDT 2021

reames added inline comments.

================
Comment at: llvm/include/llvm/Analysis/ScalarEvolution.h:794
+  /// Returns 0 if the trip count is unknown.
+  unsigned getConstantMaxTripCountFromArray(const Loop *L);
+
----------------
bin.cheng wrote:
> mkazantsev wrote:
> > I think it's a bad API for this. The methods above has an excuse of using word "small", assuming that they don't care about "big" trip counts (for whatever reason that is).
> > 
> > Your method sounds like it just computes an upper bound of trip count from array, whatever this value is. What if this value does not fit in unsigned? Any reasons we can't return an `Optional<APInt>`?
> Right, it's possible the computed trip count can't be expressed by unsigned type.  Either we should return Optional<APInt> or rename it as getSmallConstantMaxTripCountFromArray and only returns non-zero if trip count can be expressed by unsigned type.
> I can see two advantages with the latter choice:
> 1\ be consistent with existing interfaces.
> 2\ the interface is mostly designed to be used to avoid(or enable) various loop transformations in compilation time unknown small trip count. (like vectorization, loop idiom, more accurate BPI, etc..)
> 
> Do you have any preference on this?
> 
> Thanks
I'd suggest switching this interface to simply return a SCEV - specifically a getUMinOfMismatchTypes.  

If we later want a version which only returns small constants (per the current definition), we can.

Also, please be *very* clear in the description of whether this is returning a backedge taken count (e.g. exit count) or a trip count.  The existing small constant interfaces are generally trip counts.  (In particular, zero has *very* different meanings for the two types of counts.)

================
Comment at: llvm/lib/Analysis/ScalarEvolution.cpp:7262
+    //  └────►Exit               └────►Exit
+    unsigned Delta = 1;
+    if (DT.dominates(BB, LoopExiting))
----------------
You're getting too complicated for this in your initial patch.  Add a bailout before the loop and simply return unknown if latch != exiting block.  You can come back and add the complexity in a following patch if warranted.  

================
Comment at: llvm/lib/Analysis/ScalarEvolution.cpp:7293
+      // Or repeat memory opreation.
+      if (!Step->getAPInt().isSingleWord() ||
+          Step->getAPInt().getZExtValue() !=
----------------
A more idiomatic way to write this check pattern would be:
if (Step != ElemSize || Step->isZero() || Step->getAPInt().isNegative() || Step->getAPInt()->.getActiveBits() > 32)

================
Comment at: llvm/lib/Analysis/ScalarEvolution.cpp:7303
+      if (!AllocateInst || !AllocateInst->isStaticAlloca() ||
+          L->contains(AllocateInst->getParent()))
+        continue;
----------------
This can't be true for a static alloca.

================
Comment at: llvm/lib/Analysis/ScalarEvolution.cpp:7312
+      auto *ElemType = dyn_cast<ArrayType>(Ty)->getElementType();
+      if (!ElemType || DL.getTypeAllocSize(ElemType).isScalable())
+        continue;
----------------
Its not clear to me why we care about the structure of the allocated type.  If we can get the allocated size, why do we care what the raw element type is?

OH, you've reintroduced the bug I pointed out before.  You seem to be expecting ElementType == GEPType, but you never actually check that.  Given that potential mismatch, you have a potential unit confusion.  

Please either fix the bug - if I'm correct - or add appropriate tests and asserts to make it obvious this is not possible.  

================
Comment at: llvm/lib/Analysis/ScalarEvolution.cpp:7315
+
+      // Now we can infer a max excute time by MemLength/StepLength.
+      const SCEV *MemSize =
----------------
bin.cheng wrote:
> mkazantsev wrote:
> > Is max execute time a same thing as trip count, or it's something else? If they are the same, please use "trip count" as something that is more understandable.
> IIUC, it depends on the location of array reference to exiting block.  If it doesn't dominate exiting block, then this "max execute time" could be "tripcount - 1".  So the meaning for "max execute time" is the max times that the array reference can be executed in terms of SCEV derived from loop iv, without triggering UB.
This really sounds like you're mixing up the notions of backedge taken counts and trip counts.  It's really important you get the description right here, as this API is useless to anyone else without a clear definition of what exactly it's bounding.  

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109821/new/

https://reviews.llvm.org/D109821