[llvm] Fix loop cache cost to avoid cost of zero for refgroups. (PR #88915)

Tue Apr 16 19:26:08 PDT 2024

================
@@ -299,8 +299,15 @@ CacheCostTy IndexedReference::computeRefCost(const Loop &L,
     Stride = SE.getNoopOrAnyExtend(Stride, WiderType);
     TripCount = SE.getNoopOrZeroExtend(TripCount, WiderType);
     const SCEV *Numerator = SE.getMulExpr(Stride, TripCount);
-    RefCost = SE.getUDivExpr(Numerator, CacheLineSize);
-
+    ConstantInt *One =
+        ConstantInt::get(TripCount->getType()->getContext(), APInt(32, 1));
+    bool IsZero =
+        SE.isKnownPredicate(ICmpInst::ICMP_ULT, Numerator, CacheLineSize);
+    // When the result is zero, round it up to one because at least one cache
+    // line must be used. Reporting that zero cache lines are used does not make
+    // sense.
+    RefCost =
+        IsZero ? SE.getSCEV(One) : SE.getUDivExpr(Numerator, CacheLineSize);
----------------
nikic wrote:

I am not familiar with this code, but possibly what you are looking for is actually getUDivCeilSCEV? That would round the division up instead of down.
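
To make the difference concrete, here is a minimal sketch on plain unsigned integers rather than SCEV expressions. The helper names (floorDiv, clampedDiv, ceilDiv) are hypothetical and exist only for this illustration; clampedDiv is meant to mirror the patch's "round zero up to one" behaviour, assuming the ICMP_ULT check can always be decided, and ceilDiv mirrors what a rounding-up division would compute.

```cpp
#include <cstdint>

// Plain truncating division, analogous to the original getUDivExpr result.
constexpr std::uint64_t floorDiv(std::uint64_t Num, std::uint64_t Den) {
  return Num / Den;
}

// Analogue of the patch: if Num < Den the quotient would be zero, so return
// one instead; otherwise keep the truncated quotient.
constexpr std::uint64_t clampedDiv(std::uint64_t Num, std::uint64_t Den) {
  return Num < Den ? 1 : Num / Den;
}

// Division rounded up, written so that it cannot overflow.
constexpr std::uint64_t ceilDiv(std::uint64_t Num, std::uint64_t Den) {
  return Num / Den + (Num % Den != 0 ? 1 : 0);
}

// Stride * TripCount = 32 bytes, 64-byte cache line: both fixes give 1.
static_assert(floorDiv(32, 64) == 0, "truncation yields zero");
static_assert(clampedDiv(32, 64) == 1, "clamped to one");
static_assert(ceilDiv(32, 64) == 1, "rounded up to one");

// 96 bytes: the clamp still reports one line, rounding up reports two.
static_assert(clampedDiv(96, 64) == 1, "clamp leaves truncation in place");
static_assert(ceilDiv(96, 64) == 2, "ceil counts the partial line");
```

With these definitions the two approaches agree whenever the numerator is a nonzero multiple of the cache line size or is smaller than it, and differ for non-multiples above it (96 bytes gives 1 versus 2). They also differ for a numerator of zero, where the clamp yields 1 and the rounded-up division yields 0.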

https://github.com/llvm/llvm-project/pull/88915