[PATCH] D63459: Loop Cache Analysis

Mon Jun 17 16:52:44 PDT 2019

Meinersbur added a comment.

What do you plan as the first use of this transformation?

================
Comment at: llvm/include/llvm/Analysis/LoopCacheAnalysis.h:30-31
+class LPMUpdater;
+using CacheCost_t = unsigned long long;
+using LoopVector = SmallVector<Loop *, 8>;
+
----------------
[style] The current code base most often uses a `Ty` suffix for typedefs.

================
Comment at: llvm/include/llvm/Analysis/LoopCacheAnalysis.h:174
+public:
+  static unsigned constexpr DefaultTripCount = 100;
+  static CacheCost_t constexpr InvalidCost = ULLONG_MAX;
----------------
[suggestion] Make these magic numbers `cl::opt` options?

================
Comment at: llvm/lib/Analysis/LoopCacheAnalysis.cpp:63-64
+
+/// Retrieve the innermost loop in the given loop nest \p Loops (we are
+/// expecting to find a single innermost loop).
+static const Loop *getInnerMostLoop(const LoopVector &Loops) {
----------------
Is passing a single innermost loop a precondition or a what this function computes? It seems the only loop that this function might return is `Loops.back()` and kind-of verifies that the loop vector is in the right order?!?

================
Comment at: llvm/lib/Analysis/LoopCacheAnalysis.cpp:307
+          dbgs().indent(2)
+          << "ERROR: failed to delinearize, can't identify  base pointer\n");
+      return false;
----------------
[nit] double space

================
Comment at: llvm/lib/Analysis/LoopCacheAnalysis.cpp:319
+
+    if (Subscripts.size() == 0 || Sizes.size() == 0 ||
+        Subscripts.size() != Sizes.size()) {
----------------
[style] `Subscripts.empty()`. `Sizes.empty()`

================
Comment at: llvm/lib/Analysis/LoopCacheAnalysis.cpp:332
+        Sizes.clear();
+        continue;
+      }
----------------
[style] Did you mean break/return here? The next iteration might add elements to it. I'd structure this as
```
if (!isOneDimensionalArray(AccessFn, L)) {
   ...
  break;
}

...
```
conforming https://llvm.org/docs/CodingStandards.html#use-early-exits-and-continue-to-simplify-code.

================
Comment at: llvm/lib/Analysis/LoopCacheAnalysis.cpp:336
+
+    return std::all_of(Subscripts.begin(), Subscripts.end(),
+                       [&](const SCEV *Subscript) {
----------------
[style] `return all_of(Subcripts, ...`

================
Comment at: llvm/lib/Analysis/LoopCacheAnalysis.cpp:386
+      (L.getExitingBlock())
+          ? SE.getSmallConstantTripCount(&L, L.getExitingBlock())
+          : CacheCost::DefaultTripCount;
----------------
Why not using `getBackedgeTakenCount()`?

================
Comment at: llvm/lib/Analysis/LoopCacheAnalysis.cpp:523-531
+  LLVM_DEBUG(dbgs() << "\nIDENTIFIED REFERENCE GROUPS:\n");
+  unsigned n = 1;
+  for (const auto &RG : RefGroups) {
+    LLVM_DEBUG(dbgs().indent(2) << "RefGroup " << n << ":\n");
+    for (const auto &IR : *RG)
+      LLVM_DEBUG(dbgs().indent(4) << *IR << "\n");
+    n++;
----------------
Since this only has code that executes in assert-builds, it should be guarded entirely by an `LLVM_DEBUG`.

================
Comment at: llvm/lib/Analysis/LoopCacheAnalysis.cpp:590-591
+  if (getInnerMostLoop(Loops) == nullptr) {
+    LLVM_DEBUG(dbgs() << "Cannot compute cache cost of loop nest with more "
+                         "than one innermost loop\n");
+    return nullptr;
----------------
If the intention is to provide analysis for innermost loops only, why this limitation? Could it just return an analysis result for each innermost loop?

If the analysis requires a global view to determine the cost for each loop, wouldn't a FunctionPass be more appropriate? Currently, it seems users first need get the LoopCacheAnalysis for a topmost loops, the ask it for one of its nested loops.

Are such loop nests not analyzable at all?
```
  while (repeat) {
    for (int i = ...)
      for (int j = ...)
        B[i][j] = ... A[i+1][j+1] ... // stencil
    for (int i = ...)
      for (int j = ...)
        A[i][j] = ... B[i+1][j+1] ... // stencil
  }
```

================
Comment at: llvm/test/Analysis/LoopCacheAnalysis/a.ll:1
+; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s
+
----------------
[serious] Can you give a more descriptive file names that `a.ll` and `b.ll` etc?

================
Comment at: llvm/test/Analysis/LoopCacheAnalysis/a.ll:4
+target datalayout = "e-m:e-i64:64-n32:64"
+target triple = "powerpc64le-unknown-linux-gnu"
+
----------------
[suggestion] Is it possible to leave the triple unspecified to the test case?

Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63459/new/

https://reviews.llvm.org/D63459