[PATCH] D124926: [LoopInterchange] New cost model for loop interchange

Fri May 20 14:26:56 PDT 2022

congzhe added a comment.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:63

+static cl::opt<bool> EnableLegacyCostModel(
+    "loop-interchange-enable-legacy-cost-model", cl::init(true), cl::Hidden,
----------------
bmahjour wrote:
> Not sure if we need an option, I slightly prefer removing it, because the fact that we have an "EnableLegacy" option for it that defaults to "true" might incorrectly imply that we still use the old cost model where in fact we use the new one for majority of cases.
Thanks, that's a good point. I've removed this flag.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:512
+    // and populate the <Loop, index> pair into a map for constant time query
+    // later.
+    const auto &LoopCosts = CC->getLoopCosts();
----------------
bmahjour wrote:
> Add a comment about what the index reprsents (ie the optimal order of loops in the nest).
Thanks and updated accordingly.

================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:1180
+  int Cost = 0;
+  if (EnableLegacyCostModel) {
+    Cost = getInstrOrderCost();
----------------
bmahjour wrote:
> I think we should try to limit the use of legacy cost model only to cases were LoopCacheCost fails due to delinearization, as you mentioned in the description. Can we do it like this:
> 
> ```
> if (CostMap.find(InnerLoop) != CostMap.end() &&
>       CostMap.find(OuterLoop) != CostMap.end()) {
> ...
> } else {
>    Cost = getInstrOrderCost();
>     LLVM_DEBUG(dbgs() << "Cost = " << Cost << "\n");
>     if (Cost < -LoopInterchangeCostThreshold)
>       return true;
> }
> ```
Thanks, I updated the code as you suggested.

================
Comment at: llvm/test/Transforms/LoopInterchange/call-instructions.ll:2
 ; REQUIRES: asserts
-; RUN: opt < %s -basic-aa -loop-interchange -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \
+; RUN: opt < %s -mcpu=tsv110 -basic-aa -loop-interchange -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t -S \
 ; RUN:     -verify-dom-info -verify-loop-info -stats 2>&1 | FileCheck -check-prefix=STATS %s
----------------
bmahjour wrote:
> If we use `powerpc64le-unknown-linux-gnu` as the target triple, then we wouldn't need to specify `-mcpu`, since `getCacheLineSize()` returns valid cache line size for all Power subtargets.
I've now used `powerpc64le-unknown-linux-gnu` for all tests.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124926/new/

https://reviews.llvm.org/D124926