[PATCH] D144226: [Loop-Interchange] Allow inner-loop only reductions
Michael Kruse via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 17 12:48:48 PDT 2023
Meinersbur added inline comments.
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:279-284
+ struct PHIUseChain {
+ LoadInst *Load;
+ PHINode *LoopHeaderPHI;
+ PHINode *LoopExitPHI;
+ StoreInst *Store;
+ };
----------------
Can you document this data structure and its fields?
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:362
+ // Set of Removable PHI chains
+ SmallVector<PHIUseChain, 8> RemovablePHIs;
};
----------------
I think "RemovablePHI" is the wrong name. The PHIs are not just removed, but demoted. Suggestion: "SinkablePHIs" ("sink" as in "sink into loop body").
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:536
if (CC != nullptr) {
const auto &LoopCosts = CC->getLoopCosts();
for (unsigned i = 0; i < LoopCosts.size(); i++) {
----------------
In the loop optimization group call we discussed whether the CacheCost analysis will have the wrong result because it only sees the promoted reduction register, not the additional load/stores in the innermost loop.
After thinking about it, CacheCost is defined in number of touched cache lines: This does not change for the outer loop, and the inner loop only accesses the same cache line repeatedly, so I don't think it matters.
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:730
}
-
return nullptr;
----------------
[nit] unrelated change
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:846
+ for (auto I = Begin; I != End; I++)
+ if ((isa<LoadInst>(I) || isa<StoreInst>(I)) &&
+ DI->depends(SI, &*I, false))
----------------
There are other instructions that can access memory, e.g. CallInst (as in `memset(...)`). `tightlyNested` might already check for such instructions; if so, please add a comment about this here.
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:847
+ if ((isa<LoadInst>(I) || isa<StoreInst>(I)) &&
+ DI->depends(SI, &*I, false))
+ return true;
----------------
DI might not be necessary there. Since the idea is to undo what LICM does, you can use the same-strength analysis as LICM. LICM uses AliasSetTracker and MemorySSA, i.e. just asking AliasAnalysis/MemorySSA whether the pointers cannot alias (which DI does internally as well) should be sufficient.
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:855
+ // 2. Store and instructions from L's ExitBlock to Store.
+ if (conflicts(++Load->getIterator(),
+ L->getLoopPreheader()->getTerminator()->getIterator(), Store) ||
----------------
`++` modifies the iterator. Might have consequences if `getIterator()` returns a reference.
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:884
+ // Load should only be used by the PHI node.
+ assert(Load->isSimple() && "Expecting a simple load!");
+ if (!Load->hasOneUser())
----------------
If I am not mistaken, there is no previous check whether all accesses are simple already. That is, it will crash ind debug builds if someone uses e.g. `std::atomic`. I think it should also `return False` in this case.
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:889-891
+ if (PromotedVarUser == PHI) {
+ continue;
+ }
----------------
[style]
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:909
+ assert(Store->isSimple() && "Expecting a simple store!");
+ // Value *Addr = Store->getPointerOperand();
+ if (Load->getPointerOperand() == Store->getPointerOperand()) {
----------------
Leftover code
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:945
+ // Insert a copy of Load in the Loop Header
+ Instruction *LI = Load->clone();
+ LI->setName(Load->getName() + ".demoted");
----------------
It seems unnecessary to clone the instruction and create a new one? Why not using the existing instruction and move it into the loop?
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:954
+ SI->setOperand(0, LoopExitPHI->getIncomingValue(0));
+ // SI->insertAfter(dyn_cast<Instruction>(LoopExitPHI->getIncomingValue(0)));
+ SI->insertBefore(&*InnerLoop->getLoopLatch()->getTerminator());
----------------
[nit] Leftover code
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:957-959
+ Store->eraseFromParent();
+ LoopHeaderPHI->eraseFromParent();
+ Load->eraseFromParent();
----------------
Shouldn't the `LoopExitPHI` be removed as well?
================
Comment at: llvm/lib/Transforms/Scalar/LoopInterchange.cpp:1067
+ Instruction *UI = dyn_cast<Instruction>(U);
+ return !isRemovableInst(UI) && UI->getParent() &&
+ (!PN || (!OuterInnerReductions.count(PN) &&
----------------
Unless you removed it yourself, shouldn't all instructions have a parent?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144226/new/
https://reviews.llvm.org/D144226
More information about the llvm-commits
mailing list