[PATCH] D25963: [LoopUnroll] Implement profile-based loop peeling
David Li via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 28 12:05:03 PDT 2016
davidxl added inline comments.
================
Comment at: lib/Transforms/Utils/LoopUnroll.cpp:357
+ << " peeling loop %" << Header->getName() << " by "
+ << NV("PeelCount", PeelCount) << ".\n");
} else {
----------------
.\n --> iterations.\n
================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:63
+unsigned llvm::getPeelCount(Loop *L, unsigned LoopSize,
+ TargetTransformInfo::UnrollingPreferences *UP) {
+ if (!canPeel(L))
----------------
Is UP parameter intended to be used?
================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:121
+ BasicBlock *PreHeader, BasicBlock *Exit,
+ uint64_t &TotalLoopWeight,
+ SmallVectorImpl<BasicBlock *> &NewBlocks,
----------------
I think the following weights should be passed in:
* TotalHeaderWeight of the currently peeled loop -- it is passed in for update:
TotalHeaderWeight -= PeeledIterationHeaderWeight;
* PeeledIterationHeaderWeight
1) its initial value for the first peeled iteration is the PreheaderWeight or ExitWeight of the original loop
2) After peeling of one iteration, its value will be updated to the fallthrough weight of the peel
* TotalExitEdgeWeight -- passed in for update.
After peeling, TotalExitEdgeWeight -= PeeledIterationExitWeight'
================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:125
+ ValueToValueMapTy &LVMap, LoopInfo *LI) {
+
+ BasicBlock *Header = L->getHeader();
----------------
mkuper wrote:
> Gerolf wrote:
> > Could this be implemented as
> > - distribute k: N-k (assuming trip count N)
> > - complete unroll first k iterations?
> Anyway, yes, I've originally thought about implementing it like that (it seemed like it'd be simpler, although I'm not entirely sure there's a big difference). But I think a more direct implementation fits the flow of the unroller better.
>
> I wanted to make the "unroll / runtime unroll / peel" decision at the same point (otherwise this could be a separate pass that runs before the unroller, performs the split and marks the two loops as nounroll and force-unroll). And if this lives in the unroller - splitting and then unrolling the new loop while in the middle of "unrolling" the original one seemed awkward.
for counted loops, distribute + complete unroll may be simple, but not necessarily good for general cases such as linked-list traversal loops etc.
================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:168
+ uint64_t FallThruWeight =
+ TotalLoopWeight * ((float)(PeelCount - IterNumber) / PeelCount * 0.9);
+ uint64_t ExitWeight = TotalLoopWeight - FallThruWeight;
----------------
Should average trip count be used instead of PeelCount (which may be different)?
================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:291
+ cloneLoopBlocks(L, Iter, PeelCount, InsertTop, InsertBot, NewPreHeader,
+ Exit, ExitWeight, NewBlocks, LoopBlocks, VMap, LVMap, LI);
+ InsertTop = InsertBot;
----------------
See my comment above about what weights need to be passed for updating.
================
Comment at: lib/Transforms/Utils/LoopUnrollPeel.cpp:323
+ MDBuilder MDB(LatchBR->getContext());
+ MDNode *WeightNode = MDB.createBranchWeights(ExitWeight, ExitWeight);
+ LatchBR->setMetadata(LLVMContext::MD_prof, WeightNode);
----------------
Both headeTotalWeight and ExitEdge Weight need to be adjusted during peeling, and the new meta data can just use the updated backedge weight and exit weight.
https://reviews.llvm.org/D25963
More information about the llvm-commits
mailing list