[PATCH] D50985: [SCEV] LoopsUsed memoization
Roman Tereshin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Aug 20 11:41:22 PDT 2018
rtereshin created this revision.
rtereshin added reviewers: sanjoy, mkazantsev, efriedma.
Herald added a subscriber: javed.absar.
Currently ScalarEvolution::getUsedLoops traverses the whole SCEV expression on every call
without caching the results. Because `addToLoopUseLists` calls it for every new SCEV node,
this is time consuming. This patch adds a memoization map to speed up the calls and tries
to do so in the least invasive manner possible.
This partially addresses https://bugs.llvm.org/show_bug.cgi?id=32731
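As a rough illustration of the memoization described above (a standalone sketch, not the patch itself): `Loop`, `Expr`, `LoopsUsedCache` and `registerNode` below are hypothetical stand-ins for the LLVM types and for `addToLoopUseLists`, and `std::set`/`std::unordered_map` replace `SmallPtrSet`/`DenseMap`. It assumes, as the patch does, that every node is registered exactly once and only after its operands.

```cpp
#include <cassert>
#include <set>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for llvm::Loop and llvm::SCEV; not the LLVM types.
struct Loop { int Id; };

struct Expr {
  const Loop *L = nullptr;            // non-null only for AddRec-like nodes
  std::vector<const Expr *> Operands; // operands are created before users
};

class LoopsUsedCache {
  // Plays the role of the LoopsRefd map: node -> loops it transitively uses.
  std::unordered_map<const Expr *, std::set<const Loop *>> LoopsRefd;

public:
  unsigned NodesVisited = 0; // nodes walked without a cache hit (illustration)

  // Call exactly once per new node, after its operands were registered.
  const std::set<const Loop *> &registerNode(const Expr *E) {
    assert(LoopsRefd.find(E) == LoopsRefd.end() && "node registered twice");
    // Unlike the patch, fill a local set and move it into the map afterwards,
    // so no check against aliasing the output set is needed.
    std::set<const Loop *> Used;
    collect(E, Used);
    return LoopsRefd.emplace(E, std::move(Used)).first->second;
  }

private:
  void collect(const Expr *E, std::set<const Loop *> &Used) {
    auto It = LoopsRefd.find(E);
    if (It != LoopsRefd.end()) {
      // Cache hit: reuse the memoized set and do not descend further.
      Used.insert(It->second.begin(), It->second.end());
      return;
    }
    ++NodesVisited;
    if (E->L)
      Used.insert(E->L);
    for (const Expr *Op : E->Operands)
      collect(Op, Used);
  }
};
```

With operands registered first, each `registerNode` call walks only the new node itself and merges the cached sets of its direct operands, instead of re-traversing the whole expression.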
On the `large-SCEVs-shallow-getSCEV-stack.ll` test case attached to the Bugzilla bug I see
a ~70% reduction in the overall (wall) time of the
time ./bin/opt -slsr large-SCEVs-shallow-getSCEV-stack.ll -o /dev/null
run (about 98% of the remaining run time, give or take, is still spent in `checkValidity`).
CTMark shows either a small improvement or a change within noise; it is hard to tell even
with 100 runs (x86):
| Name | Prev (min) | Current (min) | % | Δ | MAD | | Prev (median) | Current (median) | % | Δ | MAD |
| --------------------------------------------- | ------- | ------- | ------ | ------- | ------ | | ------- | ------- | ------ | ------- | ------ |
| CTMark/ClamAV/clamscan | 9.8202 | 9.8004 | -0.20% | -0.0198 | 0.0754 | | 9.9054 | 9.8899 | -0.16% | -0.0155 | 0.0754 |
| CTMark/kimwitu++/kc | 11.0440 | 11.0182 | -0.23% | -0.0258 | 0.0268 | | 11.0805 | 11.0802 | 0.00% | -0.0003 | 0.0268 |
| CTMark/tramp3d-v4/tramp3d-v4 | 11.6199 | 11.5813 | -0.33% | -0.0386 | 0.0388 | | 11.6906 | 11.6903 | 0.00% | -0.0003 | 0.0388 |
| CTMark/7zip/7zip-benchmark | 27.3075 | 27.2473 | -0.22% | -0.0602 | 0.1511 | | 27.8942 | 27.8941 | 0.00% | -0.0001 | 0.1511 |
| CTMark/sqlite3/sqlite3 | 4.7515 | 4.7446 | -0.15% | -0.0069 | 0.0278 | | 4.7894 | 4.7893 | 0.00% | -0.0001 | 0.0278 |
| CTMark/7zip/7zip-benchmark-link | 0.0443 | 0.0438 | -1.13% | -0.0005 | 0.0010 | | 0.0457 | 0.0457 | 0.00% | 0.0000 | 0.0010 |
| CTMark/Bullet/bullet-link | 0.0320 | 0.0319 | -0.31% | -0.0001 | 0.0006 | | 0.0329 | 0.0329 | 0.00% | 0.0000 | 0.0006 |
| CTMark/ClamAV/clamscan-link | 0.0234 | 0.0234 | 0.00% | 0.0000 | 0.0003 | | 0.0238 | 0.0238 | 0.00% | 0.0000 | 0.0003 |
| CTMark/SPASS/SPASS-link | 0.0227 | 0.0227 | 0.00% | 0.0000 | 0.0002 | | 0.0230 | 0.0230 | 0.00% | 0.0000 | 0.0002 |
| CTMark/consumer-typeset/consumer-typeset-link | 0.0239 | 0.0239 | 0.00% | 0.0000 | 0.0003 | | 0.0246 | 0.0246 | 0.20% | 0.0000 | 0.0003 |
| CTMark/kimwitu++/kc-link | 0.0528 | 0.0528 | 0.00% | 0.0000 | 0.0003 | | 0.0534 | 0.0534 | 0.09% | 0.0000 | 0.0003 |
| CTMark/lencod/lencod-link | 0.0238 | 0.0238 | 0.00% | 0.0000 | 0.0003 | | 0.0246 | 0.0246 | 0.00% | 0.0000 | 0.0003 |
| CTMark/mafft/pairlocalalign-link | 0.0174 | 0.0174 | 0.00% | 0.0000 | 0.0001 | | 0.0176 | 0.0176 | 0.28% | 0.0000 | 0.0001 |
| CTMark/sqlite3/sqlite3-link | 0.0141 | 0.0141 | 0.00% | 0.0000 | 0.0001 | | 0.0143 | 0.0143 | 0.00% | 0.0000 | 0.0001 |
| CTMark/tramp3d-v4/tramp3d-v4-link | 0.0238 | 0.0238 | 0.00% | 0.0000 | 0.0001 | | 0.0240 | 0.0240 | 0.00% | 0.0000 | 0.0001 |
| CTMark/consumer-typeset/consumer-typeset | 7.6772 | 7.6499 | -0.36% | -0.0273 | 0.0343 | | 7.7051 | 7.7058 | 0.01% | 0.0006 | 0.0343 |
| CTMark/mafft/pairlocalalign | 4.9020 | 4.8927 | -0.19% | -0.0093 | 0.0199 | | 4.9349 | 4.9355 | 0.01% | 0.0006 | 0.0199 |
| CTMark/SPASS/SPASS | 9.0386 | 9.0233 | -0.17% | -0.0153 | 0.0716 | | 9.1167 | 9.1181 | 0.02% | 0.0014 | 0.0716 |
| CTMark/lencod/lencod | 8.5570 | 8.5427 | -0.17% | -0.0143 | 0.0397 | | 8.5944 | 8.5960 | 0.02% | 0.0016 | 0.0397 |
| CTMark/Bullet/bullet | 20.3327 | 20.3012 | -0.15% | -0.0315 | 0.1420 | | 20.7366 | 20.7445 | 0.04% | 0.0079 | 0.1420 |
(The left half of the table aggregates with the minimum, the right half with the median; both halves use the same 100 samples.
Rows are sorted by the Δ of the medians, i.e. the absolute difference rather than the percentage.)
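For reference, the aggregates in the table can be reproduced along these lines. This is only a sketch: it assumes "MAD" means median absolute deviation, and the helper names (`minimum`, `median`, `mad`, `compare`) are made up for the illustration rather than taken from any benchmarking tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a non-empty sample; takes a copy because it sorts.
double median(std::vector<double> V) {
  assert(!V.empty());
  std::sort(V.begin(), V.end());
  std::size_t N = V.size();
  return N % 2 ? V[N / 2] : (V[N / 2 - 1] + V[N / 2]) / 2.0;
}

// Minimum of a non-empty sample.
double minimum(const std::vector<double> &V) {
  assert(!V.empty());
  return *std::min_element(V.begin(), V.end());
}

// Median absolute deviation (the assumed meaning of the MAD column).
double mad(const std::vector<double> &V) {
  double M = median(V);
  std::vector<double> Dev;
  for (double X : V)
    Dev.push_back(std::fabs(X - M));
  return median(Dev);
}

// Absolute and relative change between two aggregated values.
struct Delta { double Abs; double Pct; };

Delta compare(double Prev, double Cur) {
  double D = Cur - Prev;
  return {D, 100.0 * D / Prev};
}
```

For example, `compare(9.8202, 9.8004)` gives a Δ of about -0.0198 and roughly -0.20%, matching the first row of the left half.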
If memory consumption becomes a concern, we can switch from attaching a full set of referenced loops to every SCEV node
to a mirrored tree (really a DAG) built in a skip-list fashion, in which every node holds a set of unique references to its closest AddRec
nodes. getUsedLoops could then traverse, for any SCEV expression, a compressed tree (DAG) containing only AddRecs.
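One possible shape for that alternative, purely as a sketch under assumptions: `Node`, `linkNode` and the free function `getUsedLoops` below are hypothetical and not part of this patch or of ScalarEvolution. Each node keeps links only to its nearest AddRec descendants, and the query hops from AddRec to AddRec.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-ins; not the LLVM Loop/SCEV types.
struct Loop { int Id; };

struct Node {
  const Loop *L = nullptr;                // non-null only for AddRec nodes
  std::vector<const Node *> Operands;
  std::set<const Node *> ClosestAddRecs;  // nearest AddRecs strictly below
  bool isAddRec() const { return L != nullptr; }
};

// Fill ClosestAddRecs for a new node whose operands are already linked.
void linkNode(Node &N) {
  for (const Node *Op : N.Operands) {
    if (Op->isAddRec())
      N.ClosestAddRecs.insert(Op); // stop at the first AddRec on this path
    else
      N.ClosestAddRecs.insert(Op->ClosestAddRecs.begin(),
                              Op->ClosestAddRecs.end());
  }
}

// Collect all loops used by N by walking the AddRec-only DAG.
std::set<const Loop *> getUsedLoops(const Node &N) {
  std::set<const Loop *> Used;
  std::set<const Node *> Seen;
  std::vector<const Node *> Worklist;
  if (N.isAddRec())
    Worklist.push_back(&N);
  else
    Worklist.assign(N.ClosestAddRecs.begin(), N.ClosestAddRecs.end());
  while (!Worklist.empty()) {
    const Node *AR = Worklist.back();
    Worklist.pop_back();
    if (!Seen.insert(AR).second)
      continue; // already visited via another path
    Used.insert(AR->L);
    for (const Node *Next : AR->ClosestAddRecs)
      Worklist.push_back(Next);
  }
  return Used;
}
```

The trade-off in this sketch is that each node stores only links to nearby AddRecs rather than a full loop set, at the cost of a (compressed) traversal on every query.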
Repository:
rL LLVM
https://reviews.llvm.org/D50985
Files:
include/llvm/Analysis/ScalarEvolution.h
lib/Analysis/ScalarEvolution.cpp
Index: lib/Analysis/ScalarEvolution.cpp
===================================================================
--- lib/Analysis/ScalarEvolution.cpp
+++ lib/Analysis/ScalarEvolution.cpp
@@ -11760,6 +11760,7 @@
   ExprValueMap.erase(S);
   HasRecMap.erase(S);
   MinTrailingZerosCache.erase(S);
+  LoopsRefd.erase(S);
   for (auto I = PredicatedSCEVRewrites.begin();
        I != PredicatedSCEVRewrites.end();) {
@@ -11790,24 +11791,33 @@
 ScalarEvolution::getUsedLoops(const SCEV *S,
                               SmallPtrSetImpl<const Loop *> &LoopsUsed) {
   struct FindUsedLoops {
-    FindUsedLoops(SmallPtrSetImpl<const Loop *> &LoopsUsed)
-        : LoopsUsed(LoopsUsed) {}
+    FindUsedLoops(SmallPtrSetImpl<const Loop *> &LoopsUsed, ScalarEvolution &SE)
+        : LoopsUsed(LoopsUsed), SE(SE) {}
     SmallPtrSetImpl<const Loop *> &LoopsUsed;
+    ScalarEvolution &SE;
+
     bool follow(const SCEV *S) {
+      auto It = SE.LoopsRefd.find(S);
+      if (It != SE.LoopsRefd.end() && &It->second != &LoopsUsed) {
+        LoopsUsed.insert(It->second.begin(), It->second.end());
+        return false;
+      }
       if (auto *AR = dyn_cast<SCEVAddRecExpr>(S))
         LoopsUsed.insert(AR->getLoop());
       return true;
     }
     bool isDone() const { return false; }
   };
-  FindUsedLoops F(LoopsUsed);
+  FindUsedLoops F(LoopsUsed, *this);
   SCEVTraversal<FindUsedLoops>(F).visitAll(S);
 }
 void ScalarEvolution::addToLoopUseLists(const SCEV *S) {
-  SmallPtrSet<const Loop *, 8> LoopsUsed;
+  assert(LoopsRefd.find(S) == LoopsRefd.end() &&
+         "addToLoopUseLists should be called exactly once per every new SCEV");
+  SmallPtrSetImpl<const Loop *> &LoopsUsed = LoopsRefd[S];
   getUsedLoops(S, LoopsUsed);
   for (auto *L : LoopsUsed)
     LoopUsers[L].push_back(S);
Index: include/llvm/Analysis/ScalarEvolution.h
===================================================================
--- include/llvm/Analysis/ScalarEvolution.h
+++ include/llvm/Analysis/ScalarEvolution.h
@@ -1856,6 +1856,9 @@
   /// This maps loops to a list of SCEV expressions that (transitively) use said
   /// loop.
   DenseMap<const Loop *, SmallVector<const SCEV *, 4>> LoopUsers;
+  /// The inverse of LoopUsers map; maps a SCEV expression to a set of
+  /// (transitively) referenced Loops.
+  DenseMap<const SCEV *, SmallPtrSet<const Loop *, 4>> LoopsRefd;
   /// Cache tentative mappings from UnknownSCEVs in a Loop, to a SCEV expression
   /// they can be rewritten into under certain predicates.