[llvm-branch-commits] [llvm] 9e46fcc - [DSE, MSSA] Cache accesses with/without reachable read-clobbers.
Florian Hahn via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Thu Aug 13 05:15:12 PDT 2020
Author: Florian Hahn
Date: 2020-08-13T13:10:14+01:00
New Revision: 9e46fcc34c01386e5142e1630ad02e1284c49c67
URL: https://github.com/llvm/llvm-project/commit/9e46fcc34c01386e5142e1630ad02e1284c49c67
DIFF: https://github.com/llvm/llvm-project/commit/9e46fcc34c01386e5142e1630ad02e1284c49c67.diff
LOG: [DSE,MSSA] Cache accesses with/without reachable read-clobbers.
Summary:
Currently we repeatedly check the same uses for read clobbers in some
cases. We can avoid unnecessary checks by keeping track of the memory
accesses we already found read clobbers for. To do so, we just add
memory access causing read-clobbers to a set. Note that marking all
visited accesses as read-clobbers would be to pessimistic, as that might
include accesses not on any path to the actual read clobber.
If we do not find any read-clobbers, we can add all visited instructions
to another set and use that to skip the same accesses in the next call.
I did not yet measure compile-time, but below is the impact on the
number of iterations in getDomMemoryDef:
Metric: dse.NumDomMemDefChecks
Program base patch diff
test-suite...000/183.equake/183.equake.test 132580.00 26961.00 -79.7%
test-suite...T95/147.vortex/147.vortex.test 881946.00 297521.00 -66.3%
test-suite...000/255.vortex/255.vortex.test 882090.00 297594.00 -66.3%
test-suite...T2006/445.gobmk/445.gobmk.test 700940.00 247624.00 -64.7%
test-suite...ications/JM/ldecod/ldecod.test 990956.00 357584.00 -63.9%
test-suite...C/CFP2000/179.art/179.art.test 23014.00 8364.00 -63.7%
test-suite...marks/SciMark2-C/scimark2.test 20939.00 8230.00 -60.7%
test-suite.../CINT2006/403.gcc/403.gcc.test 2412386.00 951674.00 -60.6%
test-suite...006/447.dealII/447.dealII.test 1850445.00 796042.00 -57.0%
test-suite...006/453.povray/453.povray.test 1735262.00 753271.00 -56.6%
test-suite...ProxyApps-C++/CLAMR/CLAMR.test 393888.00 172514.00 -56.2%
test-suite...ProxyApps-C++/HPCCG/HPCCG.test 42350.00 18931.00 -55.3%
test-suite...000/186.crafty/186.crafty.test 371608.00 167669.00 -54.9%
test-suite...langs-C/unix-tbl/unix-tbl.test 10263.00 4763.00 -53.6%
test-suite.../CINT2000/176.gcc/176.gcc.test 1641688.00 763696.00 -53.5%
test-suite...ications/JM/lencod/lencod.test 1459213.00 679907.00 -53.4%
test-suite...FreeBench/distray/distray.test 10477.00 5113.00 -51.2%
test-suite.../Trimaran/enc-md5/enc-md5.test 2651.00 1295.00 -51.2%
test-suite...s/Rodinia/hotspot/hotspot.test 4031.00 1989.00 -50.7%
test-suite...T2006/401.bzip2/401.bzip2.test 171479.00 85496.00 -50.1%
test-suite...lowfish/security-blowfish.test 6217.00 3143.00 -49.4%
test-suite...rks/FreeBench/mason/mason.test 1386.00 712.00 -48.6%
test-suite...yApps-C++/PENNANT/PENNANT.test 135316.00 71201.00 -47.4%
test-suite...ks/McCat/04-bisect/bisect.test 3353.00 1801.00 -46.3%
test-suite...6/464.h264ref/464.h264ref.test 1500143.00 810226.00 -46.0%
test-suite...marks/7zip/7zip-benchmark.test 1278779.00 711387.00 -44.4%
test-suite...lications/viterbi/viterbi.test 11564.00 6497.00 -43.8%
test-suite...plications/d/make_dparser.test 54338.00 31158.00 -42.7%
test-suite...ternal/HMMER/hmmcalibrate.test 75317.00 43258.00 -42.6%
test-suite...lications/ClamAV/clamscan.test 447833.00 258126.00 -42.4%
test-suite...lications/SIBsim4/SIBsim4.test 32896.00 19381.00 -41.1%
test-suite...6/482.sphinx3/482.sphinx3.test 62177.00 37137.00 -40.3%
test-suite...nia/pathfinder/pathfinder.test 1322.00 795.00 -39.9%
test-suite...math/automotive-basicmath.test 146.00 88.00 -39.7%
test-suite...ngs-C/simulator/simulator.test 18187.00 10982.00 -39.6%
test-suite...s/ASC_Sequoia/IRSmk/IRSmk.test 3040.00 1839.00 -39.5%
test-suite...T2006/473.astar/473.astar.test 37795.00 22910.00 -39.4%
test-suite...CFP2006/444.namd/444.namd.test 97627.00 59229.00 -39.3%
test-suite...-typeset/consumer-typeset.test 802759.00 488393.00 -39.2%
test-suite...T2006/456.hmmer/456.hmmer.test 96748.00 58957.00 -39.1%
test-suite...quoia/CrystalMk/CrystalMk.test 9157.00 5669.00 -38.1%
test-suite.../Benchmarks/Olden/mst/mst.test 1054.00 653.00 -38.0%
test-suite :: External/Nurbs/nurbs.test 14763.00 9177.00 -37.8%
test-suite...CFP2000/188.ammp/188.ammp.test 84072.00 53009.00 -36.9%
test-suite...chmarks/MallocBench/gs/gs.test 98221.00 61993.00 -36.9%
test-suite...rks/FreeBench/pifft/pifft.test 15560.00 9888.00 -36.5%
test-suite...ks/McCat/01-qbsort/qbsort.test 497.00 323.00 -35.0%
test-suite...INT2000/164.gzip/164.gzip.test 30328.00 19767.00 -34.8%
test-suite...lications/obsequi/Obsequi.test 44239.00 29195.00 -34.0%
test-suite.../CINT2000/252.eon/252.eon.test 340333.00 224748.00 -34.0%
test-suite.../Prolangs-C++/trees/trees.test 2573.00 1715.00 -33.3%
test-suite...TimberWolfMC/timberwolfmc.test 190315.00 126890.00 -33.3%
test-suite...nch/fourinarow/fourinarow.test 1487.00 994.00 -33.2%
test-suite...SPEC/CINT95/099.go/099.go.test 256287.00 173249.00 -32.4%
test-suite...s-C/Pathfinder/PathFinder.test 8366.00 5708.00 -31.8%
test-suite...5/124.m88ksim/124.m88ksim.test 62116.00 42425.00 -31.7%
test-suite...arks/McCat/17-bintr/bintr.test 154.00 107.00 -30.5%
test-suite...langs-C/football/football.test 17864.00 12416.00 -30.5%
test-suite...lications/minisat/minisat.test 9504.00 6676.00 -29.8%
test-suite...rks/tramp3d-v4/tramp3d-v4.test 1275472.00 896318.00 -29.7%
test-suite...s/FreeBench/neural/neural.test 1534.00 1087.00 -29.1%
test-suite...arks/mafft/pairlocalalign.test 259748.00 185421.00 -28.6%
test-suite...comm-adpcm/telecomm-adpcm.test 53.00 38.00 -28.3%
test-suite...adpcm/rawdaudio/rawdaudio.test 53.00 38.00 -28.3%
test-suite.../Benchmarks/Olden/tsp/tsp.test 2522.00 1834.00 -27.3%
test-suite...:: External/Povray/povray.test 650268.00 476041.00 -26.8%
In some cases this also increase the number of eliminated stores,
because we can explore further. Note that there is a small regression
which I should track down.
Metric: dse.NumFastStores
Program base patch diff
test-suite...T2006/445.gobmk/445.gobmk.test 82.00 123.00 50.0%
test-suite...C/CFP2000/179.art/179.art.test 6.00 7.00 16.7%
test-suite...math/automotive-basicmath.test 7.00 8.00 14.3%
test-suite...ngs-C/assembler/assembler.test 8.00 9.00 12.5%
test-suite...ks/Prolangs-C/gnugo/gnugo.test 9.00 10.00 11.1%
test-suite...INT95/132.ijpeg/132.ijpeg.test 18.00 20.00 11.1%
test-suite...langs-C/football/football.test 10.00 11.00 10.0%
test-suite...ce/Benchmarks/Olden/bh/bh.test 13.00 14.00 7.7%
test-suite...ications/JM/ldecod/ldecod.test 382.00 402.00 5.2%
test-suite...000/183.equake/183.equake.test 40.00 38.00 -5.0%
test-suite...6/482.sphinx3/482.sphinx3.test 22.00 23.00 4.5%
test-suite...T95/147.vortex/147.vortex.test 215.00 224.00 4.2%
test-suite...000/255.vortex/255.vortex.test 217.00 226.00 4.1%
test-suite...SPEC/CINT95/099.go/099.go.test 63.00 65.00 3.2%
test-suite.../Benchmarks/nbench/nbench.test 76.00 78.00 2.6%
test-suite...lications/sqlite3/sqlite3.test 153.00 157.00 2.6%
test-suite...INT2000/164.gzip/164.gzip.test 39.00 40.00 2.6%
test-suite...ications/JM/lencod/lencod.test 840.00 854.00 1.7%
test-suite...marks/7zip/7zip-benchmark.test 1211.00 1231.00 1.7%
test-suite...6/464.h264ref/464.h264ref.test 730.00 741.00 1.5%
test-suite...006/453.povray/453.povray.test 1417.00 1437.00 1.4%
test-suite...lications/ClamAV/clamscan.test 230.00 233.00 1.3%
test-suite.../Applications/SPASS/SPASS.test 156.00 158.00 1.3%
test-suite...0.perlbench/400.perlbench.test 861.00 871.00 1.2%
test-suite.../CINT2000/176.gcc/176.gcc.test 879.00 889.00 1.1%
test-suite...nsumer-lame/consumer-lame.test 100.00 101.00 1.0%
test-suite...:: External/Povray/povray.test 1220.00 1231.00 0.9%
test-suite...ocBench/espresso/espresso.test 115.00 116.00 0.9%
test-suite...chmarks/MallocBench/gs/gs.test 116.00 117.00 0.9%
test-suite...5/124.m88ksim/124.m88ksim.test 116.00 117.00 0.9%
test-suite...CI_Purple/SMG2000/smg2000.test 158.00 159.00 0.6%
test-suite...000/186.crafty/186.crafty.test 158.00 159.00 0.6%
test-suite...0/253.perlbmk/253.perlbmk.test 500.00 503.00 0.6%
test-suite.../CINT2006/403.gcc/403.gcc.test 1178.00 1185.00 0.6%
test-suite...CFP2000/188.ammp/188.ammp.test 181.00 182.00 0.6%
test-suite.../CINT2000/252.eon/252.eon.test 2672.00 2685.00 0.5%
test-suite...006/447.dealII/447.dealII.test 2117.00 2127.00 0.5%
test-suite...-typeset/consumer-typeset.test 1047.00 1051.00 0.4%
test-suite...rks/tramp3d-v4/tramp3d-v4.test 814.00 816.00 0.2%
test-suite...3.xalancbmk/483.xalancbmk.test 1265.00 1267.00 0.2%
test-suite.../Prolangs-C++/vcirc/vcirc.test 11.00 11.00 0.0%
Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea
Subscribers: Prazek, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D75025
Added:
Modified:
llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
Removed:
################################################################################
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index f5f17620e571..e426facc71f0 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -88,6 +88,7 @@ STATISTIC(NumNoopStores, "Number of noop stores deleted");
STATISTIC(NumCFGChecks, "Number of stores modified");
STATISTIC(NumCFGTries, "Number of stores modified");
STATISTIC(NumCFGSuccess, "Number of stores modified");
+STATISTIC(NumDomMemDefChecks, "Number of iterations in getDomMemoryDef");
DEBUG_COUNTER(MemorySSACounter, "dse-memoryssa",
"Controls which MemoryDefs are eliminated.");
@@ -1513,6 +1514,18 @@ struct DSEState {
/// basic block.
DenseMap<BasicBlock *, InstOverlapIntervalsTy> IOLs;
+ struct CheckCache {
+ SmallPtrSet<MemoryAccess *, 16> KnownNoReads;
+ SmallPtrSet<MemoryAccess *, 16> KnownReads;
+
+ bool isKnownNoRead(MemoryAccess *A) const {
+ return KnownNoReads.find(A) != KnownNoReads.end();
+ }
+ bool isKnownRead(MemoryAccess *A) const {
+ return KnownReads.find(A) != KnownReads.end();
+ }
+ };
+
DSEState(Function &F, AliasAnalysis &AA, MemorySSA &MSSA, DominatorTree &DT,
PostDominatorTree &PDT, const TargetLibraryInfo &TLI)
: F(F), AA(AA), MSSA(MSSA), DT(DT), PDT(PDT), TLI(TLI) {}
@@ -1743,7 +1756,8 @@ struct DSEState {
Optional<MemoryAccess *>
getDomMemoryDef(MemoryDef *KillingDef, MemoryAccess *Current,
MemoryLocation DefLoc, bool DefVisibleToCallerBeforeRet,
- bool DefVisibleToCallerAfterRet, int &ScanLimit) const {
+ bool DefVisibleToCallerAfterRet, CheckCache &Cache,
+ int &ScanLimit) const {
MemoryAccess *DomAccess;
bool StepAgain;
LLVM_DEBUG(dbgs() << " trying to get dominating access for " << *Current
@@ -1798,16 +1812,32 @@ struct DSEState {
};
PushMemUses(DomAccess);
+ // Optimistically collect all accesses we for reads. If we do not find any
+ // read clobbers, add them to the cache.
+ SmallPtrSet<MemoryAccess *, 16> KnownNoReads;
// Check if DomDef may be read.
for (unsigned I = 0; I < WorkList.size(); I++) {
MemoryAccess *UseAccess = WorkList[I];
-
- LLVM_DEBUG(dbgs() << " " << *UseAccess);
+ NumDomMemDefChecks++;
+ LLVM_DEBUG(dbgs() << " Checking use " << *UseAccess);
if (--ScanLimit == 0) {
LLVM_DEBUG(dbgs() << "\n ... hit scan limit\n");
return None;
}
+ // Check if we already visited this access.
+ if (Cache.isKnownNoRead(UseAccess)) {
+ LLVM_DEBUG(dbgs() << " ... skip, discovered that " << *UseAccess
+ << " is safe earlier.\n");
+ continue;
+ }
+ if (Cache.isKnownRead(UseAccess)) {
+ LLVM_DEBUG(dbgs() << " ... bail out, discovered that " << *UseAccess
+ << " is has a read-clobber earlier.\n");
+ return None;
+ }
+ KnownNoReads.insert(UseAccess);
+
if (isa<MemoryPhi>(UseAccess)) {
LLVM_DEBUG(dbgs() << "\n ... adding PHI uses\n");
PushMemUses(UseAccess);
@@ -1831,7 +1861,9 @@ struct DSEState {
// Uses which may read the original MemoryDef mean we cannot eliminate the
// original MD. Stop walk.
if (isReadClobber(DefLoc, UseInst)) {
- LLVM_DEBUG(dbgs() << " ... found read clobber\n");
+ LLVM_DEBUG(dbgs() << " ... found read clobber\n");
+ Cache.KnownReads.insert(UseAccess);
+ Cache.KnownReads.insert(Current);
return None;
}
@@ -1944,6 +1976,7 @@ struct DSEState {
return None;
}
+ Cache.KnownNoReads.insert(KnownNoReads.begin(), KnownNoReads.end());
// No aliasing MemoryUses of DomAccess found, DomAccess is potentially dead.
return {DomAccess};
}
@@ -2159,6 +2192,7 @@ bool eliminateDeadStoresMemorySSA(Function &F, AliasAnalysis &AA,
SetVector<MemoryAccess *> ToCheck;
ToCheck.insert(KillingDef->getDefiningAccess());
+ DSEState::CheckCache Cache;
// Check if MemoryAccesses in the worklist are killed by KillingDef.
for (unsigned I = 0; I < ToCheck.size(); I++) {
Current = ToCheck[I];
@@ -2167,7 +2201,7 @@ bool eliminateDeadStoresMemorySSA(Function &F, AliasAnalysis &AA,
Optional<MemoryAccess *> Next = State.getDomMemoryDef(
KillingDef, Current, SILoc, DefVisibleToCallerBeforeRet,
- DefVisibleToCallerAfterRet, ScanLimit);
+ DefVisibleToCallerAfterRet, Cache, ScanLimit);
if (!Next) {
LLVM_DEBUG(dbgs() << " finished walk\n");
More information about the llvm-branch-commits
mailing list