[llvm-branch-commits] [llvm] 9e46fcc - [DSE, MSSA] Cache accesses with/without reachable read-clobbers.

Florian Hahn via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Thu Aug 13 05:15:12 PDT 2020


Author: Florian Hahn
Date: 2020-08-13T13:10:14+01:00
New Revision: 9e46fcc34c01386e5142e1630ad02e1284c49c67

URL: https://github.com/llvm/llvm-project/commit/9e46fcc34c01386e5142e1630ad02e1284c49c67
DIFF: https://github.com/llvm/llvm-project/commit/9e46fcc34c01386e5142e1630ad02e1284c49c67.diff

LOG: [DSE,MSSA] Cache accesses with/without reachable read-clobbers.

Summary:
Currently we repeatedly check the same uses for read clobbers in some
cases. We can avoid unnecessary checks by keeping track of the memory
accesses we already found read clobbers for. To do so, we just add the
memory accesses causing read-clobbers to a set. Note that marking all
visited accesses as read-clobbers would be too pessimistic, as that might
include accesses not on any path to the actual read clobber.

If we do not find any read-clobbers, we can add all visited instructions
to another set and use that to skip the same accesses in the next call.
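
The two-set scheme described above can be sketched outside of LLVM as
follows. This is only an illustration under simplifying assumptions, not
the code from the patch: Access, noReachableRead, the MayRead flag and the
Expensive counter are hypothetical names, accesses are plain indices into
a vector, and "may read" is a stored flag rather than an alias query.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a MemorySSA access: the indices of its users
// and a flag saying whether it may read the location of interest.
struct Access {
  std::vector<int> Users;
  bool MayRead = false;
};

// Same two sets as the patch's CheckCache, keyed by index in this sketch.
struct CheckCache {
  std::unordered_set<int> KnownNoReads;
  std::unordered_set<int> KnownReads;
};

// Returns true if no access reachable from Start through Users may read.
// Expensive counts the accesses that needed the (simulated) costly check.
bool noReachableRead(const std::vector<Access> &Graph, int Start,
                     CheckCache &Cache, int &Expensive) {
  assert(Start >= 0 && static_cast<std::size_t>(Start) < Graph.size());
  std::vector<int> WorkList(Graph[Start].Users);
  std::unordered_set<int> Visited; // promoted to the cache only on success

  for (std::size_t I = 0; I < WorkList.size(); ++I) {
    int Use = WorkList[I];
    if (Cache.KnownNoReads.count(Use))
      continue; // proven safe by an earlier walk
    if (Cache.KnownReads.count(Use))
      return false; // an earlier walk found a read at or via this access
    if (!Visited.insert(Use).second)
      continue; // already handled during this walk
    ++Expensive;
    if (Graph[Use].MayRead) {
      // Record only the reader and the starting access. The other visited
      // accesses may not lie on any path to the reader.
      Cache.KnownReads.insert(Use);
      Cache.KnownReads.insert(Start);
      return false;
    }
    for (int U : Graph[Use].Users)
      WorkList.push_back(U);
  }

  // No read found: everything visited in this walk is safe to skip later.
  Cache.KnownNoReads.insert(Visited.begin(), Visited.end());
  return true;
}
```

Sharing one CheckCache across several walks lets a later walk skip any
users an earlier successful walk already covered, and bail out at once
when it reaches an access recorded in KnownReads.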

I have not yet measured compile-time, but below is the impact on the
number of iterations in getDomMemoryDef:

Metric: dse.NumDomMemDefChecks

Program                                        base       patch      diff
 test-suite...000/183.equake/183.equake.test   132580.00  26961.00  -79.7%
 test-suite...T95/147.vortex/147.vortex.test   881946.00  297521.00 -66.3%
 test-suite...000/255.vortex/255.vortex.test   882090.00  297594.00 -66.3%
 test-suite...T2006/445.gobmk/445.gobmk.test   700940.00  247624.00 -64.7%
 test-suite...ications/JM/ldecod/ldecod.test   990956.00  357584.00 -63.9%
 test-suite...C/CFP2000/179.art/179.art.test   23014.00   8364.00   -63.7%
 test-suite...marks/SciMark2-C/scimark2.test   20939.00   8230.00   -60.7%
 test-suite.../CINT2006/403.gcc/403.gcc.test   2412386.00 951674.00 -60.6%
 test-suite...006/447.dealII/447.dealII.test   1850445.00 796042.00 -57.0%
 test-suite...006/453.povray/453.povray.test   1735262.00 753271.00 -56.6%
 test-suite...ProxyApps-C++/CLAMR/CLAMR.test   393888.00  172514.00 -56.2%
 test-suite...ProxyApps-C++/HPCCG/HPCCG.test   42350.00   18931.00  -55.3%
 test-suite...000/186.crafty/186.crafty.test   371608.00  167669.00 -54.9%
 test-suite...langs-C/unix-tbl/unix-tbl.test   10263.00   4763.00   -53.6%
 test-suite.../CINT2000/176.gcc/176.gcc.test   1641688.00 763696.00 -53.5%
 test-suite...ications/JM/lencod/lencod.test   1459213.00 679907.00 -53.4%
 test-suite...FreeBench/distray/distray.test   10477.00   5113.00   -51.2%
 test-suite.../Trimaran/enc-md5/enc-md5.test   2651.00    1295.00   -51.2%
 test-suite...s/Rodinia/hotspot/hotspot.test   4031.00    1989.00   -50.7%
 test-suite...T2006/401.bzip2/401.bzip2.test   171479.00  85496.00  -50.1%
 test-suite...lowfish/security-blowfish.test   6217.00    3143.00   -49.4%
 test-suite...rks/FreeBench/mason/mason.test   1386.00    712.00    -48.6%
 test-suite...yApps-C++/PENNANT/PENNANT.test   135316.00  71201.00  -47.4%
 test-suite...ks/McCat/04-bisect/bisect.test   3353.00    1801.00   -46.3%
 test-suite...6/464.h264ref/464.h264ref.test   1500143.00 810226.00 -46.0%
 test-suite...marks/7zip/7zip-benchmark.test   1278779.00 711387.00 -44.4%
 test-suite...lications/viterbi/viterbi.test   11564.00   6497.00   -43.8%
 test-suite...plications/d/make_dparser.test   54338.00   31158.00  -42.7%
 test-suite...ternal/HMMER/hmmcalibrate.test   75317.00   43258.00  -42.6%
 test-suite...lications/ClamAV/clamscan.test   447833.00  258126.00 -42.4%
 test-suite...lications/SIBsim4/SIBsim4.test   32896.00   19381.00  -41.1%
 test-suite...6/482.sphinx3/482.sphinx3.test   62177.00   37137.00  -40.3%
 test-suite...nia/pathfinder/pathfinder.test   1322.00    795.00    -39.9%
 test-suite...math/automotive-basicmath.test   146.00     88.00     -39.7%
 test-suite...ngs-C/simulator/simulator.test   18187.00   10982.00  -39.6%
 test-suite...s/ASC_Sequoia/IRSmk/IRSmk.test   3040.00    1839.00   -39.5%
 test-suite...T2006/473.astar/473.astar.test   37795.00   22910.00  -39.4%
 test-suite...CFP2006/444.namd/444.namd.test   97627.00   59229.00  -39.3%
 test-suite...-typeset/consumer-typeset.test   802759.00  488393.00 -39.2%
 test-suite...T2006/456.hmmer/456.hmmer.test   96748.00   58957.00  -39.1%
 test-suite...quoia/CrystalMk/CrystalMk.test   9157.00    5669.00   -38.1%
 test-suite.../Benchmarks/Olden/mst/mst.test   1054.00    653.00    -38.0%
 test-suite :: External/Nurbs/nurbs.test       14763.00   9177.00   -37.8%
 test-suite...CFP2000/188.ammp/188.ammp.test   84072.00   53009.00  -36.9%
 test-suite...chmarks/MallocBench/gs/gs.test   98221.00   61993.00  -36.9%
 test-suite...rks/FreeBench/pifft/pifft.test   15560.00   9888.00   -36.5%
 test-suite...ks/McCat/01-qbsort/qbsort.test   497.00     323.00    -35.0%
 test-suite...INT2000/164.gzip/164.gzip.test   30328.00   19767.00  -34.8%
 test-suite...lications/obsequi/Obsequi.test   44239.00   29195.00  -34.0%
 test-suite.../CINT2000/252.eon/252.eon.test   340333.00  224748.00 -34.0%
 test-suite.../Prolangs-C++/trees/trees.test   2573.00    1715.00   -33.3%
 test-suite...TimberWolfMC/timberwolfmc.test   190315.00  126890.00 -33.3%
 test-suite...nch/fourinarow/fourinarow.test   1487.00    994.00    -33.2%
 test-suite...SPEC/CINT95/099.go/099.go.test   256287.00  173249.00 -32.4%
 test-suite...s-C/Pathfinder/PathFinder.test   8366.00    5708.00   -31.8%
 test-suite...5/124.m88ksim/124.m88ksim.test   62116.00   42425.00  -31.7%
 test-suite...arks/McCat/17-bintr/bintr.test   154.00     107.00    -30.5%
 test-suite...langs-C/football/football.test   17864.00   12416.00  -30.5%
 test-suite...lications/minisat/minisat.test   9504.00    6676.00   -29.8%
 test-suite...rks/tramp3d-v4/tramp3d-v4.test   1275472.00 896318.00 -29.7%
 test-suite...s/FreeBench/neural/neural.test   1534.00    1087.00   -29.1%
 test-suite...arks/mafft/pairlocalalign.test   259748.00  185421.00 -28.6%
 test-suite...comm-adpcm/telecomm-adpcm.test   53.00      38.00     -28.3%
 test-suite...adpcm/rawdaudio/rawdaudio.test   53.00      38.00     -28.3%
 test-suite.../Benchmarks/Olden/tsp/tsp.test   2522.00    1834.00   -27.3%
 test-suite...:: External/Povray/povray.test   650268.00  476041.00 -26.8%

In some cases this also increases the number of eliminated stores,
because we can explore further. Note that there is a small regression
(183.equake, 40 to 38 stores) which I should track down.

Metric: dse.NumFastStores

Program                                        base    patch   diff
 test-suite...T2006/445.gobmk/445.gobmk.test    82.00  123.00  50.0%
 test-suite...C/CFP2000/179.art/179.art.test     6.00    7.00  16.7%
 test-suite...math/automotive-basicmath.test     7.00    8.00  14.3%
 test-suite...ngs-C/assembler/assembler.test     8.00    9.00  12.5%
 test-suite...ks/Prolangs-C/gnugo/gnugo.test     9.00   10.00  11.1%
 test-suite...INT95/132.ijpeg/132.ijpeg.test    18.00   20.00  11.1%
 test-suite...langs-C/football/football.test    10.00   11.00  10.0%
 test-suite...ce/Benchmarks/Olden/bh/bh.test    13.00   14.00   7.7%
 test-suite...ications/JM/ldecod/ldecod.test   382.00  402.00   5.2%
 test-suite...000/183.equake/183.equake.test    40.00   38.00  -5.0%
 test-suite...6/482.sphinx3/482.sphinx3.test    22.00   23.00   4.5%
 test-suite...T95/147.vortex/147.vortex.test   215.00  224.00   4.2%
 test-suite...000/255.vortex/255.vortex.test   217.00  226.00   4.1%
 test-suite...SPEC/CINT95/099.go/099.go.test    63.00   65.00   3.2%
 test-suite.../Benchmarks/nbench/nbench.test    76.00   78.00   2.6%
 test-suite...lications/sqlite3/sqlite3.test   153.00  157.00   2.6%
 test-suite...INT2000/164.gzip/164.gzip.test    39.00   40.00   2.6%
 test-suite...ications/JM/lencod/lencod.test   840.00  854.00   1.7%
 test-suite...marks/7zip/7zip-benchmark.test   1211.00 1231.00  1.7%
 test-suite...6/464.h264ref/464.h264ref.test   730.00  741.00   1.5%
 test-suite...006/453.povray/453.povray.test   1417.00 1437.00  1.4%
 test-suite...lications/ClamAV/clamscan.test   230.00  233.00   1.3%
 test-suite.../Applications/SPASS/SPASS.test   156.00  158.00   1.3%
 test-suite...0.perlbench/400.perlbench.test   861.00  871.00   1.2%
 test-suite.../CINT2000/176.gcc/176.gcc.test   879.00  889.00   1.1%
 test-suite...nsumer-lame/consumer-lame.test   100.00  101.00   1.0%
 test-suite...:: External/Povray/povray.test   1220.00 1231.00  0.9%
 test-suite...ocBench/espresso/espresso.test   115.00  116.00   0.9%
 test-suite...chmarks/MallocBench/gs/gs.test   116.00  117.00   0.9%
 test-suite...5/124.m88ksim/124.m88ksim.test   116.00  117.00   0.9%
 test-suite...CI_Purple/SMG2000/smg2000.test   158.00  159.00   0.6%
 test-suite...000/186.crafty/186.crafty.test   158.00  159.00   0.6%
 test-suite...0/253.perlbmk/253.perlbmk.test   500.00  503.00   0.6%
 test-suite.../CINT2006/403.gcc/403.gcc.test   1178.00 1185.00  0.6%
 test-suite...CFP2000/188.ammp/188.ammp.test   181.00  182.00   0.6%
 test-suite.../CINT2000/252.eon/252.eon.test   2672.00 2685.00  0.5%
 test-suite...006/447.dealII/447.dealII.test   2117.00 2127.00  0.5%
 test-suite...-typeset/consumer-typeset.test   1047.00 1051.00  0.4%
 test-suite...rks/tramp3d-v4/tramp3d-v4.test   814.00  816.00   0.2%
 test-suite...3.xalancbmk/483.xalancbmk.test   1265.00 1267.00  0.2%
 test-suite.../Prolangs-C++/vcirc/vcirc.test    11.00   11.00   0.0%

Reviewers: dmgreen, rnk, efriedma, bryant, asbirlea

Subscribers: Prazek, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D75025

Added: 
    

Modified: 
    llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index f5f17620e571..e426facc71f0 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -88,6 +88,7 @@ STATISTIC(NumNoopStores, "Number of noop stores deleted");
 STATISTIC(NumCFGChecks, "Number of stores modified");
 STATISTIC(NumCFGTries, "Number of stores modified");
 STATISTIC(NumCFGSuccess, "Number of stores modified");
+STATISTIC(NumDomMemDefChecks, "Number of iterations in getDomMemoryDef");
 
 DEBUG_COUNTER(MemorySSACounter, "dse-memoryssa",
               "Controls which MemoryDefs are eliminated.");
@@ -1513,6 +1514,18 @@ struct DSEState {
   /// basic block.
   DenseMap<BasicBlock *, InstOverlapIntervalsTy> IOLs;
 
+  struct CheckCache {
+    SmallPtrSet<MemoryAccess *, 16> KnownNoReads;
+    SmallPtrSet<MemoryAccess *, 16> KnownReads;
+
+    bool isKnownNoRead(MemoryAccess *A) const {
+      return KnownNoReads.find(A) != KnownNoReads.end();
+    }
+    bool isKnownRead(MemoryAccess *A) const {
+      return KnownReads.find(A) != KnownReads.end();
+    }
+  };
+
   DSEState(Function &F, AliasAnalysis &AA, MemorySSA &MSSA, DominatorTree &DT,
            PostDominatorTree &PDT, const TargetLibraryInfo &TLI)
       : F(F), AA(AA), MSSA(MSSA), DT(DT), PDT(PDT), TLI(TLI) {}
@@ -1743,7 +1756,8 @@ struct DSEState {
   Optional<MemoryAccess *>
   getDomMemoryDef(MemoryDef *KillingDef, MemoryAccess *Current,
                   MemoryLocation DefLoc, bool DefVisibleToCallerBeforeRet,
-                  bool DefVisibleToCallerAfterRet, int &ScanLimit) const {
+                  bool DefVisibleToCallerAfterRet, CheckCache &Cache,
+                  int &ScanLimit) const {
     MemoryAccess *DomAccess;
     bool StepAgain;
     LLVM_DEBUG(dbgs() << "  trying to get dominating access for " << *Current
@@ -1798,16 +1812,32 @@ struct DSEState {
     };
     PushMemUses(DomAccess);
 
+    // Optimistically collect all accesses we for reads. If we do not find any
+    // read clobbers, add them to the cache.
+    SmallPtrSet<MemoryAccess *, 16> KnownNoReads;
     // Check if DomDef may be read.
     for (unsigned I = 0; I < WorkList.size(); I++) {
       MemoryAccess *UseAccess = WorkList[I];
-
-      LLVM_DEBUG(dbgs() << "   " << *UseAccess);
+      NumDomMemDefChecks++;
+      LLVM_DEBUG(dbgs() << "   Checking use " << *UseAccess);
       if (--ScanLimit == 0) {
         LLVM_DEBUG(dbgs() << "\n    ...  hit scan limit\n");
         return None;
       }
 
+      // Check if we already visited this access.
+      if (Cache.isKnownNoRead(UseAccess)) {
+        LLVM_DEBUG(dbgs() << " ... skip, discovered that " << *UseAccess
+                          << " is safe earlier.\n");
+        continue;
+      }
+      if (Cache.isKnownRead(UseAccess)) {
+        LLVM_DEBUG(dbgs() << " ... bail out, discovered that " << *UseAccess
+                          << " is has a read-clobber earlier.\n");
+        return None;
+      }
+      KnownNoReads.insert(UseAccess);
+
       if (isa<MemoryPhi>(UseAccess)) {
         LLVM_DEBUG(dbgs() << "\n    ... adding PHI uses\n");
         PushMemUses(UseAccess);
@@ -1831,7 +1861,9 @@ struct DSEState {
       // Uses which may read the original MemoryDef mean we cannot eliminate the
       // original MD. Stop walk.
       if (isReadClobber(DefLoc, UseInst)) {
-        LLVM_DEBUG(dbgs() << "    ... found read clobber\n");
+        LLVM_DEBUG(dbgs() << "  ... found read clobber\n");
+        Cache.KnownReads.insert(UseAccess);
+        Cache.KnownReads.insert(Current);
         return None;
       }
 
@@ -1944,6 +1976,7 @@ struct DSEState {
       return None;
     }
 
+    Cache.KnownNoReads.insert(KnownNoReads.begin(), KnownNoReads.end());
     // No aliasing MemoryUses of DomAccess found, DomAccess is potentially dead.
     return {DomAccess};
   }
@@ -2159,6 +2192,7 @@ bool eliminateDeadStoresMemorySSA(Function &F, AliasAnalysis &AA,
     SetVector<MemoryAccess *> ToCheck;
     ToCheck.insert(KillingDef->getDefiningAccess());
 
+    DSEState::CheckCache Cache;
     // Check if MemoryAccesses in the worklist are killed by KillingDef.
     for (unsigned I = 0; I < ToCheck.size(); I++) {
       Current = ToCheck[I];
@@ -2167,7 +2201,7 @@ bool eliminateDeadStoresMemorySSA(Function &F, AliasAnalysis &AA,
 
       Optional<MemoryAccess *> Next = State.getDomMemoryDef(
           KillingDef, Current, SILoc, DefVisibleToCallerBeforeRet,
-          DefVisibleToCallerAfterRet, ScanLimit);
+          DefVisibleToCallerAfterRet, Cache, ScanLimit);
 
       if (!Next) {
         LLVM_DEBUG(dbgs() << "  finished walk\n");


        

