[llvm] r347204 - [llvm-exegesis] (+final perf overview) InstructionBenchmarkClustering::rangeQuery(): reserve for the upper bound of Neighbors

Roman Lebedev via llvm-commits llvm-commits at lists.llvm.org
Mon Nov 19 05:28:41 PST 2018


Author: lebedevri
Date: Mon Nov 19 05:28:41 2018
New Revision: 347204

URL: http://llvm.org/viewvc/llvm-project?rev=347204&view=rev
Log:
[llvm-exegesis] (+final perf overview) InstructionBenchmarkClustering::rangeQuery(): reserve for the upper bound of Neighbors

Summary:
As it was pointed out in D54388+D54390, the maximal size of `Neighbors` is known,
it will contain at most Points_.size() minus one (the center of the cluster)

While that is the upper bound, meaning in the most cases, the actual count
will be much smaller, since D54390 made the allocation persistent,
we no longer have to worry about overly-optimistically `reserve()`ing.

Old: (D54393)
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6553.167456      task-clock (msec)         #    1.000 CPUs utilized            ( +-  0.21% )
...
            6.5547 +- 0.0134 seconds time elapsed  ( +-  0.20% )
```
New:
```
 Performance counter stats for './bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html' (16 runs):

       6315.057872      task-clock (msec)         #    0.999 CPUs utilized            ( +-  0.24% )
...
            6.3187 +- 0.0160 seconds time elapsed  ( +-  0.25% )
```
And that is another -~4%.


Since this is the last (as of this moment) patch in this patch series,
it is a good time to summarize:
Old: (svn trunk, as stated in D54381)
```
$ time ./bin/llvm-exegesis -mode=analysis -analysis-epsilon=100000 -benchmarks-file=/tmp/benchmarks.yaml -analysis-inconsistencies-output-file=/tmp/clusters.html &> /dev/null

real    0m24.884s
user    0m24.099s
sys     0m0.785s
```
So these patches, on a given benchmark,
has decreased llvm-exegesis analysis time by 74.62%.

There surely is more room for further improvements.
D54514 may improve thins by -11.5% more (relative to this patch).
Parallelization may improve things further significantly, too.


Reviewers: courbet, MaskRay, RKSimon, gchatelet, john.brawn

Reviewed By: courbet, MaskRay

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D54415

Modified:
    llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp
    llvm/trunk/tools/llvm-exegesis/lib/Clustering.h

Modified: llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp?rev=347204&r1=347203&r2=347204&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/Clustering.cpp Mon Nov 19 05:28:41 2018
@@ -34,8 +34,9 @@ namespace exegesis {
 // Finds the points at distance less than sqrt(EpsilonSquared) of Q (not
 // including Q).
 void InstructionBenchmarkClustering::rangeQuery(
-    const size_t Q, llvm::SmallVectorImpl<size_t> &Neighbors) const {
+    const size_t Q, std::vector<size_t> &Neighbors) const {
   Neighbors.clear();
+  Neighbors.reserve(Points_.size() - 1); // The Q itself isn't a neighbor.
   const auto &QMeasurements = Points_[Q].Measurements;
   for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) {
     if (P == Q)
@@ -91,7 +92,7 @@ llvm::Error InstructionBenchmarkClusteri
 }
 
 void InstructionBenchmarkClustering::dbScan(const size_t MinPts) {
-  llvm::SmallVector<size_t, 0> Neighbors; // Persistent buffer to avoid allocs.
+  std::vector<size_t> Neighbors; // Persistent buffer to avoid allocs.
   for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) {
     if (!ClusterIdForPoint_[P].isUndef())
       continue; // Previously processed in inner loop.
@@ -136,6 +137,8 @@ void InstructionBenchmarkClustering::dbS
       }
     }
   }
+  // assert(Neighbors.capacity() == (Points_.size() - 1));
+  // ^ True, but it is not quaranteed to be true in all the cases.
 
   // Add noisy points to noise cluster.
   for (size_t P = 0, NumPoints = Points_.size(); P < NumPoints; ++P) {

Modified: llvm/trunk/tools/llvm-exegesis/lib/Clustering.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-exegesis/lib/Clustering.h?rev=347204&r1=347203&r2=347204&view=diff
==============================================================================
--- llvm/trunk/tools/llvm-exegesis/lib/Clustering.h (original)
+++ llvm/trunk/tools/llvm-exegesis/lib/Clustering.h Mon Nov 19 05:28:41 2018
@@ -104,7 +104,7 @@ private:
       const std::vector<InstructionBenchmark> &Points, double EpsilonSquared);
   llvm::Error validateAndSetup();
   void dbScan(size_t MinPts);
-  void rangeQuery(size_t Q, llvm::SmallVectorImpl<size_t> &Scratchpad) const;
+  void rangeQuery(size_t Q, std::vector<size_t> &Scratchpad) const;
 
   const std::vector<InstructionBenchmark> &Points_;
   const double EpsilonSquared_;




More information about the llvm-commits mailing list