[PATCH] D59820: [llvm-exegesis] Introduce a 'naive' clustering algorithm (PR40880)

Clement Courbet via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 27 02:57:28 PDT 2019


courbet added inline comments.


================
Comment at: tools/llvm-exegesis/lib/Clustering.cpp:57
+// Given a set of points, checks that all the points are neighbours.
+bool InstructionBenchmarkClustering::areAllNeighbours(
+    ArrayRef<size_t> Pts) const {
----------------
courbet wrote:
> This is O(N^2).  You can do it in O(N): compute the cluster centroid (O(N)), then compute distance from each point to centroid (O(N)).
> 
> This relies on the fact that if there exists `p` and `q` such that `d(p,q) > e`, then either `d(p, centroid) > e/2` or `d(q, centroid) > e/2.
> 
> Proof (ad absurdum):
> Assume both `d(p, centroid) <= e/2` and `d(q, centroid) <= e/2`. Then:
> 
> ```
> d(p, centroid)  + d(q, centroid) <= e
> ```
> 
> By symmetry:
> 
> ```
> d(p, centroid)  + d(centroid, q) <= e
> ```
> 
> By [[ https://en.wikipedia.org/wiki/Triangle_inequality#Metric_space | triangle inequality ]]:
> 
> 
> ```
> d (p, q) <= d(p, centroid)  + d(q, centroid) <= e
> ```
> 
> 
> 
Oops, I just realized that I proved the opposite direction of what I wanted. Two options:

  - We consider that this criterion is as good as the other one to decide that we should reject the cluster, or
  - We take another approach such as computing the bounding box of the cluster (O(N)), then compare its diagonal (distance between the two extremal points, O(1)) to the rejection threshold.




Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59820/new/

https://reviews.llvm.org/D59820





More information about the llvm-commits mailing list