[llvm] [ADT] Restore handwritten vector find in SmallSet (PR #110254)

Victor Campos via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 27 09:04:19 PDT 2024


https://github.com/vhscampos updated https://github.com/llvm/llvm-project/pull/110254

>From 847a5d7e9e51d2d3f480217db5fcbed60a9d9a8a Mon Sep 17 00:00:00 2001
From: Victor Campos <victor.campos at arm.com>
Date: Fri, 27 Sep 2024 11:18:13 +0100
Subject: [PATCH 1/2] [ADT] Restore handwritten vector find in SmallSet

This patch restores handwritten linear searches instead of the use of
std::find.

After PR #109412, a performance regression was observed that's caused by
the use of std::find for linear searches.

The exact cause wasn't pinpointed, but, at the time of writing, the most
likely culprit is the forced loop unrolling in the definition of
libstdc++'s std::find. Presumably this is done to optimise for larger
containers.

However for the case of small containers such as SmallVector, this
actually hurts performance.
---
 llvm/include/llvm/ADT/SmallSet.h | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/llvm/include/llvm/ADT/SmallSet.h b/llvm/include/llvm/ADT/SmallSet.h
index 97ea4c642bf556..3bc101882dab79 100644
--- a/llvm/include/llvm/ADT/SmallSet.h
+++ b/llvm/include/llvm/ADT/SmallSet.h
@@ -193,7 +193,7 @@ class SmallSet {
   bool erase(const T &V) {
     if (!isSmall())
       return Set.erase(V);
-    auto I = std::find(Vector.begin(), Vector.end(), V);
+    auto I = vfind(V);
     if (I != Vector.end()) {
       Vector.erase(I);
       return true;
@@ -221,7 +221,7 @@ class SmallSet {
   /// Check if the SmallSet contains the given element.
   bool contains(const T &V) const {
     if (isSmall())
-      return std::find(Vector.begin(), Vector.end(), V) != Vector.end();
+      return vfind(V) != Vector.end();
     return Set.find(V) != Set.end();
   }
 
@@ -237,20 +237,26 @@ class SmallSet {
       return {const_iterator(I), Inserted};
     }
 
-    auto I = std::find(Vector.begin(), Vector.end(), V);
+    auto I = vfind(V);
     if (I != Vector.end()) // Don't reinsert if it already exists.
       return {const_iterator(I), false};
     if (Vector.size() < N) {
       Vector.push_back(std::forward<ArgType>(V));
       return {const_iterator(std::prev(Vector.end())), true};
     }
-
     // Otherwise, grow from vector to set.
     Set.insert(std::make_move_iterator(Vector.begin()),
                std::make_move_iterator(Vector.end()));
     Vector.clear();
     return {const_iterator(Set.insert(std::forward<ArgType>(V)).first), true};
   }
+
+  auto vfind(const T &V) const {
+    for (auto I = Vector.begin(), E = Vector.end(); I != E; ++I)
+      if (*I == V)
+        return I;
+    return Vector.end();
+  }
 };
 
 /// If this set is of pointer values, transparently switch over to using

>From ef508a26f524acad79efcca9db248d6c4e0af995 Mon Sep 17 00:00:00 2001
From: Victor Campos <victor.campos at arm.com>
Date: Fri, 27 Sep 2024 15:06:26 +0100
Subject: [PATCH 2/2] Changes:

- Add inline comment.
- Spellout `vfind` return type for clarity.
---
 llvm/include/llvm/ADT/SmallSet.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/llvm/include/llvm/ADT/SmallSet.h b/llvm/include/llvm/ADT/SmallSet.h
index 3bc101882dab79..ed3c6bfd3418d7 100644
--- a/llvm/include/llvm/ADT/SmallSet.h
+++ b/llvm/include/llvm/ADT/SmallSet.h
@@ -251,7 +251,9 @@ class SmallSet {
     return {const_iterator(Set.insert(std::forward<ArgType>(V)).first), true};
   }
 
-  auto vfind(const T &V) const {
+  // Handwritten linear search. The use of std::find might hurt performance as
+  // its implementation may be optimized for larger containers.
+  typename SmallVector<T, N>::const_iterator vfind(const T &V) const {
     for (auto I = Vector.begin(), E = Vector.end(); I != E; ++I)
       if (*I == V)
         return I;



More information about the llvm-commits mailing list