[PATCH] D92788: [clangd] NFC: Use SmallVector<T> where possible

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Dec 8 02:29:26 PST 2020


sammccall added a comment.

Thanks for cleaning up!

In D92788#2438207 <https://reviews.llvm.org/D92788#2438207>, @njames93 wrote:

> Not sure I'm a huge fan of this, Some of these cases the size is specified because that's the upper limit we typically expect the SmallVector to use.

The idea behind the new LLVM-wide recommendation is that in principle this is true but in practice it's mostly false (at least, these estimates often aren't chosen carefully).

This seems true to me - we do try to guess a sensible value (when forced to choose something), but we probably shouldn't be specifying these where we wouldn't choose to e.g. `reserve()` in a std::vector.
There's some readability/maintenance cost and in most cases no performance difference at the level we typically care about.

I've flagged cases where it seems like it might matter - please do the same if you find any!



================
Comment at: clang-tools-extra/clangd/CompileCommands.cpp:334
   using TableTy =
-      llvm::StringMap<llvm::SmallVector<Rule, 4>, llvm::BumpPtrAllocator>;
+      llvm::StringMap<llvm::SmallVector<Rule>, llvm::BumpPtrAllocator>;
   static TableTy *Table = [] {
----------------
This is significant in size (around half a megabyte per the comment on ArgStripper). Rule is 24 bytes, so we're reducing N from 4 to 2 I believe.
Please leave this alone or check that it's a reduction in memory usage (if memory usage is roughly equal, fewer heap allocations is better).


================
Comment at: clang-tools-extra/clangd/Selection.cpp:784
 // Always returns at least one range - if no tokens touched, and empty range.
-static llvm::SmallVector<std::pair<unsigned, unsigned>, 2>
+static llvm::SmallVector<std::pair<unsigned, unsigned>>
 pointBounds(unsigned Offset, const syntax::TokenBuffer &Tokens) {
----------------
I'd prefer to keep the 2 here for readability, it's a nice hint/reminder that we *always* return 1 or 2 values


================
Comment at: clang-tools-extra/clangd/TUScheduler.cpp:1414
   History = History.take_back(15);
-  llvm::SmallVector<clock::duration, 15> Recent(History.begin(), History.end());
+  llvm::SmallVector<clock::duration> Recent(History.begin(), History.end());
   auto Median = Recent.begin() + Recent.size() / 2;
----------------
This *always* allocates and the previous version *never* allocates, please revert this one.


================
Comment at: clang-tools-extra/clangd/index/FileIndex.cpp:330
   {
-    llvm::DenseMap<SymbolID, llvm::SmallVector<Ref, 4>> MergedRefs;
+    llvm::DenseMap<SymbolID, llvm::SmallVector<Ref>> MergedRefs;
     size_t Count = 0;
----------------
this is a reduction from 4 to 2, and this is fairly performance-sensitive code (for the preamble index).
My intuition is that not that many symbols are referenced 1-2 times across all TUs, and when many are, it's a tiny project where performance doesn't matter.
Guessing about performance is probably folly. But I'd probably avoid reducing this without measuring.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D92788/new/

https://reviews.llvm.org/D92788



More information about the cfe-commits mailing list