[PATCH] D78845: [COFF] Add a fastpath for /INCLUDE: in .drective sections

Alexandre Ganea via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Apr 25 10:04:50 PDT 2020


aganea added a comment.

As for LLVMOptions, what prevents using a BumpPtrAllocator + placement new for the Arg(s)? Or is the perf lost somewhere else?
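
For illustration, a minimal sketch of what I mean, not LLVMOption's actual code; `DirectiveArg` is a hypothetical stand-in for `llvm::opt::Arg`:

```
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Allocator.h"

// Hypothetical stand-in for llvm::opt::Arg. Every instance is carved out of
// one BumpPtrAllocator via placement new, so parsing many small directives
// never touches the general-purpose heap; everything is released at once when
// the allocator dies (the type must be trivially destructible, or destructors
// have to be run manually).
struct DirectiveArg {
  llvm::StringRef spelling;
  unsigned index;
};

static DirectiveArg *makeArg(llvm::BumpPtrAllocator &alloc,
                             llvm::StringRef spelling, unsigned index) {
  return new (alloc.Allocate<DirectiveArg>()) DirectiveArg{spelling, index};
}
```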

Side-note: I was profiling the build-time regression between Clang 9, 10 and 11, both when building LLVM and when building a few of our games, with and without debug info. There's a severe regression from Clang 9 to 10: -10% without debug info, and -15% to -18% with debug info. Clang 11 adds an extra -2%. Additionally, no matter from what angle I look at it, allocations in Clang take 10.5%-11% of the total CPU time (on Windows with the standard heap allocator). Replacing the allocator reduces that to 3.5% CPU for allocations, and improves some bandwidth/cache-sensitive functions along the way, which effectively reduces CPU usage by 15% (compared to baseline Clang 10). But at the same time it sweeps the issue under the carpet. This all seems related to the sheer number of (small) allocations in LLVM generally.



================
Comment at: lld/COFF/Driver.h:51
+  std::vector<StringRef> exports;
+  std::vector<StringRef> includes;
+  llvm::opt::InputArgList args;
----------------
Curious: how many includes do you have per .drective? Is it worth calling `.reserve()` before inserting? MS STL grows the `std::vector` buffer geometrically, so if your .drective has many tokens, we would probably allocate & move memory several times per .drective. I think `includes.reserve(tokensNum)` / `exports.reserve(tokensNum)` is better than the cost of re-allocation, even if it wastes a bit of extra memory. Unless you do a two-step parsing. Roughly what I have in mind, sketched below with assumed names (not the patch's code):
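
```
#include "llvm/ADT/StringRef.h"
#include <vector>

// Sketch only: use the token count as an upper bound for both vectors, so a
// large .drective triggers at most one allocation per vector instead of
// several geometric regrowths.
static void classifyTokens(const std::vector<llvm::StringRef> &tokens,
                           std::vector<llvm::StringRef> &includes,
                           std::vector<llvm::StringRef> &exports) {
  includes.reserve(includes.size() + tokens.size()); // upper bound
  exports.reserve(exports.size() + tokens.size());
  for (llvm::StringRef tok : tokens) {
    if (tok.startswith_lower("/include:") || tok.startswith_lower("-include:"))
      includes.push_back(tok.substr(9));
    else if (tok.startswith_lower("/export:") || tok.startswith_lower("-export:"))
      exports.push_back(tok.substr(8));
  }
}
```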


================
Comment at: lld/COFF/DriverUtils.cpp:869
 
   for (StringRef tok : tokenize(s)) {
     if (tok.startswith_lower("/export:") || tok.startswith_lower("-export:"))
----------------
`tokenize()` allocates, i.e. the returned `std::vector` doesn't `.reserve()` when constructed from a range (at least in MS STL 2019). That would be a good candidate for further optimization, especially since you don't need the `saver` here.
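
Something along these lines, a sketch rather than the patch's code, assuming the caller owns the saver and the token storage:

```
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/CommandLine.h"
#include "llvm/Support/StringSaver.h"

// Tokenize directly into a caller-provided, stack-backed SmallVector instead
// of materializing a std::vector, so short .drective sections avoid a heap
// allocation for the token list. The stock tokenizer still requires a
// StringSaver (which must outlive the returned pointers); dropping it
// entirely would need a no-copy tokenizer yielding StringRefs into the
// original buffer.
static void tokenizeDirectives(llvm::StringRef s, llvm::StringSaver &saver,
                               llvm::SmallVectorImpl<const char *> &tokens) {
  llvm::cl::TokenizeWindowsCommandLine(s, saver, tokens);
}

// Usage:
//   llvm::BumpPtrAllocator alloc;
//   llvm::StringSaver saver(alloc);
//   llvm::SmallVector<const char *, 16> tokens;
//   tokenizeDirectives(directives, saver, tokens);
```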


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78845/new/

https://reviews.llvm.org/D78845




