[PATCH] D94555: [LLD][COFF] Avoid std::vector resizes during type merging

Tue Jan 19 12:53:39 PST 2021

aganea added inline comments.

================
Comment at: lld/COFF/DebugTypes.cpp:632-634
   merged.recs.resize(offset + newSize);
   auto newRec = makeMutableArrayRef(&merged.recs[offset], newSize);
   memcpy(newRec.data(), ty.data().data(), newSize);
----------------
mstorsjo wrote:
> aganea wrote:
> > dblaikie wrote:
> > > rnk wrote:
> > > > dblaikie wrote:
> > > > > I know this is a bit of a stretch - but given the somewhat non-trivial extra work required to determine the capacity, I'm wondering if this issue might be partly due to this incremental/repeated resizing.
> > > > > 
> > > > > I'm guessing that maybe the resize is thwarting the usual growth function of the vectors (eg: resize brings the size up to exactly the size specified, whereas repeated push_back hits the usual growth factor to offset the next push_backs, etc). The notes here: https://en.cppreference.com/w/cpp/container/vector/reserve discuss this sort of issue with reserve, though I'm pretty sure it applies to resize as well.
> > > > > 
> > > > > If it's easy to do, I'd be curious to know whether removing the fix in this patch, and instead changing this resize+memcpy to loop+push_back or... oh, reference invalidation concerns. What about recs.insert(recs.begin(), recs.end()) ? Hmm, yeah, seems the spec for std::vector doesn't allow that, so I guess the nearest code would be:
> > > > > ```
> > > > > for (int i = 0; i != newSize; ++i)
> > > > >   merged.recs.push_back(merged.recs[i]);
> > > > > ```
> > > > When I chose to use resize, I remember checking some reference material to see if it would be subject to the O(n^2) behavior of using reserve. I don't remember what I looked at exactly, probably libc++ source. Whatever I looked at, I assured myself that repeated resizing isn't O(n^2). Perhaps the behavior is not required by the standard, and it varies by implementation.
> > > Seems libstdc++ and libc++ at some version does indeed do the right thing:
> > > ```
> > > $ clang++-tot test.cpp && ./a.out
> > > initial capacity: 0
> > > capacity after resize(500): 500
> > > capacity after resize(501): 1000
> > > capacity after resize(500) + 1 * push_back: 1000
> > > $ clang++-tot test.cpp -stdlib=libc++ && ./a.out
> > > initial capacity: 0
> > > capacity after resize(500): 500
> > > capacity after resize(501): 1000
> > > capacity after resize(500) + 1 * push_back: 1000
> > > ```
> > > 
> > > But thinking about the code, it's always doubling anyway - so it sort of implements its own growth factor (ie: that first resize(500) in my example doesn't get any extra capacity because it's much bigger than the growth factor from the baseline size of 0, but the resize(501) does because it's less than the growth factor from the current capacity). So this code probably always gets only the size requested, because that size is always about as much as the growth factor would dictate from the previous size (ie: the resize probably produces the same number of growths as the push_back).
> > > 
> > > So maybe no benefit but no harm/no wins to be had by switching to push_back either, I guess.
> > > 
> > > Wouldn't mind the data if it's easy to gather, but yeah, I'm less confident it'll show any benefit.
> > One other way to fix this would be a growing allocator backed up by reserved, but uncommited memory pages. That would ideal in this situation I think.
> > 
> > We could reserve say 1 GB of virtual memory upfront for each container, and commit memory pages on-the-go, as the container grows. This prevents memory moves and keeps iterators stable. I'm not too sure if the `std::allocator` allows this kind of trick though, so it would have to be a custom container (or perhaps a variant of `BumpPtrAllocator`).
> > 
> > However that comes with another downside on Windows, which is that the VAD tree is deferred to a background System then when it's too large, I've mentionned that here: https://reviews.llvm.org/D86694#2277371 but in this case it's maybe less of an issue.
> > 
> > @dblaikie What is `clang++-tot`? What does ToT mean, I've seen this mentionned in GCC as well?
> Keep in mind that LLD still is usable in 32 bit builds, where one can't reserve 1 GB of memory like that. (It's a bit limited in such builds, it can't successfully link large images, but for common usage it's still quite ok.)
> 
> ToT is Top of Tree, i.e. latest git/svn version.
@mstorsjo That's a good point, thanks for raising that.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94555/new/

https://reviews.llvm.org/D94555