[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)

Erich Keane via cfe-commits cfe-commits at lists.llvm.org
Wed Mar 12 09:32:26 PDT 2025


erichkeane wrote:

> I just realized there is another optimization we could do in `push_back` (similar motivation as what @erichkeane mentioned [here](#discussion_r1982274548), but different):
> 
> We could avoid the duplication check in `push_back`, and defer it to the `contains()`/`view()` accessors, thus making `push_back` simply do `if (!Value.getMemoizationData()) { ... }`. If we do such a thing, then we'd probably want to amortize it so that the number of unprocessed entries doesn't grow unboundedly. We could keep the number of unprocessed elements within (say) 25% of the size of the processed elements.
> 
> Thoughts on this? I'm inclined to give it a try to see if it's worth the code complexity.

It sounds worth a try, and could perhaps help readability.  I think a 'remove dupes' step (i.e. `std::unique` followed by `erase`) is much more readable than what is being attempted here.

As for the `unprocessed` elements, it would be interesting to do some benchmarking before we choose a percentage.  What MIGHT be useful instead is to do the 'remove dupes' step in `push_back` whenever `capacity == size`.  That would necessarily keep us bounded to the handful of power-of-2 capacities, and would also catch us right before a particularly expensive step (the reallocation).
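
To make the `capacity == size` idea concrete, here is a rough sketch rather than the code from the PR. `LazyDedupVector` and `removeDupes` are hypothetical names, it stores plain `int`s instead of AST nodes, and it assumes element order does not matter (`std::unique` only removes adjacent duplicates, so the sketch sorts first).

```cpp
#include <algorithm>
#include <vector>

// Hypothetical container for illustration only; not the class in the PR.
class LazyDedupVector {
public:
  void push_back(int Value) {
    // Only pay for deduplication when the next append would reallocate.
    // If duplicates are dropped here, the reallocation is avoided entirely.
    if (Items.size() == Items.capacity())
      removeDupes();
    Items.push_back(Value);
  }

  // Accessors deduplicate first, so callers never observe duplicates.
  bool contains(int Value) {
    removeDupes();
    return std::binary_search(Items.begin(), Items.end(), Value);
  }

  const std::vector<int> &view() {
    removeDupes();
    return Items;
  }

private:
  // The 'remove dupes' step: sort so equal values are adjacent, then
  // std::unique + erase. Assumes ordering is not significant.
  void removeDupes() {
    std::sort(Items.begin(), Items.end());
    Items.erase(std::unique(Items.begin(), Items.end()), Items.end());
  }

  std::vector<int> Items;
};
```

With this shape, `push_back` stays cheap on the common path, and the number of stored-but-unprocessed entries can never exceed the current capacity. Whether that beats a fixed percentage threshold would still need the benchmarking mentioned above.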

https://github.com/llvm/llvm-project/pull/129934

