[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)

Erich Keane via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 6 06:58:35 PST 2025


erichkeane wrote:

> Just some minor nits from me so far.
> 
> I think there may be some additional performance benefits we could eke out, but I don't think they need to be done in this patch (or done at all, it's speculative). One is that I noticed `ASTContext::getTraversalScope()` is returning a vector by copy and that doesn't seem necessary. Another is that perhaps we want to use a Bloom filter here instead, on the assumption that the parent map will always be fairly large. But again, this is all speculation.

Hmm... would SOME duplicates be acceptable? I THINK the original patch removed the duplicates, so they were presumably OK before then? If that is acceptable (and the uses are duplicate-tolerant), a Bloom filter that brings the duplicate rate below 1% would still be a massive win, right? We could perhaps tune the size of the filter to get that rate to something reasonable.

According to the intro of the Wikipedia article on Bloom filters, about 10 bits per element is all that is necessary for a 1% false-positive rate, but the details of the article and its references get really "LaTeX-math-stuff" to me, so I don't have a good idea of what it would take to get the false-positive rate lower than that.
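
For reference, the standard sizing result for an optimally configured Bloom filter is `m/n = -ln(p) / (ln 2)^2` bits per element, with `k = (m/n) * ln 2` hash functions. A minimal sketch that just evaluates those two formulas (the `BloomSizing` struct and `bloomSizingFor` helper are hypothetical names for illustration, not anything that exists in clang):

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, not a clang API: sizing for an optimally
// configured Bloom filter with a target false-positive rate P.
struct BloomSizing {
  double BitsPerElement; // m / n
  unsigned NumHashes;    // k, rounded to the nearest integer (at least 1)
};

inline BloomSizing bloomSizingFor(double P) {
  assert(P > 0.0 && P < 1.0 && "false-positive rate must be in (0, 1)");
  const double Ln2 = std::log(2.0);
  // m/n = -ln(p) / (ln 2)^2
  double Bits = -std::log(P) / (Ln2 * Ln2);
  // k = (m/n) * ln 2
  unsigned K = static_cast<unsigned>(std::lround(Bits * Ln2));
  return {Bits, K == 0 ? 1u : K};
}
```

Plugging in `P = 0.01` gives roughly 9.6 bits per element with 7 hashes, and `P = 0.001` gives roughly 14.4 bits with 10 hashes, so each additional factor of 10 in the false-positive rate costs about 4.8 bits per element.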

BUT since the idea is to just reduce/eliminate duplicates without high memory overhead, a Bloom filter seems like it would be a great solution here.
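
To make the idea concrete, here is a rough standalone sketch of a "have I seen this pointer?" filter. Everything in it is an assumption for illustration: the `PointerBloomFilter` class, its `testAndSet` method, the default of 10 bits per element with 7 hashes, and the mixing function are not taken from the patch or from any clang/LLVM API. One thing to keep in mind is that a Bloom filter never gives false negatives, only false positives, so a "possibly seen" answer can be wrong for a node that is actually new, and the caller has to decide whether to trust it or fall back to an exact check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch, not a clang API: a Bloom filter keyed on node addresses.
class PointerBloomFilter {
  std::size_t NumBits;
  unsigned NumHashes;
  std::vector<std::uint64_t> Words;

  // 64-bit mixer (the splitmix64 finalizer); any good mixer would do here.
  static std::uint64_t mix(std::uint64_t X) {
    X ^= X >> 30;
    X *= 0xbf58476d1ce4e5b9ULL;
    X ^= X >> 27;
    X *= 0x94d049bb133111ebULL;
    X ^= X >> 31;
    return X;
  }

public:
  // Defaults assume the ~1% false-positive sizing: 10 bits/element, 7 hashes.
  explicit PointerBloomFilter(std::size_t ExpectedElements,
                              std::size_t BitsPerElement = 10,
                              unsigned Hashes = 7)
      : NumBits(ExpectedElements * BitsPerElement < 64
                    ? 64
                    : ExpectedElements * BitsPerElement),
        NumHashes(Hashes == 0 ? 1 : Hashes), Words((NumBits + 63) / 64, 0) {}

  // Records Ptr and returns true if it was *possibly* seen before.
  // A false return means Ptr was definitely not seen before.
  bool testAndSet(const void *Ptr) {
    std::uint64_t H1 = mix(reinterpret_cast<std::uintptr_t>(Ptr));
    std::uint64_t H2 = mix(H1) | 1; // odd stride for double hashing
    bool AllSet = true;
    for (unsigned I = 0; I < NumHashes; ++I) {
      std::size_t Bit = static_cast<std::size_t>((H1 + I * H2) % NumBits);
      std::uint64_t Mask = 1ULL << (Bit % 64);
      std::uint64_t &Word = Words[Bit / 64];
      if (!(Word & Mask))
        AllSet = false;
      Word |= Mask;
    }
    return AllSet;
  }
};
```

With something like this, a `false` result means the node can be appended without any further lookup, and only the rare `true` results would need whatever exact (or duplicate-tolerant) handling we decide on.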

https://github.com/llvm/llvm-project/pull/129934

