[clang] Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (PR #129934)
Erich Keane via cfe-commits
cfe-commits at lists.llvm.org
Thu Mar 6 06:58:35 PST 2025
erichkeane wrote:
> Just some minor nits from me so far.
>
> I think there may be some additional performance benefits we could eek out, but I don't think they need to be done in this patch (or done at all, it's speculative). One is that I noticed `ASTContext::getTraversalScope()` is returning a vector by copy and that doesn't seem necessary. Another is that perhaps we want to use a bloom filter here instead on the assumption that the parent map will always be fairly large. But again, this is all speculation.
Hmm... would SOME duplicates be acceptable? I THINK the original patch removed the duplicates, so they were presumably OK before then? IF that is acceptable (and uses are duplicate-tolerant), a Bloom filter that can reduce the number of duplicates sub-1% would still be a massive-win, right? We could perhaps do some tuning on the size of the filter to get that reasonable.
According to the Wiki article on Bloom filter's intro, 10 bits-per-element is all that is necessary for a 1% false-positive rate, but the details of the article and reference get really "LaTEX-math-stuff" to me, so I don't have a good idea what it would take to get the false positive rate down low.
BUT since the idea is to just reduce/eliminate duplicates without high memory overhead, it seems like it would be a great solution here.
https://github.com/llvm/llvm-project/pull/129934
More information about the cfe-commits
mailing list