[cfe-dev] [RFC] Clearing Clang AST before running backend optimizations/codegen to save memory
cfe-dev at lists.llvm.org
Tue Sep 21 06:30:22 PDT 2021
Drive-by thought: debug-info-for-profiling retains source info; maybe that could be unconditionally on and -Rpass could use it?
From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of David Blaikie via cfe-dev
Sent: Monday, September 20, 2021 8:19 PM
To: Arthur Eubanks <aeubanks at google.com>
Cc: Clang Dev <cfe-dev at lists.llvm.org>
Subject: Re: [cfe-dev] [RFC] Clearing Clang AST before running backend optimizations/codegen to save memory
On Mon, Sep 20, 2021 at 3:04 PM Arthur Eubanks <aeubanks at google.com> wrote:
Looking at -Rpass (and various things like warnings for inability to vectorize when we specifically request it with #pragma clang loop vectorize(enable)), it does end up using objects from the AST to approximate the source location (BackendConsumer::getBestLocationFromDebugLoc()) if it can't find the source location from debug info. So this would affect users who don't build with debug info. Without debug info or the AST, clang will print a warning/remark without a source location. This is a tradeoff we'd have to decide on.
Ah, the AST lookup is only to retrieve the location of a function by name - if we made a mapping/record of all those locations (shouldn't take up much space, I'd think) then we could use that instead and not need the AST for that callback?
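A minimal sketch of that idea, not Clang code: a side table filled in while the AST is still alive, then consulted by the diagnostic callback once the AST is gone. `RecordedLoc`, `FunctionLocTable` and `bestLocation` are hypothetical names for illustration, and keying the table by the function's name as it appears in the IR is an assumption.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a resolved source position; not a Clang type.
struct RecordedLoc {
  std::string File;
  unsigned Line = 0;
  unsigned Column = 0;
};

// Hypothetical side table, populated before the AST is torn down.
class FunctionLocTable {
public:
  // Remember where a function was declared, keyed by its name
  // (assumed here to be the name the backend sees in the IR).
  void record(std::string Name, RecordedLoc Loc) {
    Locs.insert_or_assign(std::move(Name), std::move(Loc));
  }

  // Return the recorded location, or an empty optional if none was recorded.
  std::optional<RecordedLoc> lookup(const std::string &Name) const {
    auto It = Locs.find(Name);
    if (It == Locs.end())
      return std::nullopt;
    return It->second;
  }

private:
  std::unordered_map<std::string, RecordedLoc> Locs;
};

// Choose the location for a backend remark: prefer the debug-info location
// when there is one, otherwise fall back to the function's recorded location.
std::optional<RecordedLoc>
bestLocation(const std::optional<RecordedLoc> &FromDebugInfo,
             const FunctionLocTable &Table, const std::string &FunctionName) {
  if (FromDebugInfo)
    return FromDebugInfo;
  return Table.lookup(FunctionName);
}
```

The table holds one small entry per function, so it should cost far less than keeping the whole AST alive; a remark for a function that was never recorded would still be printed without a location.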
There's a similar issue with backend warnings but those don't even pass debug info to the diagnostic handler (clang/test/Misc/backend-stack-frame-diagnostics.cpp). Perhaps it could be extended to do that though.
I haven't looked too much into the -disable-free stuff, but it mitigates the crashes because, once we clear AST objects, we still hold dangling references to them that we later attempt to clean up unless -disable-free is set.
So disabling free is keeping more things alive - perhaps then the RAM savings aren't as much as they could be if freeing was enabled? But yeah, more to look into.
On Fri, Sep 17, 2021 at 4:41 PM David Blaikie <dblaikie at gmail.com> wrote:
I think it'd be unfortunate if certain features don't work in this mode, unless we understand why and are pretty sure that's a fairly fundamental limitation. For instance, at Google we've got memory limitations (hence the motivation for this work) and I think we created, or at least have some interest in, -Rpass. If -Rpass couldn't be composed with this feature, we'd make it harder to investigate performance issues in larger compiles that need this memory saving to fit within our memory limits, because -Rpass wouldn't be available there. I'd guess the issue is that -Rpass traffics in Clang source locations. So it's possible the source location infrastructure/data structures would have to be kept, even though the AST/semantic pieces could be torn down. (Unless that source location stuff can refer into ASTs for differentiating template specializations, etc. That'd be the tipping point for me in "OK, it may be worth the benefit to make these incompatible, or reduce the quality of -Rpass diagnostics when using this memory saving technique". I wonder if it's only the -Rpass diagnostics, or other backend diagnostics that use that infrastructure.)
Which is to say I'd be /slightly/ averse to adding this feature as a Clang default or driver flag (& similarly averse to leaving it as a cc1 off-by-default flag indefinitely) without a pretty good answer to those crashing/non-functioning tests.
(lower priority, but fairly nice-to-have would be some answer to the cleaning up issues, -disable-free, etc - sort of weird that we'd have to /disable-free/ to enable freeing things earlier... that seems pretty suspicious)
but if those issues can be resolved
On Fri, Sep 17, 2021 at 2:30 PM Arthur Eubanks via cfe-dev <cfe-dev at lists.llvm.org> wrote:
We keep the Clang AST around while we run backend optimizations on the IR. This makes peak memory usage higher than necessary since (I believe) we generally don't need the Clang AST when running optimizations. Clearing it first gives us more room to work with things like caching analyses, at least for frontend compilations.
When building LLVM's PassBuilder.cpp (the longest LLVM file to compile), peak memory usage (/usr/bin/time's max RSS) dropped from ~1.3-1.4G to ~1.0G.
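For anyone wanting to take the same kind of measurement from inside a process, here is a small sketch using POSIX `getrusage`, which exposes the peak-RSS counter for the calling process. `peakRssKilobytes` and `peakAfterTouching` are hypothetical helper names, and the kilobyte unit is the Linux convention (macOS reports bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <sys/resource.h>
#include <vector>

// Peak resident set size of the current process so far.
// On Linux, ru_maxrss is reported in kilobytes.
long peakRssKilobytes() {
  struct rusage Usage {};
  int Rc = getrusage(RUSAGE_SELF, &Usage);
  assert(Rc == 0 && "getrusage failed");
  (void)Rc; // keep release builds free of unused-variable warnings
  return Usage.ru_maxrss;
}

// Allocate and touch a buffer, release it, then report the peak.
// The buffer is already freed when the peak is read, yet the reported
// value does not go back down: the counter is a high-water mark.
long peakAfterTouching(std::size_t Bytes) {
  {
    std::vector<char> Buf(Bytes, 1); // writing 1s touches every page
    volatile char Sink = Buf[Bytes / 2]; // keep the buffer from being optimized out
    (void)Sink;
  }
  return peakRssKilobytes();
}
```

Because the counter only ever rises, freeing memory after the backend has already allocated its own does not lower the reported peak; the saving comes from releasing the AST before the backend starts.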
There are still a couple issues I haven't dug too deeply into yet, mostly to do with cleaning things up when freeing memory, so right now it's only enabled with -disable-free which works around those issues. Most clang tests pass with this patch; there are a couple things that crash (e.g. -Rpass, clang interpreter) where we can investigate further or just disable this feature.
llvm-compile-time-tracker <https://llvm-compile-time-tracker.com/compare.php?from=167ff5280d7fcad731810d5d2bf10561ed2adacc&to=b08fcae3a02d5ebe58afd8f8658d798b62ff8eb7&stat=max-rss> memory metrics
Any concerns with this?