[llvm] [Analysis] Add DebugInfoCache analysis (PR #118629)
Artem Pianykh via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 3 15:01:03 PST 2025
================
@@ -0,0 +1,47 @@
+//===- llvm/Analysis/DebugInfoCache.cpp - debug info cache ----------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains an analysis that builds a cache of debug info for each
+// DICompileUnit in a module.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Analysis/DebugInfoCache.h"
+#include "llvm/IR/Module.h"
+
+using namespace llvm;
+
+namespace {
+DebugInfoFinder processCompileUnit(DICompileUnit *CU) {
+  DebugInfoFinder DIFinder;
+  DIFinder.processCompileUnit(CU);
+
+  return DIFinder;
+}
+} // namespace
+
+DebugInfoCache::DebugInfoCache(const Module &M) {
+  for (const auto CU : M.debug_compile_units()) {
+    auto DIFinder = processCompileUnit(CU);
+    Result[CU] = std::move(DIFinder);
+  }
+}
+
+bool DebugInfoCache::invalidate(Module &M, const PreservedAnalyses &PA,
+                                ModuleAnalysisManager::Invalidator &) {
+  // Check whether the analysis has been explicitly invalidated. Otherwise, it's
+  // stateless and remains preserved.
+  auto PAC = PA.getChecker<DebugInfoCacheAnalysis>();
+  return !PAC.preservedWhenStateless();
----------------
artempyanykh wrote:
Thanks for comments and questions @felipepiovezan!
> This is not really an area I'm very familiar with, but there are some things that I found odd about that PR. For example, why do we need this?
>
> ```
> // Prime DebugInfoCache.
> // TODO: Currently, the only user is CoroSplitPass. Consider running
> // conditionally.
> AM.getResult<DebugInfoCacheAnalysis>(M);
> ```
This is based on my understanding of [this part of the manual](https://llvm.org/docs/NewPassManager.html#using-analyses). Inner-level passes (e.g. CGSCC-level) can only ask for a cached result of an outer-level analysis (e.g. Module-level); they can't ask for outer-level analysis results to be computed on demand.
Unless I explicitly run the analysis there, there won't be any results for CoroSplit to consume.
The 'conditional' part is based on what I saw in [CoroConditionalWrapper](https://github.com/llvm/llvm-project/blob/7d38fe334bd527dfb932f1a2a481f1ac3bfdbebf/llvm/include/llvm/Transforms/Coroutines/CoroConditionalWrapper.h#L18-L20).
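
To illustrate the cached-only constraint described above, here is a toy model. None of these types are LLVM's; `ToyModuleAnalysisManager`, `ToyInnerProxy` and `innerPassSawCache` are hypothetical names that only mimic the shape of the rule. As far as I understand the manual, in LLVM the inner pass would go through a proxy such as `ModuleAnalysisManagerCGSCCProxy` and call `getCachedResult` on it.

```cpp
// Toy model only: these types are hypothetical and are NOT LLVM's API.
#include <cassert>
#include <optional>

// Stand-in for a module-level analysis result.
struct ToyModuleResult {
  int NumCompileUnits;
};

// Stand-in for the module-level analysis manager: it may compute and cache.
class ToyModuleAnalysisManager {
  std::optional<ToyModuleResult> Cache;
  int Runs = 0;

public:
  // Computes the result on first use, then serves it from the cache.
  const ToyModuleResult &getResult(int NumCUs) {
    if (!Cache) {
      Cache = ToyModuleResult{NumCUs};
      ++Runs;
    }
    assert(Cache.has_value());
    return *Cache;
  }

  // Never computes anything; returns nullptr if nothing is cached.
  const ToyModuleResult *getCachedResult() const {
    return Cache ? &*Cache : nullptr;
  }

  int runs() const { return Runs; }
};

// Stand-in for what an inner-level pass sees: a read-only view that exposes
// the cached result but has no way to trigger a computation.
class ToyInnerProxy {
  const ToyModuleAnalysisManager &MAM;

public:
  explicit ToyInnerProxy(const ToyModuleAnalysisManager &MAM) : MAM(MAM) {}
  const ToyModuleResult *getCachedResult() const {
    return MAM.getCachedResult();
  }
};

// Hypothetical inner pass: reports whether a primed result was available.
bool innerPassSawCache(const ToyInnerProxy &Proxy) {
  return Proxy.getCachedResult() != nullptr;
}
```

In this model, `innerPassSawCache` returns false until something at the module level has called `getResult`, which is the role the priming call plays in the pipeline.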
> > I was worried about unnecessary invalidations, but all CoroSplit passes will share the same analysis results as part of CGSCC pipeline.
>
> Isn't this dangerous though? Each CoroSplit pass will modify the Module and its metadata, so the analysis needs to be invalidated after every run of CoroSplit.
IIUC, although CoroSplit _technically_ modifies the module, it doesn't invalidate the debug info metadata attached *directly* to a CU, which is what the analysis caches.
> the pass ends with a return PreservedAnalyses::none();
I think this applies to a specific unit of IR (an SCC in this case)? In the traces I get, the analysis runs once per `ModuleToPostOrderCGSCCPassAdaptor`.
> This implies we're doing a lot of unnecessary work that will be invalidated.
> Should this new analysis be lazy?
Ideally! But given the caveats [in the doc](https://llvm.org/docs/NewPassManager.html#using-analyses) about on-demand computation and "future concurrency", I'm not sure what the best way to proceed is, or whether it would be worth it. In my testing, even on our larger sources (which otherwise take a few minutes to compile), the pass takes 60ms.
> I.e. only visit a CU when it is queried on it.
This would also depend on how common multi-CU modules are.
> Also, why is the copy needed? A const reference would not suffice?
>
> ```
> // Copy DIFinder from cache which is primed on F's compile unit when available
> auto *PrimedDIFinder = cachedDIFinder(F, DICache);
> if (PrimedDIFinder)
>   DIFinder = *PrimedDIFinder;
> ```
DIFinder is a mutable structure: it gets hydrated by visiting the debug info attached to an element, but it won't revisit what has already been visited. For a function:
1. it visits the debug info attached to the function's DISubprogram,
2. it visits the debug info attached to the function's CU (see [this](https://github.com/llvm/llvm-project/blob/7d38fe334bd527dfb932f1a2a481f1ac3bfdbebf/llvm/lib/IR/DebugInfo.cpp#L335-L343)). (Note that visiting a CU doesn't visit all subprograms.)
We use a `PrimedDIFinder` hydrated on a CU to skip the expensive step 2, but we have to copy it rather than pass it by reference because it gets mutated down the line.
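
The copy-then-mutate pattern can be sketched with a toy visited-set. `ToyFinder` and `finderForFunction` are hypothetical names for illustration and are not `llvm::DebugInfoFinder` or any helper in the patch; the only property modelled is "never revisit a node that has already been visited".

```cpp
// Toy model only: ToyFinder is hypothetical, NOT llvm::DebugInfoFinder.
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

class ToyFinder {
  std::set<std::string> Visited;

public:
  // Visits each node at most once and returns how many nodes were new.
  int process(const std::vector<std::string> &Nodes) {
    int New = 0;
    for (const auto &N : Nodes)
      if (Visited.insert(N).second)
        ++New;
    return New;
  }

  std::size_t size() const { return Visited.size(); }
};

// Hypothetical helper: start from a copy of the CU-primed finder, then visit
// the function's own nodes and the CU's nodes. The copy is what keeps the
// shared primed finder unmodified.
ToyFinder finderForFunction(const ToyFinder &Primed,
                            const std::vector<std::string> &SubprogramNodes,
                            const std::vector<std::string> &CUNodes,
                            int &NewVisits) {
  ToyFinder Local = Primed;                   // copy, so Primed stays intact
  NewVisits = Local.process(SubprogramNodes); // step 1: function-specific work
  NewVisits += Local.process(CUNodes);        // step 2: no new nodes if primed
  assert(Local.size() >= Primed.size());
  return Local;
}
```

With a finder primed on the CU's nodes, step 2 adds no new visits, while the primed instance keeps its original size and can be reused for the next function.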
Some of this complexity is historical. It would be great to refactor it further, but that can be done on top of this patch set. At this point, it feels more important to review and validate this approach to caching debug info.
https://github.com/llvm/llvm-project/pull/118629