[PATCH] D141451: [clang] report inlining decisions with -Wattribute-{warning|error}
Nick Desaulniers via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Jan 31 17:18:27 PST 2023
nickdesaulniers added a comment.
In D141451#4064298 <https://reviews.llvm.org/D141451#4064298>, @dblaikie wrote:
> Right - I was thinking more, as above, about directly using the existing metadata generation (if it's too expensive to enable by default, then possibly under an off-by-default warning or other flag) that the inliner already knows how to read and write, rather than creating new/different metadata handling.
> Again, might be worth knowing what the cost of the debug info metadata loc tracking mode is.
For a first order approximation, I have this diff applied:
diff --git a/clang/include/clang/Basic/CodeGenOptions.def b/clang/include/clang/Basic/CodeGenOptions.def
index 436226c6f178..6d5049803188 100644
--- a/clang/include/clang/Basic/CodeGenOptions.def
+++ b/clang/include/clang/Basic/CodeGenOptions.def
@@ -386,7 +386,7 @@ VALUE_CODEGENOPT(SmallDataLimit, 32, 0)
VALUE_CODEGENOPT(SSPBufferSize, 32, 0)
/// The kind of generated debug info.
-ENUM_CODEGENOPT(DebugInfo, codegenoptions::DebugInfoKind, 4, codegenoptions::NoDebugInfo)
+ENUM_CODEGENOPT(DebugInfo, codegenoptions::DebugInfoKind, 4, codegenoptions::LocTrackingOnly)
/// Whether to generate macro debug info.
CODEGENOPT(MacroDebugInfo, 1, 0)
This changes the default value of `codegenopts.DebugInfo` from `codegenoptions::NoDebugInfo` to `codegenoptions::LocTrackingOnly`, which is what the optimization remark emitter infra uses. This emits more debug info than we need (metadata for every instruction in the IR rather than JUST `call` instructions). But it would give me analogous metadata I could use to solve the problem this patch addresses, and it would guarantee that I had precise line/col info.
Comparing 30 Linux kernel x86_64 defconfig (i.e. no debug info) builds with vs without that change:
Without (baseline):
$ hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 -j128 -s'
Benchmark 1: make LLVM=1 -j128 -s
Time (mean ± σ): 61.592 s ± 0.156 s [User: 4378.360 s, System: 312.040 s]
Range (min … max): 61.283 s … 62.026 s 30 runs
With diff from above:
$ hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 -j128 -s'
Benchmark 1: make LLVM=1 -j128 -s
Time (mean ± σ): 62.228 s ± 0.523 s [User: 4433.828 s, System: 312.908 s]
Range (min … max): 61.825 s … 64.912 s 30 runs
So that's a slowdown in the mean of ~1.02% ((1 - 61.592/62.228)*100). We could probably claw some of that back if we had another level of `codegenoptions` between `NoDebugInfo` and `LocTrackingOnly` that only emitted LocTracking info for call insts (stated another way, omit `DILocation` for all instructions other than `call`). I'm guessing that would take significant work to add to clang; I wasn't able to figure out how to do so quickly. I imagine updating the non-debug-info clang tests would also be a treat. Is a 1% compile time performance hit worth more precise backend diagnostics? Probably (IMO). Perhaps worth an RFC?
---
FWIW, here are measurements of D141451 <https://reviews.llvm.org/D141451> @ Diff 488371 at the same base as the baseline:
$ hyperfine --prepare 'make LLVM=1 -j128 -s clean' --runs 30 'make LLVM=1 -j128 -s'
Benchmark 1: make LLVM=1 -j128 -s
Time (mean ± σ): 61.702 s ± 0.456 s [User: 4395.106 s, System: 313.120 s]
Range (min … max): 61.409 s … 64.000 s 30 runs
That's ~0.18% slower than the baseline ((1 - 61.592/61.702)*100), and my machine is running hot from benchmarking for the past 1.5hrs. Note this patch/approach doesn't have precise line info.
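For anyone who wants to recheck the percentages, here is a minimal sketch that recomputes them from the three hyperfine means quoted in this comment. The function and constant names are made up for illustration; only the numbers come from the runs above. It prints both the formula used in this comment, (1 - base/new)*100, and the increase relative to the baseline, (new/base - 1)*100, since the two differ slightly.

```python
# Wall-clock means in seconds, copied from the hyperfine output in this comment.
BASELINE = 61.592             # unmodified clang
LOC_TRACKING_ONLY = 62.228    # DebugInfo default changed to LocTrackingOnly
D141451_DIFF_488371 = 61.702  # D141451 @ Diff 488371

def share_of_new_time(base, new):
    """(1 - base/new) * 100, the formula used in the comment."""
    return (1 - base / new) * 100

def increase_over_base(base, new):
    """(new/base - 1) * 100, the increase relative to the baseline."""
    return (new / base - 1) * 100

if __name__ == "__main__":
    runs = [
        ("LocTrackingOnly default", LOC_TRACKING_ONLY),
        ("D141451 @ Diff 488371", D141451_DIFF_488371),
    ]
    for label, mean in runs:
        a = share_of_new_time(BASELINE, mean)
        b = increase_over_base(BASELINE, mean)
        print(f"{label}: {a:.2f}% (1 - base/new), {b:.2f}% (new/base - 1)")
```

This only compares means; it says nothing about whether the differences are significant given the σ values hyperfine reported.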
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141451/new/
https://reviews.llvm.org/D141451