[PATCH] D113460: [llvm-dwarfdump] Add support for filtering by DIE tags
William Woodruff via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 9 18:09:34 PST 2021
woodruffw added a comment.
In D113460#3120137 <https://reviews.llvm.org/D113460#3120137>, @dblaikie wrote:
> I was thinking of something simpler/less invasive (no need to add a new data structure to collect all the DIEs, etc).
>
> ie: What if "filterByName" functions were renamed to "filter" and each took an extra argument, `Tags`, & did all the same stuff but also checked Tags?
I can do this, but I want to offer a bit of justification for the new structure first :-)
Maintainability-wise, I think the new approach is slightly easier to extend with additional filters (e.g. a `--type` filter for matching type tags like `DW_TAG_base_type`): they can either be added to `dieFilter` or, as in my previous changeset, composed inside a lambda. The control flow is also a little easier to follow: it goes through `make_filter_range` rather than calling into an overloaded `filterByName` with duplicate call sites for `Die.dump`.
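To make that concrete, here's a rough sketch of the shape of the new approach. The function name `dumpMatchingDies` and the exact signatures are illustrative rather than verbatim from this diff, and `DieFilter` stands in for the real `dieFilter` predicate (which checks `--name`/`--regex`, and with this patch the requested tags):

```c++
// Sketch only: simplified shape of the pre-collection approach, not the
// literal code in this diff.
#include <vector>

#include "llvm/ADT/STLExtras.h"
#include "llvm/DebugInfo/DWARF/DWARFContext.h"
#include "llvm/DebugInfo/DWARF/DWARFDie.h"
#include "llvm/Support/raw_ostream.h"

template <typename PredT>
static void dumpMatchingDies(llvm::DWARFContext &DICtx, PredT DieFilter,
                             llvm::raw_ostream &OS,
                             llvm::DIDumpOptions DumpOpts) {
  // Walk every unit once and flatten the DIEs into a single range.
  std::vector<llvm::DWARFDie> Dies;
  for (const auto &CU : DICtx.compile_units())
    for (const auto &Entry : CU->dies())
      Dies.emplace_back(CU.get(), &Entry);

  // One predicate decides what matches...
  auto Matches = llvm::make_filter_range(Dies, DieFilter);

  // ...and there is a single Die.dump call site, instead of one per
  // filterBy* overload.
  for (const llvm::DWARFDie &Die : Matches)
    Die.dump(OS, /*Indent=*/0, DumpOpts);
}
```

Adding another filter (like the hypothetical `--type` above) then just means adding a clause to the predicate, with no new dump call sites.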
Performance-wise, pre-collecting all of the DIEs is no slower than the previous approach. I've pasted some local benchmarks below; the test file is a debug build of `llvm-dwarfdump` itself, containing approximately 150MB of debug info. My changes are *slightly* faster, but that's probably noise, or a side effect of the pre-collection simplifying the instructions needed to fetch each DIE.
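For reference, each set of numbers below came from plain `perf stat` over 100 runs, along the lines of:

```
perf stat -d -r 100 ./bin/llvm-dwarfdump --name Vector --regex /home/william/tmp/llvm-dwarfdump
```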
Before this changeset:
Performance counter stats for './bin/llvm-dwarfdump --name Vector --regex /home/william/tmp/llvm-dwarfdump' (100 runs):
6,587.02 msec task-clock # 0.999 CPUs utilized ( +- 0.06% )
936 context-switches # 0.142 K/sec ( +- 3.60% )
61 cpu-migrations # 0.009 K/sec ( +- 2.65% )
89,214 page-faults # 0.014 M/sec ( +- 0.00% )
26,887,760,726 cycles # 4.082 GHz ( +- 0.04% ) (62.46%)
475,358,555 stalled-cycles-frontend # 1.77% frontend cycles idle ( +- 0.59% ) (62.48%)
6,463,934,156 stalled-cycles-backend # 24.04% backend cycles idle ( +- 0.16% ) (62.50%)
79,870,809,617 instructions # 2.97 insn per cycle
# 0.08 stalled cycles per insn ( +- 0.01% ) (62.52%)
16,098,031,769 branches # 2443.901 M/sec ( +- 0.01% ) (62.54%)
53,414,087 branch-misses # 0.33% of all branches ( +- 0.24% ) (62.54%)
26,475,961,150 L1-dcache-loads # 4019.413 M/sec ( +- 0.01% ) (62.50%)
99,919,546 L1-dcache-load-misses # 0.38% of all L1-dcache accesses ( +- 3.81% ) (62.47%)
<not supported> LLC-loads
<not supported> LLC-load-misses
6.59507 +- 0.00363 seconds time elapsed ( +- 0.05% )
and with DIE pre-collection:
Performance counter stats for './bin/llvm-dwarfdump --name Vector --regex /home/william/tmp/llvm-dwarfdump' (100 runs):
6,585.08 msec task-clock # 0.999 CPUs utilized ( +- 0.06% )
842 context-switches # 0.128 K/sec ( +- 2.58% )
69 cpu-migrations # 0.010 K/sec ( +- 2.59% )
128,934 page-faults # 0.020 M/sec ( +- 0.00% )
26,807,844,451 cycles # 4.071 GHz ( +- 0.05% ) (62.46%)
472,800,149 stalled-cycles-frontend # 1.76% frontend cycles idle ( +- 0.70% ) (62.49%)
4,915,697,574 stalled-cycles-backend # 18.34% backend cycles idle ( +- 0.18% ) (62.50%)
79,482,489,382 instructions # 2.96 insn per cycle
# 0.06 stalled cycles per insn ( +- 0.01% ) (62.52%)
16,069,809,100 branches # 2440.337 M/sec ( +- 0.01% ) (62.54%)
50,892,766 branch-misses # 0.32% of all branches ( +- 0.05% ) (62.53%)
26,035,545,689 L1-dcache-loads # 3953.718 M/sec ( +- 0.01% ) (62.49%)
110,040,468 L1-dcache-load-misses # 0.42% of all L1-dcache accesses ( +- 3.55% ) (62.46%)
<not supported> LLC-loads
<not supported> LLC-load-misses
6.59287 +- 0.00390 seconds time elapsed ( +- 0.06% )
Overall, I think the absence of a performance regression and easier extensibility make the more invasive changes worth it. But I'm happy to roll back to a version that uses the nested `filter` functions instead, if you think it's not worth the churn.
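If we do roll back, I'd expect the threaded-through version to look roughly like this (signatures illustrative, not the exact ones in `llvm-dwarfdump.cpp`):

```c++
// Rough sketch of the less invasive alternative: keep the existing walk and
// thread the requested tags through the old filterByName shape. Names and
// signatures are illustrative.
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/StringSet.h"
#include "llvm/BinaryFormat/Dwarf.h"
#include "llvm/DebugInfo/DWARF/DWARFContext.h"
#include "llvm/DebugInfo/DWARF/DWARFDie.h"
#include "llvm/Support/raw_ostream.h"

static void filter(const llvm::StringSet<> &Names,
                   const llvm::SmallVectorImpl<llvm::dwarf::Tag> &Tags,
                   llvm::DWARFContext &DICtx, llvm::raw_ostream &OS,
                   llvm::DIDumpOptions DumpOpts) {
  for (const auto &CU : DICtx.compile_units())
    for (const auto &Entry : CU->dies()) {
      llvm::DWARFDie Die(CU.get(), &Entry);
      // New: skip DIEs whose tag isn't in the requested set.
      if (!Tags.empty() && !llvm::is_contained(Tags, Die.getTag()))
        continue;
      // Existing behavior: match on the short name, then dump.
      if (const char *Name = Die.getShortName())
        if (Names.count(Name))
          Die.dump(OS, /*Indent=*/0, DumpOpts);
    }
}
```

That keeps the structure you suggested; the trade-off is that each additional filter option has to be threaded through every overload rather than added to one predicate.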
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D113460/new/
https://reviews.llvm.org/D113460