[PATCH] D63713: WIP: DataExtractor error handling
David Blaikie via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 2 15:32:26 PDT 2019
dblaikie added a comment.
In D63713#1566605 <https://reviews.llvm.org/D63713#1566605>, @labath wrote:
> Ok, I got some numbers now. I went for a micro-benchmark-like setup to show something close to a worst-case scenario. The test setup is as follows:
>
> I took the largest single .o file I had around (Registry.cpp.o in the clangDynamicASTMatchers library). I then linked it to remove any relocations (otherwise, most of the time is spent applying those). Then I modified llvm-dwarfdump to parse the debug_loc section without dumping anything (to avoid measuring the time spent in printfs). Both llvm-dwarfdump and the Registry.cpp.o I was dumping were built with -O3 -g, with asserts disabled (no FDO, LTO or other fancy stuff). This resulted in about 4.5 megabytes of debug_loc to parse in Registry.cpp.o. Then I used the Linux perf command to run llvm-dwarfdump -debug-loc 1000 times and dump the stats.
>
> The baseline stats are:
>
> 27.285986 task-clock:u (msec) # 0.986 CPUs utilized ( +- 0.11% )
> 0 context-switches:u # 0.000 K/sec
> 0 cpu-migrations:u # 0.000 K/sec
> 2,813 page-faults:u # 0.103 M/sec ( +- 0.24% )
> 58,831,163 cycles:u # 2.156 GHz ( +- 0.07% )
> 606,986 stalled-cycles-frontend:u # 1.03% frontend cycles idle ( +- 3.76% )
> 7,924,778 stalled-cycles-backend:u # 13.47% backend cycles idle ( +- 0.33% )
> 146,588,727 instructions:u # 2.49 insn per cycle
> # 0.05 stalled cycles per insn ( +- 0.00% )
> 29,545,620 branches:u # 1082.813 M/sec ( +- 0.00% )
> 222,276 branch-misses:u # 0.75% of all branches ( +- 0.15% )
>
> 0.027663381 seconds time elapsed ( +- 0.11% )
>
>
> The stats with this patch applied are:
>
> 27.397390 task-clock:u (msec) # 0.987 CPUs utilized ( +- 0.10% )
> 0 context-switches:u # 0.000 K/sec
> 0 cpu-migrations:u # 0.000 K/sec
> 2,833 page-faults:u # 0.103 M/sec ( +- 0.24% )
> 60,160,571 cycles:u # 2.196 GHz ( +- 0.07% )
> 584,825 stalled-cycles-frontend:u # 0.97% frontend cycles idle ( +- 3.37% )
> 10,729,974 stalled-cycles-backend:u # 17.84% backend cycles idle ( +- 0.26% )
> 156,141,836 instructions:u # 2.60 insn per cycle
> # 0.07 stalled cycles per insn ( +- 0.00% )
> 31,599,940 branches:u # 1153.392 M/sec ( +- 0.00% )
> 221,247 branch-misses:u # 0.70% of all branches ( +- 0.06% )
>
> 0.027771865 seconds time elapsed ( +- 0.10% )
>
>
> The stats for a version of this patch which additionally checks for the error flag before attempting the parse (as discussed in the inline comment) are:
>
> 27.808349 task-clock:u (msec) # 0.986 CPUs utilized ( +- 0.10% )
> 0 context-switches:u # 0.000 K/sec
> 0 cpu-migrations:u # 0.000 K/sec
> 2,839 page-faults:u # 0.102 M/sec ( +- 0.24% )
> 62,887,388 cycles:u # 2.261 GHz ( +- 0.06% )
> 575,264 stalled-cycles-frontend:u # 0.91% frontend cycles idle ( +- 3.18% )
> 14,757,888 stalled-cycles-backend:u # 23.47% backend cycles idle ( +- 0.23% )
> 167,562,307 instructions:u # 2.66 insn per cycle
> # 0.09 stalled cycles per insn ( +- 0.00% )
> 33,414,152 branches:u # 1201.587 M/sec ( +- 0.00% )
> 221,454 branch-misses:u # 0.66% of all branches ( +- 0.12% )
>
> 0.028201319 seconds time elapsed ( +- 0.10% )
>
>
> As can be seen, this patch increases the parsing time by about 0.4%. This is enough to be statistically significant in a benchmark like this, but probably not world-shattering (some slowdown is unavoidable with a change like this).
>
> If we additionally enable the early returns we get an additional 1.5% slowdown (or 1.9% above baseline). Still not bad for a "micro-benchmark", but it does make one wonder whether it is really worth it. My feeling would be that it isn't...
Sorry, I've lost some of the context here (my brain's like a sieve sometimes) - am I right that the first set of results comes from this patch as-is, which implements a wrapper around DataExtractor that uses the offset comparison to determine validity (if the offset isn't modified, the attempted parse was invalid in some way)?
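For concreteness, here is roughly what I'm picturing - a minimal sketch, not the actual patch, with the CheckedExtractor name and the exact offset type being assumptions on my part:

  #include "llvm/Support/DataExtractor.h"
  #include <cstdint>

  // Hypothetical wrapper illustrating the offset-comparison idea: keep the
  // offset and an error flag next to the DataExtractor, and treat "the getter
  // did not advance the offset" as a failed read.
  class CheckedExtractor {
    llvm::DataExtractor Data;
    uint64_t Offset = 0;   // assumed offset type; older DataExtractors use uint32_t
    bool HadError = false;

  public:
    explicit CheckedExtractor(llvm::DataExtractor D) : Data(D) {}

    uint32_t getU32() {
      uint64_t OldOffset = Offset;
      uint32_t Value = Data.getU32(&Offset);
      // DataExtractor leaves the offset untouched when a read fails, so an
      // unchanged offset means the parse did not succeed.
      if (Offset == OldOffset)
        HadError = true;
      return Value;
    }

    uint64_t offset() const { return Offset; }
    bool hadError() const { return HadError; }
  };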
And the second set of results comes from adding a separate "if (already seen an error) return;" check to these same wrapped function calls?
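I.e., something like this extra check inside the same hypothetical wrapper (again just a sketch):

    // Early-return variant: once an error has been recorded, later reads bail
    // out immediately instead of going through DataExtractor again.
    uint32_t getU32WithEarlyReturn() {
      if (HadError)
        return 0; // a previous read already failed; skip this one entirely
      uint64_t OldOffset = Offset;
      uint32_t Value = Data.getU32(&Offset);
      if (Offset == OldOffset)
        HadError = true;
      return Value;
    }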
What if "if (already seen an error)" was added to the "isValidOffset" function (making all offsets invalid if there's already an error) - would that collapse the early return codepaths & perhaps make things simpler/faster?
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63713/new/
https://reviews.llvm.org/D63713