[PATCH] D63713: WIP: DataExtractor error handling
David Blaikie via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 2 15:32:26 PDT 2019
dblaikie added a comment.
In D63713#1566605 <https://reviews.llvm.org/D63713#1566605>, @labath wrote:
> Ok, I got some numbers now. I went for a micro-benchmark-like setup to show something close to a worst-case scenario. The test setup is as follows:
>
> I took the largest single .o file I had around (Registry.cpp.o in the clangDynamicASTMatchers library). I then linked it to remove any relocations (otherwise, most of the time is spent applying those). Then I modified llvm-dwarfdump to parse the debug_loc section without dumping anything (to avoid measuring the time spent in printfs). Both llvm-dwarfdump and the Registry.cpp.o I was dumping were built with -O3 -g, with asserts disabled (no FDO, LTO or other fancy stuff). This resulted in about 4.5 megabytes of debug_loc to parse in Registry.cpp.o. Then I used the Linux perf command to run llvm-dwarfdump -debug-loc 1000 times and dump the stats.
>
> The baseline stats are:
>
> 27.285986 task-clock:u (msec) # 0.986 CPUs utilized ( +- 0.11% )
> 0 context-switches:u # 0.000 K/sec
> 0 cpu-migrations:u # 0.000 K/sec
> 2,813 page-faults:u # 0.103 M/sec ( +- 0.24% )
> 58,831,163 cycles:u # 2.156 GHz ( +- 0.07% )
> 606,986 stalled-cycles-frontend:u # 1.03% frontend cycles idle ( +- 3.76% )
> 7,924,778 stalled-cycles-backend:u # 13.47% backend cycles idle ( +- 0.33% )
> 146,588,727 instructions:u # 2.49 insn per cycle
> # 0.05 stalled cycles per insn ( +- 0.00% )
> 29,545,620 branches:u # 1082.813 M/sec ( +- 0.00% )
> 222,276 branch-misses:u # 0.75% of all branches ( +- 0.15% )
>
> 0.027663381 seconds time elapsed ( +- 0.11% )
>
>
> The stats with this patch applied are:
>
> 27.397390 task-clock:u (msec) # 0.987 CPUs utilized ( +- 0.10% )
> 0 context-switches:u # 0.000 K/sec
> 0 cpu-migrations:u # 0.000 K/sec
> 2,833 page-faults:u # 0.103 M/sec ( +- 0.24% )
> 60,160,571 cycles:u # 2.196 GHz ( +- 0.07% )
> 584,825 stalled-cycles-frontend:u # 0.97% frontend cycles idle ( +- 3.37% )
> 10,729,974 stalled-cycles-backend:u # 17.84% backend cycles idle ( +- 0.26% )
> 156,141,836 instructions:u # 2.60 insn per cycle
> # 0.07 stalled cycles per insn ( +- 0.00% )
> 31,599,940 branches:u # 1153.392 M/sec ( +- 0.00% )
> 221,247 branch-misses:u # 0.70% of all branches ( +- 0.06% )
>
> 0.027771865 seconds time elapsed ( +- 0.10% )
>
>
> The stats for a version of this patch which additionally checks for the error flag before attempting the parse (as discussed in the inline comment) are:
>
> 27.808349 task-clock:u (msec) # 0.986 CPUs utilized ( +- 0.10% )
> 0 context-switches:u # 0.000 K/sec
> 0 cpu-migrations:u # 0.000 K/sec
> 2,839 page-faults:u # 0.102 M/sec ( +- 0.24% )
> 62,887,388 cycles:u # 2.261 GHz ( +- 0.06% )
> 575,264 stalled-cycles-frontend:u # 0.91% frontend cycles idle ( +- 3.18% )
> 14,757,888 stalled-cycles-backend:u # 23.47% backend cycles idle ( +- 0.23% )
> 167,562,307 instructions:u # 2.66 insn per cycle
> # 0.09 stalled cycles per insn ( +- 0.00% )
> 33,414,152 branches:u # 1201.587 M/sec ( +- 0.00% )
> 221,454 branch-misses:u # 0.66% of all branches ( +- 0.12% )
>
> 0.028201319 seconds time elapsed ( +- 0.10% )
>
>
> As can be seen, this patch increases the parsing time by about 0.4%. This is enough to be statistically significant in a benchmark like this, but probably not world-shattering (some slowdown is unavoidable with a change like this).
>
> If we additionally enable the early returns we get an additional 1.5% slowdown (or 1.9% above baseline). Still not bad for a "micro-benchmark", but it does make one wonder whether it is really worth it. My feeling would be that it isn't...
Sorry, I've lost some of the context here (my brain's like a sieve sometimes) - am I right that the first set of results comes from this patch as-is, which implements a wrapper around DataExtractor that uses the offset comparison to determine validity (if the offset isn't modified, the attempted parse was invalid in some way)?
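For concreteness, here is roughly what I'm picturing - a minimal sketch, not the actual patch, with the CheckedExtractor name and the exact offset type being assumptions on my part:

  #include "llvm/Support/DataExtractor.h"
  #include <cstdint>

  // Hypothetical wrapper illustrating the offset-comparison idea: keep the
  // offset and an error flag next to the DataExtractor, and treat "the getter
  // did not advance the offset" as a failed read.
  class CheckedExtractor {
    llvm::DataExtractor Data;
    uint64_t Offset = 0;   // assumed offset type; older DataExtractors use uint32_t
    bool HadError = false;

  public:
    explicit CheckedExtractor(llvm::DataExtractor D) : Data(D) {}

    uint32_t getU32() {
      uint64_t OldOffset = Offset;
      uint32_t Value = Data.getU32(&Offset);
      // DataExtractor leaves the offset untouched when a read fails, so an
      // unchanged offset means the parse did not succeed.
      if (Offset == OldOffset)
        HadError = true;
      return Value;
    }

    uint64_t offset() const { return Offset; }
    bool hadError() const { return HadError; }
  };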
And the second set of results comes from adding a separate "if (already seen an error) return;" check to these same wrapped function calls?
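I.e., something like this extra check inside the same hypothetical wrapper (again just a sketch):

    // Early-return variant: once an error has been recorded, later reads bail
    // out immediately instead of going through DataExtractor again.
    uint32_t getU32WithEarlyReturn() {
      if (HadError)
        return 0; // a previous read already failed; skip this one entirely
      uint64_t OldOffset = Offset;
      uint32_t Value = Data.getU32(&Offset);
      if (Offset == OldOffset)
        HadError = true;
      return Value;
    }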
What if "if (already seen an error)" was added to the "isValidOffset" function (making all offsets invalid if there's already an error) - would that collapse the early return codepaths & perhaps make things simpler/faster?
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63713/new/
https://reviews.llvm.org/D63713