[llvm-commits] PATCH: Switch StringMap to the new hashing infrastructure

Chandler Carruth chandlerc at google.com
Mon Mar 5 02:26:59 PST 2012


Ahem, with the patch this time. ;]

On Mon, Mar 5, 2012 at 2:17 AM, Chandler Carruth <chandlerc at google.com> wrote:

> Hello folks,
>
> This is the oh-so-controversial cut-over. ;] I'm continuing to convert
> other places which aren't quite so performance sensitive or (in most cases)
> use significantly slower functions, but I wanted to get confirmation on
> this one.
>
> I've performed as careful an analysis of the performance impact of this
> change as I can. I've run the lex-only mode with before and after clang
> binaries over very large source files. The runs were under the Linux 'perf'
> tool which does hardware perf-event based instrumenting of the execution,
> and I had it perform 40 runs of each command. I've run these with a forced
> maximum kernel scheduling priority and affinity pinned to a single CPU.
> This reduces the context switches to one (the CC1 invocation), results in
> zero CPU migrations, and ensures that all measurements occur on the same
> core of the same chip with a warm cache. The timings are (consequently)
> quite stable.
>
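For anyone who wants to repeat this kind of measurement, here is a rough sketch of how such a command line could be assembled. The mail only says that 'perf' was used with 40 runs, maximum scheduling priority, and single-CPU affinity; the use of `taskset` and `chrt`, the CPU number, the priority value, and the helper name `perf_stat_command` are all assumptions made for illustration, not the invocation actually used.

```python
import shlex

def perf_stat_command(clang, clang_args, runs=40, cpu=0, rt_priority=99):
    """Build (but do not run) a pinned, high-priority 'perf stat' command.

    Hypothetical helper: taskset/chrt are assumed tools, not confirmed
    by the original message.
    """
    return (
        ["taskset", "-c", str(cpu)]          # pin to one CPU (assumed tool)
        + ["chrt", "-f", str(rt_priority)]   # SCHED_FIFO priority (assumed tool)
        + ["perf", "stat", "-r", str(runs)]  # repeat the run and average
        + [clang]
        + list(clang_args)
    )

cmd = perf_stat_command(
    "./bin/old_clang",
    ["-fsyntax-only", "-Xclang", "-Eonly", "../tools/clang/INPUTS/gcc.c", "-w"],
)
print(shlex.join(cmd))
```

Setting a real-time priority with `chrt -f` normally requires elevated privileges, so a command like this would likely need to be run as root.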
> Here are the numbers for single-source GCC:
>  Performance counter stats for './bin/old_clang -fsyntax-only -Xclang -Eonly ../tools/clang/INPUTS/gcc.c -w' (40 runs):
>
>         653.285653 task-clock                #    0.998 CPUs utilized            ( +-  0.19% )
>                  1 context-switches          #    0.000 M/sec                    ( +-  0.00% )
>                  0 CPU-migrations            #    0.000 M/sec                    ( +- 42.37% )
>             21,909 page-faults               #    0.034 M/sec                    ( +-  0.00% )
>      1,656,298,378 cycles                    #    2.535 GHz                      ( +-  0.18% )
>        584,610,147 stalled-cycles-frontend   #   35.30% frontend cycles idle     ( +-  0.52% )
>        368,002,172 stalled-cycles-backend    #   22.22% backend  cycles idle     ( +-  0.77% )
>      2,240,704,537 instructions              #    1.35  insns per cycle
>                                              #    0.26  stalled cycles per insn  ( +-  0.02% )
>        494,832,319 branches                  #  757.452 M/sec                    ( +-  0.02% )
>         17,343,711 branch-misses             #    3.50% of all branches          ( +-  0.05% )
>
>        0.654568555 seconds time elapsed                                          ( +-  0.19% )
>
>
>  Performance counter stats for './bin/new_clang -fsyntax-only -Xclang -Eonly ../tools/clang/INPUTS/gcc.c -w' (40 runs):
>
>         652.817626 task-clock                #    0.998 CPUs utilized            ( +-  0.16% )
>                  1 context-switches          #    0.000 M/sec                    ( +-  0.00% )
>                  0 CPU-migrations            #    0.000 M/sec                    ( +- 69.80% )
>             21,913 page-faults               #    0.034 M/sec                    ( +-  0.00% )
>      1,655,504,961 cycles                    #    2.536 GHz                      ( +-  0.16% )
>        579,918,671 stalled-cycles-frontend   #   35.03% frontend cycles idle     ( +-  0.47% )
>        356,098,374 stalled-cycles-backend    #   21.51% backend  cycles idle     ( +-  0.72% )
>      2,240,570,707 instructions              #    1.35  insns per cycle
>                                              #    0.26  stalled cycles per insn  ( +-  0.02% )
>        486,263,162 branches                  #  744.868 M/sec                    ( +-  0.02% )
>         16,554,052 branch-misses             #    3.40% of all branches          ( +-  0.05% )
>
>        0.654093465 seconds time elapsed                                          ( +-  0.16% )
>
>
> Here are the numbers for a single-source file with every source file in
> Lex, Parse, and Sema from Clang included into it:
>  Performance counter stats for './bin/old_clang -x c++ -fsyntax-only -Xclang -Eonly -I../include -Iinclude -I../tools/clang/include -Itools/clang/include ../tools/clang/INPUTS/all-clang.cpp' (40 runs):
>
>         249.007880 task-clock                #    0.998 CPUs utilized            ( +-  0.27% )
>                  1 context-switches          #    0.000 M/sec                    ( +-  0.00% )
>                  0 CPU-migrations            #    0.000 M/sec                    ( +- 48.04% )
>              9,013 page-faults               #    0.036 M/sec                    ( +-  0.00% )
>        630,166,134 cycles                    #    2.531 GHz                      ( +-  0.26% )
>        243,962,215 stalled-cycles-frontend   #   38.71% frontend cycles idle     ( +-  0.70% )
>        152,285,530 stalled-cycles-backend    #   24.17% backend  cycles idle     ( +-  0.90% )
>        776,325,373 instructions              #    1.23  insns per cycle
>                                              #    0.31  stalled cycles per insn  ( +-  0.03% )
>        180,715,948 branches                  #  725.744 M/sec                    ( +-  0.02% )
>          6,543,840 branch-misses             #    3.62% of all branches          ( +-  0.03% )
>
>        0.249614202 seconds time elapsed                                          ( +-  0.27% )
>
>
>  Performance counter stats for './bin/new_clang -x c++ -fsyntax-only -Xclang -Eonly -I../include -Iinclude -I../tools/clang/include -Itools/clang/include ../tools/clang/INPUTS/all-clang.cpp' (40 runs):
>
>         247.344347 task-clock                #    0.998 CPUs utilized            ( +-  0.24% )
>                  1 context-switches          #    0.000 M/sec                    ( +-  0.00% )
>                  0 CPU-migrations            #    0.000 M/sec                    ( +-  0.00% )
>              9,018 page-faults               #    0.036 M/sec                    ( +-  0.00% )
>        626,293,160 cycles                    #    2.532 GHz                      ( +-  0.23% )
>        240,474,650 stalled-cycles-frontend   #   38.40% frontend cycles idle     ( +-  0.60% )
>        147,531,931 stalled-cycles-backend    #   23.56% backend  cycles idle     ( +-  0.84% )
>        767,828,567 instructions              #    1.23  insns per cycle
>                                              #    0.31  stalled cycles per insn  ( +-  0.03% )
>        173,028,132 branches                  #  699.544 M/sec                    ( +-  0.02% )
>          6,031,206 branch-misses             #    3.49% of all branches          ( +-  0.04% )
>
>        0.247948747 seconds time elapsed                                          ( +-  0.24% )
>
>
> The measured changes in performance are all down in the noise floor,
> despite everything I did to lower that. From what I can tell, this change
> has no observable performance impact, and reduces the number of hashing
> functions in use. Good to commit?
>
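As a quick sanity check on the "noise floor" reading, the task-clock rows quoted above can be compared directly. This is only a sketch: treating perf's "+-" percentage as a relative standard error and combining the old and new figures in quadrature are assumptions of this example, and the name `compare` is hypothetical. The numbers themselves are copied from the quoted output.

```python
import math

# (old_ms, old_noise_pct, new_ms, new_noise_pct) from the task-clock rows
RUNS = {
    "gcc.c": (653.285653, 0.19, 652.817626, 0.16),
    "all-clang.cpp": (249.007880, 0.27, 247.344347, 0.24),
}

def compare(old, old_pct, new, new_pct):
    """Return (delta %, combined noise %, |delta| / noise)."""
    delta_pct = (new - old) / old * 100.0
    # Assumption: the two "+-" figures are independent and add in quadrature.
    noise_pct = math.hypot(old_pct, new_pct)
    return delta_pct, noise_pct, abs(delta_pct) / noise_pct

for name, row in RUNS.items():
    delta, noise, ratio = compare(*row)
    print(f"{name}: delta {delta:+.3f}%  noise {noise:.3f}%  ratio {ratio:.2f}")
```

Under those assumptions both deltas are negative (the new binary is marginally faster) and both stay under two combined deviations, so neither input shows a measurable slowdown.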
-------------- next part --------------
A non-text attachment was scrubbed...
Name: switch-stringmap-hashing.patch
Type: application/octet-stream
Size: 1131 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20120305/d9f64b43/attachment.obj>

