[libcxx-commits] [PATCH] D125329: Replace modulus operations in std::seed_seq::generate with conditional checks.

Laramie Leavitt via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Tue May 10 16:01:43 PDT 2022


laramiel added a comment.

OK, I ran the benchmarks before and after the change on both my x86 workstation and my Mac M1 laptop. On the M1 the new code is roughly on par with the old, ranging from no measurable change to about 18% faster depending on the case. On x86 the new code is dramatically faster, by roughly 2x to 6x.
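For readers who have not looked at the diff, here is a minimal sketch of the general technique named in the patch title. It is not the code from D125329. The helper names `indices_with_modulus` and `indices_with_wraparound` are made up for illustration, and only two of the indices that `seed_seq::generate` uses (`k % n` and `(k + p) % n`) are shown. The idea is that an index which only ever advances by one can be wrapped with a compare-and-reset instead of being recomputed with `%` on every iteration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: records k % n and (k + p) % n for k in [0, rounds),
// recomputing each index with the % operator on every iteration.
std::vector<std::size_t> indices_with_modulus(std::size_t n, std::size_t p,
                                              std::size_t rounds) {
    assert(n > 0 && p < n);
    std::vector<std::size_t> out;
    out.reserve(rounds * 2);
    for (std::size_t k = 0; k < rounds; ++k) {
        out.push_back(k % n);
        out.push_back((k + p) % n);
    }
    return out;
}

// Hypothetical helper: produces the same sequence, but keeps running copies
// of both indices and wraps them with a conditional check instead of %.
std::vector<std::size_t> indices_with_wraparound(std::size_t n, std::size_t p,
                                                 std::size_t rounds) {
    assert(n > 0 && p < n);
    std::vector<std::size_t> out;
    out.reserve(rounds * 2);
    std::size_t kmodn = 0;   // tracks k % n
    std::size_t kpmodn = p;  // tracks (k + p) % n; p < n, so already reduced
    for (std::size_t k = 0; k < rounds; ++k) {
        out.push_back(kmodn);
        out.push_back(kpmodn);
        // Each index advances by exactly one per iteration, so a single
        // compare-and-reset is enough to keep it inside [0, n).
        if (++kmodn == n) kmodn = 0;
        if (++kpmodn == n) kpmodn = 0;
    }
    return out;
}
```

Both helpers return the same sequence for any `n > 0` and `p < n`. The second one performs no division inside the loop, which is a plausible (but here unverified) explanation for why the gain is larger on the x86 machine than on the M1.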

<Mac LAPTOP>

CPU Caches:

  L1 Data 64 KiB (x10)
  L1 Instruction 128 KiB (x10)
  L2 Unified 4096 KiB (x5)

Load Average: 3.74, 4.25, 3.59

OLD:
----

Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------

BM_SeedSeq_Generate/1/1          16.9 ns         16.9 ns     39431288
BM_SeedSeq_Generate/8/1          50.1 ns         50.0 ns     13943668
BM_SeedSeq_Generate/16/1         88.4 ns         87.9 ns      7974663
BM_SeedSeq_Generate/1/8          56.1 ns         56.0 ns     12601941
BM_SeedSeq_Generate/8/8          60.7 ns         60.7 ns     11631385
BM_SeedSeq_Generate/16/8         97.0 ns         93.6 ns      7510568
BM_SeedSeq_Generate/1/64          564 ns          561 ns      1259672
BM_SeedSeq_Generate/8/64          549 ns          549 ns      1241113
BM_SeedSeq_Generate/16/64         541 ns          540 ns      1294139
BM_SeedSeq_Generate/1/256        2299 ns         2296 ns       304877
BM_SeedSeq_Generate/8/256        2298 ns         2297 ns       305722
BM_SeedSeq_Generate/16/256       2300 ns         2299 ns       305262

NEW:
----

Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------

BM_SeedSeq_Generate/1/1          17.0 ns         17.0 ns     38130515
BM_SeedSeq_Generate/8/1          49.5 ns         49.5 ns     14140414
BM_SeedSeq_Generate/16/1         86.6 ns         86.1 ns      8184933
BM_SeedSeq_Generate/1/8          47.3 ns         47.2 ns     14856161
BM_SeedSeq_Generate/8/8          50.7 ns         50.7 ns     13919268
BM_SeedSeq_Generate/16/8         79.8 ns         79.6 ns      8695436
BM_SeedSeq_Generate/1/64          520 ns          520 ns      1361550
BM_SeedSeq_Generate/8/64          519 ns          519 ns      1355355
BM_SeedSeq_Generate/16/64         525 ns          525 ns      1319261
BM_SeedSeq_Generate/1/256        2198 ns         2194 ns       317959
BM_SeedSeq_Generate/8/256        2196 ns         2194 ns       320623
BM_SeedSeq_Generate/16/256       2215 ns         2201 ns       317943

<x86>
Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz
Run on (12 X 4000 MHz CPU s)
CPU Caches:

  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 15360 KiB (x1)

Load Average: 0.34, 0.38, 0.36

OLD:
----

Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------

BM_SeedSeq_Generate/1/1          51.9 ns         51.9 ns     12867935
BM_SeedSeq_Generate/8/1           179 ns          179 ns      3915093
BM_SeedSeq_Generate/16/1          325 ns          325 ns      2158696
BM_SeedSeq_Generate/1/8           361 ns          361 ns      1960768
BM_SeedSeq_Generate/8/8           342 ns          342 ns      1982211
BM_SeedSeq_Generate/16/8          485 ns          484 ns      1449114
BM_SeedSeq_Generate/1/64         3010 ns         3008 ns       234187
BM_SeedSeq_Generate/8/64         2966 ns         2964 ns       237480
BM_SeedSeq_Generate/16/64        2910 ns         2909 ns       240558
BM_SeedSeq_Generate/1/256       12132 ns        12127 ns        58263
BM_SeedSeq_Generate/8/256       12051 ns        12046 ns        58368
BM_SeedSeq_Generate/16/256      12169 ns        12163 ns        58210

NEW:
----

Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------

BM_SeedSeq_Generate/1/1          26.1 ns         26.1 ns     25494346
BM_SeedSeq_Generate/8/1          45.5 ns         45.5 ns     15384561
BM_SeedSeq_Generate/16/1         78.0 ns         77.9 ns      9019345
BM_SeedSeq_Generate/1/8          65.0 ns         65.0 ns     10775171
BM_SeedSeq_Generate/8/8          69.4 ns         69.4 ns     10152010
BM_SeedSeq_Generate/16/8          100 ns        100.0 ns      7041379
BM_SeedSeq_Generate/1/64          488 ns          488 ns      1426010
BM_SeedSeq_Generate/8/64          488 ns          488 ns      1434429
BM_SeedSeq_Generate/16/64         489 ns          488 ns      1322369
BM_SeedSeq_Generate/1/256        1938 ns         1938 ns       362517
BM_SeedSeq_Generate/8/256        1934 ns         1934 ns       361980
BM_SeedSeq_Generate/16/256       1936 ns         1936 ns       361482
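To make the comparison easier to read off the tables, here is a small sketch of the speedup arithmetic, taken as old wall time divided by new wall time. The helper name `speedup` is made up for illustration. The values in the comments are copied from the Time column of the tables above.

```cpp
#include <cassert>

// Hypothetical helper: speedup factor as old time divided by new time.
// Both arguments must use the same unit (nanoseconds in the tables above).
double speedup(double old_ns, double new_ns) {
    assert(old_ns > 0.0 && new_ns > 0.0);
    return old_ns / new_ns;
}

// x86, BM_SeedSeq_Generate/1/1:    speedup(51.9, 26.1)     is just under 2
// x86, BM_SeedSeq_Generate/1/256:  speedup(12132.0, 1938.0) is about 6.26
// M1,  BM_SeedSeq_Generate/16/8:   speedup(97.0, 79.8)      is about 1.22
// M1,  BM_SeedSeq_Generate/1/256:  speedup(2299.0, 2198.0)  is about 1.05
```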


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125329/new/

https://reviews.llvm.org/D125329


