[PATCH] D86694: [scudo] Allow -fsanitize=scudo on Linux and Windows (WIP, don't land as is)

Alexandre Ganea via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Sep 13 18:20:05 PDT 2020


aganea added a comment.

Thanks for working on this @russell.gallop!

I've reproduced your tests; please see below. The only difference is that I've used a ThinLTO build for stage2:
-DCMAKE_CXX_FLAGS="/GS- -Xclang -O3 -fstrict-aliasing -march=skylake-avx512 -flto=thin -fwhole-program-vtables"
Running with `/opt:lldltojobs=all` and no `/lldltocache`.
Results on a 36-core (dual-socket) Xeon Gold 6140.

  (WinHeap vs. Scudo+options)
  
  D:\llvm-project>hyperfine -m 3 -w 1 "cd d:\llvm-project\buildninjaRelWinHeap3 && d:\llvm-project\buildninjaRelWinHeap2\bin\lld-link.exe @CMakeFiles\clang.rsp" "cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp"
  Benchmark #1: cd d:\llvm-project\buildninjaRelWinHeap3 && d:\llvm-project\buildninjaRelWinHeap2\bin\lld-link.exe @CMakeFiles\clang.rsp
    Time (mean ± σ):     664.086 s ± 18.740 s    [User: 0.0 ms, System: 0.0 ms]
    Range (min … max):   647.070 s … 684.172 s    3 runs
  
  Benchmark #2: cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp
    Time (mean ± σ):     145.619 s ±  0.140 s    [User: 0.0 ms, System: 8.1 ms]
    Range (min … max):   145.522 s … 145.779 s    3 runs
  
  Summary
    'cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp' ran
      4.56 ± 0.13 times faster than 'cd d:\llvm-project\buildninjaRelWinHeap3 && d:\llvm-project\buildninjaRelWinHeap2\bin\lld-link.exe @CMakeFiles\clang.rsp'

  (Scudo+options vs. Rpmalloc)
  
  D:\llvm-project>hyperfine -m 3 -w 1 "cd d:\llvm-project\buildninjaRelRpmalloc3 && d:\llvm-project\buildninjaRelRpMalloc2\bin\lld-link.exe @CMakeFiles\clang.rsp" "cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp"
  Benchmark #1: cd d:\llvm-project\buildninjaRelRpmalloc3 && d:\llvm-project\buildninjaRelRpMalloc2\bin\lld-link.exe @CMakeFiles\clang.rsp
    Time (mean ± σ):     95.423 s ±  0.830 s    [User: 0.0 ms, System: 9.0 ms]
    Range (min … max):   94.886 s … 96.380 s    3 runs
  
  Benchmark #2: cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp
    Time (mean ± σ):     145.266 s ±  0.387 s    [User: 4.9 ms, System: 7.6 ms]
    Range (min … max):   144.894 s … 145.666 s    3 runs
  
  Summary
    'cd d:\llvm-project\buildninjaRelRpmalloc3 && d:\llvm-project\buildninjaRelRpMalloc2\bin\lld-link.exe @CMakeFiles\clang.rsp' ran
      1.52 ± 0.01 times faster than 'cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp'

  (Scudo vs. Rpmalloc)
  
  D:\llvm-project>hyperfine -m 3 -w 1 "cd d:\llvm-project\buildninjaRelRpmalloc3 && d:\llvm-project\buildninjaRelRpMalloc2\bin\lld-link.exe @CMakeFiles\clang.rsp" "cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp"
  Benchmark #1: cd d:\llvm-project\buildninjaRelRpmalloc3 && d:\llvm-project\buildninjaRelRpMalloc2\bin\lld-link.exe @CMakeFiles\clang.rsp
    Time (mean ± σ):     95.435 s ±  0.059 s    [User: 0.0 ms, System: 8.0 ms]
    Range (min … max):   95.385 s … 95.499 s    3 runs
  
  Benchmark #2: cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp
    Time (mean ± σ):     270.967 s ±  1.366 s    [User: 4.8 ms, System: 0.0 ms]
    Range (min … max):   269.397 s … 271.887 s    3 runs
  
  Summary
    'cd d:\llvm-project\buildninjaRelRpmalloc3 && d:\llvm-project\buildninjaRelRpMalloc2\bin\lld-link.exe @CMakeFiles\clang.rsp' ran
      2.84 ± 0.01 times faster than 'cd d:\llvm-project\buildninjaRelScudo3 && d:\llvm-project\buildninjaRelScudo2\bin\lld-link.exe @CMakeFiles\clang.rsp'
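
As a sanity check on the "N ± x times faster" lines above, here is a minimal sketch that recomputes each ratio from the reported means and standard deviations. It assumes first-order error propagation for a ratio of two independent measurements; that this is how hyperfine derives its summary is an assumption on my part, and the helper name `ratio_with_uncertainty` is purely illustrative.

```python
import math

def ratio_with_uncertainty(slow_mean, slow_sd, fast_mean, fast_sd):
    """Return (ratio, sd) for slow/fast using first-order error propagation."""
    ratio = slow_mean / fast_mean
    relative_sd = math.sqrt((slow_sd / slow_mean) ** 2 + (fast_sd / fast_mean) ** 2)
    return ratio, ratio * relative_sd

# (label, slow mean, slow sd, fast mean, fast sd), all in seconds,
# copied from the three hyperfine runs above.
runs = [
    ("WinHeap vs. Scudo+options", 664.086, 18.740, 145.619, 0.140),
    ("Scudo+options vs. Rpmalloc", 145.266, 0.387, 95.423, 0.830),
    ("Scudo vs. Rpmalloc", 270.967, 1.366, 95.435, 0.059),
]

results = {}
for label, slow_mean, slow_sd, fast_mean, fast_sd in runs:
    results[label] = ratio_with_uncertainty(slow_mean, slow_sd, fast_mean, fast_sd)
    ratio, sd = results[label]
    print(f"{label}: {ratio:.2f} ± {sd:.2f}")
```

With only 3 runs per configuration the standard deviations are rough, so the uncertainty on each ratio should be read as indicative only.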

Summary:

| Allocator     | Time (mean ± σ)      | Speedup vs. WinHeap |
| WinHeap       | 664.086 s ± 18.740 s | 1.00                |
| Scudo         | 270.967 s ±  1.366 s | 2.45                |
| Scudo+options | 145.619 s ±  0.140 s | 4.56                |
| Rpmalloc      |  95.423 s ±  0.830 s | 6.96                |
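
For reference, the factor column is the WinHeap mean divided by each allocator's mean. A minimal sketch that recomputes it from the mean times in the table (the function name `speedup_vs_baseline` is illustrative, not part of any tool):

```python
# Mean wall-clock link times in seconds, copied from the summary table.
mean_seconds = {
    "WinHeap": 664.086,
    "Scudo": 270.967,
    "Scudo+options": 145.619,
    "Rpmalloc": 95.423,
}

def speedup_vs_baseline(times, baseline="WinHeap"):
    """Return baseline_time / time for every entry, so larger means faster."""
    base = times[baseline]
    return {name: base / seconds for name, seconds in times.items()}

factors = speedup_vs_baseline(mean_seconds)
for name, factor in factors.items():
    print(f"{name:<14} {mean_seconds[name]:>8.3f} s  {factor:.2f}x")
```

Passing a different `baseline` (for example `"Rpmalloc"`) gives the slowdown of every other allocator relative to it.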

CPU usage:
Rpmalloc - 3,944 cumulated seconds (all threads)
F12940831: image.png <https://reviews.llvm.org/F12940831>
Scudo+options - 6,337 cumulated seconds (all threads)
F12940833: image.png <https://reviews.llvm.org/F12940833>

Time spent in the allocator itself (note the different vertical scale in the graphs):
(a hardware CRC or AES implementation would certainly help Scudo)
Rpmalloc - 191 cumulated seconds
F12940851: image.png <https://reviews.llvm.org/F12940851>
Scudo+options - 1,171 cumulated seconds
F12940855: image.png <https://reviews.llvm.org/F12940855>

Memory usage:
Rpmalloc - Peaks at 11 GB commit (19 GB mapped)
F12940870: image.png <https://reviews.llvm.org/F12940870>
Scudo+options - Peaks at 5 GB commit (although 4.4 TB of mapped pages!!!)
F12940882: image.png <https://reviews.llvm.org/F12940882>


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86694/new/

https://reviews.llvm.org/D86694


