[PATCH] D47289: [scudo] Improve the scalability of the shared TSD model
Kostya Kortchinsky via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jun 7 14:41:43 PDT 2018
cryptoad added a comment.
Here are more detailed numbers for t-test1.
The machine has 72 cores. We are using the shared TSD version with 32 caches (to exercise some contention).
Each pair of numbers is the total time spent in allocation functions only (averaged over 3 consecutive runs and rounded), first with 40, then with 80 concurrent threads:
- current upstream: 960s, 3315s
- with precedence, max 4 caches scanned, lock current: 810s, 3200s (current CL proposed)
- with precedence, max 4 caches scanned, lock random: 815s, 3125s
- with precedence, all caches scanned, lock current: 880s, 3940s
- with precedence, all caches scanned, lock random: 890s, 3755s
- no precedence, max 4 caches scanned, lock current: 900s, 3365s
- no precedence, max 4 caches scanned, lock random: 840s, 3300s
- no precedence, all caches scanned, lock current: 1025s, 3600s
- no precedence, all caches scanned, lock random: 890s, 3785s
Locking a random cache seems to be beneficial under heavier contention (80 threads), but not necessarily under lighter contention (40 threads).
Since I am more interested in striking a middle ground than in tuning for heavily contended applications, it looks like the precedence matters, as does limiting the scan to 4 caches rather than scanning all of them.
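For readers who want a concrete picture of the "with precedence, max 4 caches scanned, lock current" variant, here is a minimal, self-contained sketch. It is not the code from the CL. The `Cache` and `Selection` types, the `selectCache` function and its parameters are hypothetical. It also assumes that a cache's precedence is a tick value recorded by the first failed lock attempt (0 meaning none recorded), and that the scanned cache with the lowest non-zero precedence is the preferred one to wait on.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one shared cache; not Scudo's actual type.
struct Cache {
  std::atomic<bool> Locked{false};
  // Assumption of this sketch: 0 means no failed attempt recorded,
  // otherwise the tick of the first failed lock attempt.
  std::atomic<std::uint64_t> Precedence{0};

  // Try to take the cache without blocking. On failure, record Now as
  // the precedence if none has been recorded yet.
  bool tryLock(std::uint64_t Now) {
    if (!Locked.exchange(true, std::memory_order_acquire)) {
      Precedence.store(0, std::memory_order_relaxed);
      return true;
    }
    std::uint64_t Expected = 0;
    Precedence.compare_exchange_strong(Expected, Now,
                                       std::memory_order_relaxed);
    return false;
  }

  void unlock() { Locked.store(false, std::memory_order_release); }
};

struct Selection {
  std::size_t Index; // cache that was picked
  bool Held;         // true if tryLock already succeeded on it
};

// Scan at most MaxScan caches, starting at Start and stepping by Stride.
// Return the first cache whose tryLock succeeds. If none does, return
// the scanned cache with the lowest non-zero precedence, or Current if
// no candidate was found. Caches must not be empty.
Selection selectCache(std::vector<Cache> &Caches, std::size_t Current,
                      std::size_t Start, std::size_t Stride,
                      std::size_t MaxScan, std::uint64_t Now) {
  const std::size_t N = Caches.size();
  const std::size_t Limit = MaxScan < N ? MaxScan : N;
  std::size_t Best = Current;
  std::uint64_t Lowest = UINT64_MAX;
  std::size_t Index = Start % N;
  for (std::size_t I = 0; I < Limit; ++I, Index = (Index + Stride) % N) {
    if (Caches[Index].tryLock(Now))
      return {Index, true};
    const std::uint64_t P =
        Caches[Index].Precedence.load(std::memory_order_relaxed);
    // A precedence of 0 means the cache was just taken by someone else.
    if (P != 0 && P < Lowest) {
      Lowest = P;
      Best = Index;
    }
  }
  return {Best, false};
}
```

When `Held` is false, the caller would then block on the returned cache. Passing `MaxScan` equal to the number of caches corresponds to the "all caches scanned" rows. The "lock random" rows would block on a randomly chosen cache when no candidate is found, instead of falling back to `Current`.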
Repository:
rCRT Compiler Runtime
https://reviews.llvm.org/D47289