[llvm] [RFC] [compiler-rt] make MSan up to 20x faster on AMD CPUs (PR #171993)
via llvm-commits
llvm-commits at lists.llvm.org
Fri Dec 12 03:24:22 PST 2025
llvmbot wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-compiler-rt-sanitizer
Author: Azat Khuzhin (azat)
<details>
<summary>Changes</summary>
I noticed that on AMD CPUs (so far I've tested on Zen 3 and Zen 4c - AMD EPYC 9R14) a simple program under MSan is up to 20x slower:
#include <stdio.h>
#include <time.h>
#include <stdint.h>

uint64_t factorial(int n) {
    if (n <= 1) return 1;
    return n * factorial(n - 1);
}

int main() {
    const int iterations = 100000000;
    clock_t start = clock();
    for (int i = 0; i < iterations; i++) {
        volatile uint64_t result = factorial(20);
    }
    double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
    printf("Direct loop: %.3f seconds\n", elapsed);
    return 0;
}
The trigger here is the `volatile`, but the underlying problem appears to be cache conflicts: `result` and its address in the shadow area conflict and evict each other, so the program suffers a very high rate of cache misses:
Performance counter stats for './factorial-test-original':
212,850,471 L1-dcache-loads
200,634,333 L1-dcache-load-misses # 94.26% of all L1-dcache accesses
<not supported> L1-dcache-stores
1.232666099 seconds time elapsed
1.228437000 seconds user
0.000994000 seconds sys
To avoid these conflicts, we can add the size of one cache line (0x40, i.e. 64 bytes) to the shadow addresses. Here are the results, a 20x improvement:
$ /usr/bin/clang++ -fsanitize=memory -O3 factorial-test.c -o factorial-test-original
$ ./factorial-test-original
Direct loop: 1.223 seconds
$ clang++ -fsanitize=memory -O3 factorial-test.c -o factorial-test-patched
$ ./factorial-test-patched
Direct loop: 0.060 seconds
I've also tested performance on Intel CPUs (Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz), and there it looks the same before and after the patch.
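To illustrate what the `XorMask` change does to addresses, here is a minimal sketch, assuming the x86_64 Linux shadow address is simply the application address XORed with `XorMask` (the diff marks `AndMask` and `ShadowBase` as unused). The helper names `shadow_addr` and `low_bits_delta` are hypothetical and are not part of MSan. The sketch only shows which address bits change; it does not model the cache itself.

```c
#include <stdint.h>

/* XorMask values taken from the diff. */
#define XOR_MASK_ORIGINAL 0x500000000000ULL
#define XOR_MASK_PATCHED  0x500000000040ULL

/* Assumed mapping: shadow = app XOR XorMask (AndMask and ShadowBase unused). */
static uint64_t shadow_addr(uint64_t app_addr, uint64_t xor_mask) {
    return app_addr ^ xor_mask;
}

/* Difference between the application address and its shadow address
 * in the low 12 bits (the offset within a 4 KiB page). */
static uint64_t low_bits_delta(uint64_t app_addr, uint64_t xor_mask) {
    return (app_addr ^ shadow_addr(app_addr, xor_mask)) & 0xFFFULL;
}
```

For a made-up application address such as `0x7ffc12345680`, the original mask yields the shadow address `0x2ffc12345680`, which has the same low 12 bits as the application address (`low_bits_delta` is 0). The patched mask yields `0x2ffc123456c0`, so the two addresses differ in bit 6 (`low_bits_delta` is `0x40`). Strictly speaking, XOR flips that bit rather than adding 64 bytes, so the shadow lands one cache line above or below depending on the application address.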
Curious about what you think about this!
---
Full diff: https://github.com/llvm/llvm-project/pull/171993.diff
1 Files Affected:
- (modified) llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp (+1-1)
``````````diff
diff --git a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
index 32ee16c89b4fe..1e8253276555e 100644
--- a/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+++ b/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
@@ -442,7 +442,7 @@ static const MemoryMapParams Linux_I386_MemoryMapParams = {
// x86_64 Linux
static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
0, // AndMask (not used)
- 0x500000000000, // XorMask
+ 0x500000000040, // XorMask
0, // ShadowBase (not used)
0x100000000000, // OriginBase
};
``````````
</details>
https://github.com/llvm/llvm-project/pull/171993