[clang] [compiler-rt] [llvm] [RFC] [msan] make MSan up to 20x faster on AMD CPUs (PR #171993)
Thurston Dang via llvm-commits
llvm-commits at lists.llvm.org
Sun Dec 14 10:40:51 PST 2025
================
@@ -443,8 +443,8 @@ static const MemoryMapParams Linux_I386_MemoryMapParams = {
static const MemoryMapParams Linux_X86_64_MemoryMapParams = {
0, // AndMask (not used)
0x500000000000, // XorMask
- 0, // ShadowBase (not used)
- 0x100000000000, // OriginBase
+ 0x200000, // ShadowBase (== kShadowOffset)
----------------
thurstond wrote:
@Camsyn Thanks, good point!
@azat The machine code is longer if a non-zero displacement is needed:
```
48 c7 80 00 00 20 00 00 00 00 00    mov QWORD PTR [rax+0x200000],0x0
vs.
48 c7 00 00 00 00 00                mov QWORD PTR [rax],0x0

c6 80 00 00 20 00 00                mov BYTE PTR [rax+0x200000],0x0
vs.
c6 00 00                            mov BYTE PTR [rax],0x0
```
Even if the CPU executes both forms at the same speed and on the same execution ports, the longer encoding (4 extra bytes per shadow access for the 32-bit displacement) still increases code size and, with it, icache pressure.
Although it's probably not a huge impact, it would still be hard to justify making codegen worse for all x86 targets when the upside is only for a subset of Zen processors.
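For readers following along, here is a minimal sketch of where the extra displacement comes from. It assumes the usual MSan mapping, Shadow = ShadowBase + ((App & ~AndMask) ^ XorMask), and uses the values from the hunk above. The struct `MapParams`, the helper `shadowFor`, and the example address are illustrative names invented for this sketch, not the definitions in the LLVM sources.

```cpp
#include <cstdint>

// Illustrative stand-in for the pass's mapping table. Field names follow the
// diff above, but this struct is a sketch, not the LLVM definition.
struct MapParams {
  std::uint64_t AndMask;
  std::uint64_t XorMask;
  std::uint64_t ShadowBase;
};

// Values taken from the hunk: current table (ShadowBase unused) and the
// proposed one (ShadowBase == 0x200000).
constexpr MapParams kCurrent{0, 0x500000000000ULL, 0};
constexpr MapParams kProposed{0, 0x500000000000ULL, 0x200000ULL};

// Shadow = ShadowBase + ((App & ~AndMask) ^ XorMask).
constexpr std::uint64_t shadowFor(std::uint64_t App, const MapParams &P) {
  return P.ShadowBase + ((App & ~P.AndMask) ^ P.XorMask);
}

// Arbitrary example application address (an assumption for illustration).
constexpr std::uint64_t kExampleApp = 0x7fff12345678ULL;

// With ShadowBase == 0 the shadow address is just the XOR result, so a shadow
// store can use a plain [reg] operand.
static_assert(shadowFor(kExampleApp, kCurrent) == 0x2fff12345678ULL,
              "XOR-only mapping");

// With ShadowBase == 0x200000 every shadow address moves by that constant,
// which the backend folds into the operand as a 32-bit displacement.
static_assert(shadowFor(kExampleApp, kProposed) -
                      shadowFor(kExampleApp, kCurrent) ==
                  0x200000ULL,
              "constant 2 MiB offset");
```

The constant is too large for an 8-bit displacement, which is why each `[rax+0x200000]` form above is 4 bytes longer than its `[rax]` counterpart.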
I do want MSan to work well on Zen too, so how about a compile-time macro (set when compiling MSan itself, not the target app) that enables the 2 MiB (0x200000) offset?
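A rough sketch of what such an opt-in could look like, under stated assumptions: the macro name `SKETCH_MSAN_SHADOW_OFFSET_2MB` and the identifiers `kSketchShadowOffset`, `kSketchXorMask`, and `sketchMemToShadow` are hypothetical and do not exist in compiler-rt. How the runtime and the instrumentation pass would be kept in agreement is not settled in this thread and is not shown here.

```cpp
#include <cstdint>

// Hypothetical build-time switch, for illustration only. Left undefined by
// default so that existing x86-64 codegen is unchanged.
#if defined(SKETCH_MSAN_SHADOW_OFFSET_2MB)
constexpr std::uint64_t kSketchShadowOffset = 0x200000ULL;  // 2 MiB
#else
constexpr std::uint64_t kSketchShadowOffset = 0;
#endif

// XorMask value from the Linux x86-64 table in the hunk above.
constexpr std::uint64_t kSketchXorMask = 0x500000000000ULL;

// Application address -> shadow address under the selected offset.
constexpr std::uint64_t sketchMemToShadow(std::uint64_t Mem) {
  return (Mem ^ kSketchXorMask) + kSketchShadowOffset;
}

// Holds in either configuration: the offset is the only difference from the
// XOR-only mapping.
static_assert(sketchMemToShadow(0) == kSketchXorMask + kSketchShadowOffset,
              "offset is added after the XOR");
```

Building with `-DSKETCH_MSAN_SHADOW_OFFSET_2MB` would select the offset mapping; without it the mapping stays XOR-only, so only builds that opt in pay the longer encodings.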
https://github.com/llvm/llvm-project/pull/171993