<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/133338>133338</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Sanitizer test failures when the sanitizer runtime was built using an older glibc than the host machine
</td>
</tr>
<tr>
<th>Labels</th>
<td>
compiler-rt:msan
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
zeroomega
</td>
</tr>
</table>
<pre>
We recently ran into some MSan, ASan and TSan test failures after we upgraded our clang toolchain infra builders from Debian 11 to Debian 12. We looked into them and discovered that they were caused by the symbol resolution logic for versioned symbols, and the issues need further discussion.
Some background on how we build our toolchain: for both clang itself and the runtimes built by that clang, we always use a pre-packaged sysroot from an old release of Ubuntu. The glibc shipped in this sysroot is 2.19. The reason is compatibility: we would like to make sure the toolchain binaries we build can run on most Linux machines in the wild.
We first discovered test failures in `MemorySanitizer-AARCH64 :: Linux/sunrpc.cpp`, `MemorySanitizer-AARCH64 :: Linux/sunrpc_bytes.cpp` and `MemorySanitizer-AARCH64 :: Linux/sunrpc_string.cpp`. They all failed with the following error when we built and tested the toolchain on Debian 12:
```
fatal error: 'rpc/xdr.h' file not found
17 | #include <rpc/xdr.h>
| ^~~~~~~~~~~
1 error generated.
```
Looking further into the RUN command in the LIT test revealed that these tests were built using the host sysroot instead of the `CMAKE_SYSROOT` defined in the build system. As newer glibc (from 2.33) dropped sunrpc, its header files no longer exist. We attempted to address the issue by propagating the sysroot from the build environment to the LIT tests using the patch https://github.com/zeroomega/llvm-project/commit/fdcbf3f9e83d8ab3cb92fafadc15db8c5e6031cc . However, the tests were still failing, now with a segfault caused by a call through a null pointer:
```
==801220==Hint: address points to the zero page.
#0 0x000000000000 (<unknown module>)
#1 0x55ee067a4b74 (/mnt/nvme_crypt/SRC/llvm-project/build/sunrpc.msan+0x9cb74) (BuildId: bdc7650508132b11eff175c4d8eaca7d9b96d109)
#2 0x55ee0679ce06 (/mnt/nvme_crypt/SRC/llvm-project/build/sunrpc.msan+0x94e06) (BuildId: bdc7650508132b11eff175c4d8eaca7d9b96d109)
#3 0x55ee0679d190 (/mnt/nvme_crypt/SRC/llvm-project/build/sunrpc.msan+0x95190) (BuildId: bdc7650508132b11eff175c4d8eaca7d9b96d109)
#4 0x7f0c820a0e1f (/lib/x86_64-linux-gnu/libc.so.6+0x3fe1f) (BuildId: ea119b374e0f8f858c82ad03a9542414f9ea1c8c)
```
This was reproduced locally in GDB:
```
B+>0x555555625e16 <___interceptor_xdrmem_create()+214> mov %r15,%rdi
│ 0x555555625e19 <___interceptor_xdrmem_create()+217> mov %rbx,%rsi
│ 0x555555625e1c <___interceptor_xdrmem_create()+220> mov %ebp,%edx
│ 0x555555625e1e <___interceptor_xdrmem_create()+222> mov %r14d,%ecx
│ 0x555555625e21 <___interceptor_xdrmem_create()+225> call *0x4994999(%rip) # 0x555559fba7c0 <_ZN14__interception18real_xdrmem_createE>
│ 0x555555625e27 <___interceptor_xdrmem_create()+231> mov %r15,%rax
│ 0x555555625e2a <___interceptor_xdrmem_create()+234> and %r13,%rax
(gdb) x/x 0x555559fba7c0
0x555559fba7c0 <_ZN14__interception18real_xdrmem_createE>: 0x00000000
(gdb)
```
It turned out that the function pointer to the original (real) function (`xdrmem_create`) that the MSan test was exercising was a nullptr. We then inspected the libc.so from our pre-packaged sysroot, compared it to the one from the host (Debian 12), and discovered that in glibc 2.19 (and 2.31) the `xdrmem_create` symbol is `xdrmem_create@@GLIBC_2.2.5`, while in glibc 2.36 it is `xdrmem_create@GLIBC_2.2.5`. In other words, in glibc 2.36 and up, the symbol is no longer the default version. We also inspected the code in the sanitizer runtime that is responsible for retrieving the address of the original function: https://github.com/llvm/llvm-project/blob/d2ac2776021def69e9d0e64c9a958e0c361a919a/compiler-rt/lib/interception/interception_linux.cpp#L36-L57
This explains why the sanitizer runtime has trouble resolving the addresses of versioned symbols once they are no longer the default version: its lookup does not consider non-default versioned symbols.
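As a minimal sketch of the `@@` versus `@` distinction described above, the following C++ snippet splits the notation used in this report into a name, a version, and a "default" flag. `VersionedSymbol`, `ParseVersionedSymbol` and `FoundByUnversionedLookup` are hypothetical names introduced only for illustration; they are not part of compiler-rt or glibc, and the last function merely models the behavior observed here rather than the dynamic loader's actual algorithm.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type for illustration; not part of compiler-rt.
struct VersionedSymbol {
  std::string name;         // e.g. "xdrmem_create"
  std::string version;      // e.g. "GLIBC_2.2.5"; empty if unversioned
  bool is_default = false;  // true for "name@@VER" or an unversioned name
};

// Splits "name", "name@VER" or "name@@VER" into its parts.
VersionedSymbol ParseVersionedSymbol(const std::string &text) {
  VersionedSymbol sym;
  const std::size_t at = text.find('@');
  if (at == std::string::npos) {
    // No version suffix: treat the bare name as the default.
    sym.name = text;
    sym.is_default = true;
    return sym;
  }
  sym.name = text.substr(0, at);
  // "@@" marks the default version, a single "@" a non-default one.
  const bool double_at = at + 1 < text.size() && text[at + 1] == '@';
  sym.is_default = double_at;
  sym.version = text.substr(at + (double_at ? 2 : 1));
  return sym;
}

// Models the observation in this report: a lookup by bare name only
// finds the symbol if it is unversioned or is the default version.
bool FoundByUnversionedLookup(const VersionedSymbol &sym) {
  return sym.is_default;
}
```

Under this model, `xdrmem_create@@GLIBC_2.2.5` (glibc 2.19 and 2.31) is found by a bare-name lookup, while `xdrmem_create@GLIBC_2.2.5` (glibc 2.36) is not, which matches the null `real_xdrmem_create` pointer seen in GDB.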
We then did another experiment. We manually built sunrpc.cpp using the pre-packaged sysroot (glibc 2.19) without MSan and ran it on Debian 12 (glibc 2.36). It ran fine. We looked into its symbol table and discovered that it was linked against the versioned symbol `xdrmem_create@GLIBC_2.2.5`, unlike the MSan-instrumented one, which doesn't carry the symbol version info.
We think the fact that some tests are built with a sysroot that is different from the sysroot used during the toolchain build is a bug, for the following reasons:
* Runtimes tests should test the behavior of the just-built runtimes, so the build environment should be consistent.
* On builder bots that are set up hermetically, the host may not have all the required include files.
This issue can be addressed by piping the build-time sysroot argument through to the lit tests, in `compiler-rt/test/msan/lit.cfg.py`, as shown in https://github.com/zeroomega/llvm-project/commit/fdcbf3f9e83d8ab3cb92fafadc15db8c5e6031cc
We think it is a bug when the uninstrumented binary runs fine while the sanitizer-instrumented one crashes due to symbol resolution failures, and it should be addressed properly. However, we currently don't have an optimal solution for this one. Ideally, `GetFuncAddr` should take an optional symbol version string as an argument, and the symbol should be resolved through `dlvsym` instead of `dlsym`. However, the symbol versions in the .so files are link-time information; they might be unavailable during codegen, when the compiler sets up the interceptor and the symbol names for the original function. To solve this, we probably need the compiler or the build system to generate a separate file (or a separate ELF section) that records the symbol version info, and this file (or section) would later be read by the sanitizer runtime when resolving function addresses. This is non-trivial and needs broader discussion.
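To make the `dlvsym` idea above concrete, here is a rough sketch of a version-aware lookup, assuming a glibc-based Linux host. `GetFuncAddrWithVersion` is a hypothetical name and signature chosen for illustration; it is not the actual compiler-rt `GetFuncAddr`, and it does not solve the harder problem of where the version string comes from.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // needed for dlvsym and RTLD_NEXT on glibc
#endif
#include <dlfcn.h>

// Hypothetical sketch; the name and signature are assumptions and this
// is not the actual compiler-rt GetFuncAddr.
void *GetFuncAddrWithVersion(const char *name, const char *version) {
  void *addr = nullptr;
  // If the caller knows the symbol version, ask for exactly that
  // version. Unlike dlsym, dlvsym can return a non-default version.
  if (version != nullptr)
    addr = dlvsym(RTLD_NEXT, name, version);
  // Without a version, or if the versioned lookup failed, fall back to
  // the unversioned lookups.
  if (addr == nullptr)
    addr = dlsym(RTLD_NEXT, name);
  if (addr == nullptr)
    addr = dlsym(RTLD_DEFAULT, name);
  return addr;
}
```

A call such as `GetFuncAddrWithVersion("xdrmem_create", "GLIBC_2.2.5")` would then request the exact version seen on the x86_64 host in this report. The fallback to `dlsym` is a design choice made for this sketch so that passing no version, or a version the host libc does not provide, keeps the unversioned behavior. Version strings can differ between architectures, so a real implementation would have to obtain them per target rather than hard-code them.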
The following sanitizer tests are affected:
```
MemorySanitizer-AARCH64 :: Linux/sunrpc.cpp
MemorySanitizer-AARCH64 :: Linux/sunrpc_bytes.cpp
MemorySanitizer-AARCH64 :: Linux/sunrpc_string.cpp
SanitizerCommon-asan-aarch64-Linux :: Linux/dn_expand.cpp
SanitizerCommon-asan-aarch64-Linux :: Linux/xdrrec.cpp
SanitizerCommon-msan-aarch64-Linux :: Linux/dn_expand.cpp
SanitizerCommon-msan-aarch64-Linux :: Linux/xdrrec.cpp
SanitizerCommon-tsan-aarch64-Linux :: Linux/dn_expand.cpp
SanitizerCommon-tsan-aarch64-Linux :: Linux/pthread_mutex.cpp
SanitizerCommon-tsan-aarch64-Linux :: Linux/xdrrec.cpp
ThreadSanitizer-aarch64 :: sunrpc.cpp
```
</pre>