[compiler-rt] 4369de7 - [compiler-rt] Avoid memintrinsic calls inserted by the compiler

Marco Elver via llvm-commits llvm-commits at lists.llvm.org
Wed May 31 07:59:18 PDT 2023


Author: Marco Elver
Date: 2023-05-31T16:58:53+02:00
New Revision: 4369de7af46605522bf7dbe3bc31d00b0eb4bee6

URL: https://github.com/llvm/llvm-project/commit/4369de7af46605522bf7dbe3bc31d00b0eb4bee6
DIFF: https://github.com/llvm/llvm-project/commit/4369de7af46605522bf7dbe3bc31d00b0eb4bee6.diff

LOG: [compiler-rt] Avoid memintrinsic calls inserted by the compiler

D135716 introduced -ftrivial-auto-var-init=pattern where supported.
Unfortunately this introduces unwanted memset() for large stack arrays,
as shown by the new tests added for asan and msan (tsan already had this
test).

In general, the problem of compiler-inserted memintrinsic calls
(memset/memcpy/memmove) is not new to compiler-rt, and has been a
problem before.

To avoid introducing unwanted memintrinsic calls, we redefine
memintrinsics as __sanitizer_internal_mem* at the assembly level for
most source files automatically (where sanitizer_common_internal_defs.h
is included).

In few cases, redefining a symbol in this way causes issues for
interceptors, namely the memintrinsic interceptor themselves. For such
source files we have to selectively disable the redefinition.

Other alternatives have been considered, but simply do not work well in
the context of compiler-rt:

	1. Linker --wrap:  this does not work because --wrap only
	   applies to the final link, and would not apply when building
	   sanitizer static libraries.

	2. Changing references to memset() via objcopy:  this may work,
	   but due to the complexities of the build system, introducing
	   such a post-processing step for the right object files (in
	   particular object files defining memset cannot be touched)
	   seems infeasible.

The chosen solution works well (as shown by the tests). Other libraries
have chosen the same solution where nothing else works (see e.g. glibc's
"symbol-hacks.h").

v2:
- Fix ubsan_minimal build where compiler decides to insert
  memset/memcpy: ubsan_minimal has work without RTSanitizerCommonLibc,
  therefore do not redefine the builtins.
- Fix definition of internal_mem* functions with compilers that want the
  aliased function to already be defined before.
- Fix definition of __sanitizer_internal_mem* functions with compilers
  more pedantic about attribute placement around extern "C".

Reviewed By: vitalybuka, dvyukov

Differential Revision: https://reviews.llvm.org/D151152

Added: 
    compiler-rt/lib/sanitizer_common/sanitizer_redefine_builtins.h
    compiler-rt/test/asan/TestCases/Linux/check_memcpy.c
    compiler-rt/test/msan/Linux/check_memcpy.c

Modified: 
    compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp
    compiler-rt/lib/hwasan/hwasan_interceptors.cpp
    compiler-rt/lib/interception/tests/CMakeLists.txt
    compiler-rt/lib/memprof/memprof_interceptors_memintrinsics.cpp
    compiler-rt/lib/msan/msan_interceptors.cpp
    compiler-rt/lib/sanitizer_common/CMakeLists.txt
    compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc
    compiler-rt/lib/sanitizer_common/sanitizer_common_interface.inc
    compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
    compiler-rt/lib/sanitizer_common/sanitizer_libc.cpp
    compiler-rt/lib/tsan/rtl/tsan_interceptors_memintrinsics.cpp
    compiler-rt/lib/ubsan_minimal/CMakeLists.txt
    compiler-rt/test/tsan/Linux/check_memcpy.c

Removed: 
    


################################################################################
diff  --git a/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp b/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp
index 9d1452482d3af..4e4ea7191d320 100644
--- a/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp
+++ b/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp
@@ -11,6 +11,8 @@
 // ASan versions of memcpy, memmove, and memset.
 //===---------------------------------------------------------------------===//
 
+#define SANITIZER_COMMON_NO_REDEFINE_BUILTINS
+
 #include "asan_interceptors_memintrinsics.h"
 
 #include "asan_interceptors.h"

diff  --git a/compiler-rt/lib/hwasan/hwasan_interceptors.cpp b/compiler-rt/lib/hwasan/hwasan_interceptors.cpp
index 26109332a1dce..bffb4e092e90c 100644
--- a/compiler-rt/lib/hwasan/hwasan_interceptors.cpp
+++ b/compiler-rt/lib/hwasan/hwasan_interceptors.cpp
@@ -14,6 +14,8 @@
 // sanitizer_common/sanitizer_common_interceptors.h
 //===----------------------------------------------------------------------===//
 
+#define SANITIZER_COMMON_NO_REDEFINE_BUILTINS
+
 #include "hwasan.h"
 #include "hwasan_allocator.h"
 #include "hwasan_checks.h"

diff  --git a/compiler-rt/lib/interception/tests/CMakeLists.txt b/compiler-rt/lib/interception/tests/CMakeLists.txt
index 688edc3750f5f..f6840e194be49 100644
--- a/compiler-rt/lib/interception/tests/CMakeLists.txt
+++ b/compiler-rt/lib/interception/tests/CMakeLists.txt
@@ -17,6 +17,7 @@ set(INTERCEPTION_TEST_CFLAGS_COMMON
   -I${COMPILER_RT_SOURCE_DIR}/include
   -I${COMPILER_RT_SOURCE_DIR}/lib
   -I${COMPILER_RT_SOURCE_DIR}/lib/interception
+  -DSANITIZER_COMMON_NO_REDEFINE_BUILTINS
   -fno-rtti
   -O2
   -Werror=sign-compare)

diff  --git a/compiler-rt/lib/memprof/memprof_interceptors_memintrinsics.cpp b/compiler-rt/lib/memprof/memprof_interceptors_memintrinsics.cpp
index dae2ab5dbb9bc..56bd11614d6ac 100644
--- a/compiler-rt/lib/memprof/memprof_interceptors_memintrinsics.cpp
+++ b/compiler-rt/lib/memprof/memprof_interceptors_memintrinsics.cpp
@@ -11,6 +11,8 @@
 // MemProf versions of memcpy, memmove, and memset.
 //===---------------------------------------------------------------------===//
 
+#define SANITIZER_COMMON_NO_REDEFINE_BUILTINS
+
 #include "memprof_interceptors_memintrinsics.h"
 
 #include "memprof_interceptors.h"

diff  --git a/compiler-rt/lib/msan/msan_interceptors.cpp b/compiler-rt/lib/msan/msan_interceptors.cpp
index 96abc47305cad..6f57c33eefa97 100644
--- a/compiler-rt/lib/msan/msan_interceptors.cpp
+++ b/compiler-rt/lib/msan/msan_interceptors.cpp
@@ -14,6 +14,8 @@
 // sanitizer_common/sanitizer_common_interceptors.h
 //===----------------------------------------------------------------------===//
 
+#define SANITIZER_COMMON_NO_REDEFINE_BUILTINS
+
 #include "interception/interception.h"
 #include "msan.h"
 #include "msan_chained_origin_depot.h"

diff  --git a/compiler-rt/lib/sanitizer_common/CMakeLists.txt b/compiler-rt/lib/sanitizer_common/CMakeLists.txt
index c4fdc7aeb4e40..bfa59daf17b00 100644
--- a/compiler-rt/lib/sanitizer_common/CMakeLists.txt
+++ b/compiler-rt/lib/sanitizer_common/CMakeLists.txt
@@ -172,6 +172,7 @@ set(SANITIZER_IMPL_HEADERS
   sanitizer_procmaps.h
   sanitizer_ptrauth.h
   sanitizer_quarantine.h
+  sanitizer_redefine_builtins.h
   sanitizer_report_decorator.h
   sanitizer_ring_buffer.h
   sanitizer_signal_interceptors.inc

diff  --git a/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc b/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc
index e6b967c48b2e1..52e489d02cda8 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_common_interceptors_memintrinsics.inc
@@ -9,6 +9,10 @@
 // Memintrinsic function interceptors for tools like AddressSanitizer,
 // ThreadSanitizer, MemorySanitizer, etc.
 //
+// These interceptors are part of the common interceptors, but separated out so
+// that implementations may add them, if necessary, to a separate source file
+// that should define SANITIZER_COMMON_NO_REDEFINE_BUILTINS at the top.
+//
 // This file should be included into the tool's memintrinsic interceptor file,
 // which has to define its own macros:
 //   COMMON_INTERCEPTOR_ENTER
@@ -20,6 +24,10 @@
 //   COMMON_INTERCEPTOR_NOTHING_IS_INITIALIZED
 //===----------------------------------------------------------------------===//
 
+#ifdef SANITIZER_REDEFINE_BUILTINS_H
+#error "Define SANITIZER_COMMON_NO_REDEFINE_BUILTINS in .cpp file"
+#endif
+
 #include "interception/interception.h"
 #include "sanitizer_platform_interceptors.h"
 

diff  --git a/compiler-rt/lib/sanitizer_common/sanitizer_common_interface.inc b/compiler-rt/lib/sanitizer_common/sanitizer_common_interface.inc
index 37efb5791d0bf..557207fe62ac6 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_common_interface.inc
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_common_interface.inc
@@ -46,3 +46,7 @@ INTERFACE_FUNCTION(__sanitizer_purge_allocator)
 INTERFACE_FUNCTION(__sanitizer_print_memory_profile)
 INTERFACE_WEAK_FUNCTION(__sanitizer_free_hook)
 INTERFACE_WEAK_FUNCTION(__sanitizer_malloc_hook)
+// Memintrinsic functions.
+INTERFACE_FUNCTION(__sanitizer_internal_memcpy)
+INTERFACE_FUNCTION(__sanitizer_internal_memmove)
+INTERFACE_FUNCTION(__sanitizer_internal_memset)

diff  --git a/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h b/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
index 95f4760cffd74..e5dd65a8398c4 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_internal_defs.h
@@ -13,6 +13,7 @@
 #define SANITIZER_DEFS_H
 
 #include "sanitizer_platform.h"
+#include "sanitizer_redefine_builtins.h"
 
 #ifndef SANITIZER_DEBUG
 # define SANITIZER_DEBUG 0

diff  --git a/compiler-rt/lib/sanitizer_common/sanitizer_libc.cpp b/compiler-rt/lib/sanitizer_common/sanitizer_libc.cpp
index d3076f0da4891..9cbf232c6d7be 100644
--- a/compiler-rt/lib/sanitizer_common/sanitizer_libc.cpp
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_libc.cpp
@@ -10,6 +10,9 @@
 // run-time libraries. See sanitizer_libc.h for details.
 //===----------------------------------------------------------------------===//
 
+// Do not redefine builtins; this file is defining the builtin replacements.
+#define SANITIZER_COMMON_NO_REDEFINE_BUILTINS
+
 #include "sanitizer_allocator_internal.h"
 #include "sanitizer_common.h"
 #include "sanitizer_libc.h"
@@ -46,7 +49,10 @@ int internal_memcmp(const void* s1, const void* s2, uptr n) {
   return 0;
 }
 
-void *internal_memcpy(void *dest, const void *src, uptr n) {
+extern "C" {
+SANITIZER_INTERFACE_ATTRIBUTE void *__sanitizer_internal_memcpy(void *dest,
+                                                                const void *src,
+                                                                uptr n) {
   char *d = (char*)dest;
   const char *s = (const char *)src;
   for (uptr i = 0; i < n; ++i)
@@ -54,7 +60,8 @@ void *internal_memcpy(void *dest, const void *src, uptr n) {
   return dest;
 }
 
-void *internal_memmove(void *dest, const void *src, uptr n) {
+SANITIZER_INTERFACE_ATTRIBUTE void *__sanitizer_internal_memmove(
+    void *dest, const void *src, uptr n) {
   char *d = (char*)dest;
   const char *s = (const char *)src;
   sptr i, signed_n = (sptr)n;
@@ -72,7 +79,8 @@ void *internal_memmove(void *dest, const void *src, uptr n) {
   return dest;
 }
 
-void *internal_memset(void* s, int c, uptr n) {
+SANITIZER_INTERFACE_ATTRIBUTE void *__sanitizer_internal_memset(void *s, int c,
+                                                                uptr n) {
   // Optimize for the most performance-critical case:
   if ((reinterpret_cast<uptr>(s) % 16) == 0 && (n % 16) == 0) {
     u64 *p = reinterpret_cast<u64*>(s);
@@ -95,6 +103,14 @@ void *internal_memset(void* s, int c, uptr n) {
   }
   return s;
 }
+}  // extern "C"
+
+void *internal_memcpy(void *dest, const void *src, uptr n)
+    ALIAS(__sanitizer_internal_memcpy);
+void *internal_memmove(void *dest, const void *src, uptr n)
+    ALIAS(__sanitizer_internal_memmove);
+void *internal_memset(void *s, int c, uptr n)
+    ALIAS(__sanitizer_internal_memset);
 
 uptr internal_strcspn(const char *s, const char *reject) {
   uptr i;

diff  --git a/compiler-rt/lib/sanitizer_common/sanitizer_redefine_builtins.h b/compiler-rt/lib/sanitizer_common/sanitizer_redefine_builtins.h
new file mode 100644
index 0000000000000..13dc7d5be3142
--- /dev/null
+++ b/compiler-rt/lib/sanitizer_common/sanitizer_redefine_builtins.h
@@ -0,0 +1,27 @@
+//===-- sanitizer_redefine_builtins.h ---------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Redefine builtin functions to use internal versions. This is needed where
+// compiler optimizations end up producing unwanted libcalls!
+//
+//===----------------------------------------------------------------------===//
+#ifndef SANITIZER_COMMON_NO_REDEFINE_BUILTINS
+#ifndef SANITIZER_REDEFINE_BUILTINS_H
+#define SANITIZER_REDEFINE_BUILTINS_H
+
+// The asm hack only works with GCC and Clang.
+#if !defined(_MSC_VER) || defined(__clang__)
+
+asm("memcpy = __sanitizer_internal_memcpy");
+asm("memmove = __sanitizer_internal_memmove");
+asm("memset = __sanitizer_internal_memset");
+
+#endif  // !_MSC_VER || __clang__
+
+#endif  // SANITIZER_REDEFINE_BUILTINS_H
+#endif  // SANITIZER_COMMON_NO_REDEFINE_BUILTINS

diff  --git a/compiler-rt/lib/tsan/rtl/tsan_interceptors_memintrinsics.cpp b/compiler-rt/lib/tsan/rtl/tsan_interceptors_memintrinsics.cpp
index 6a2a4298c217a..c8b6b2ef19483 100644
--- a/compiler-rt/lib/tsan/rtl/tsan_interceptors_memintrinsics.cpp
+++ b/compiler-rt/lib/tsan/rtl/tsan_interceptors_memintrinsics.cpp
@@ -10,6 +10,8 @@
 //
 //===----------------------------------------------------------------------===//
 
+#define SANITIZER_COMMON_NO_REDEFINE_BUILTINS
+
 #include "tsan_interceptors.h"
 #include "tsan_interface.h"
 

diff  --git a/compiler-rt/lib/ubsan_minimal/CMakeLists.txt b/compiler-rt/lib/ubsan_minimal/CMakeLists.txt
index 504dd3b7573b9..07e96a70a49dc 100644
--- a/compiler-rt/lib/ubsan_minimal/CMakeLists.txt
+++ b/compiler-rt/lib/ubsan_minimal/CMakeLists.txt
@@ -6,7 +6,9 @@ set(UBSAN_MINIMAL_SOURCES
 
 include_directories(..)
 
-set(UBSAN_CFLAGS ${SANITIZER_COMMON_CFLAGS})
+set(UBSAN_CFLAGS
+  ${SANITIZER_COMMON_CFLAGS}
+  -DSANITIZER_COMMON_NO_REDEFINE_BUILTINS)
 append_rtti_flag(OFF UBSAN_CFLAGS)
 
 set(UBSAN_LINK_FLAGS ${SANITIZER_COMMON_LINK_FLAGS})

diff  --git a/compiler-rt/test/asan/TestCases/Linux/check_memcpy.c b/compiler-rt/test/asan/TestCases/Linux/check_memcpy.c
new file mode 100644
index 0000000000000..d5fee1628fc54
--- /dev/null
+++ b/compiler-rt/test/asan/TestCases/Linux/check_memcpy.c
@@ -0,0 +1,8 @@
+// Verify runtime doesn't contain compiler-emitted memcpy/memmove calls.
+//
+// REQUIRES: shared_unwind, x86_64-target-arch
+
+// RUN: %clang_asan -O1 %s -o %t
+// RUN: llvm-objdump -d -l %t | FileCheck --implicit-check-not="{{(callq|jmpq) .*<(__interceptor_.*)?mem(cpy|set|move)>}}" %s
+
+int main() { return 0; }

diff  --git a/compiler-rt/test/msan/Linux/check_memcpy.c b/compiler-rt/test/msan/Linux/check_memcpy.c
new file mode 100644
index 0000000000000..42af2d7e7d1b1
--- /dev/null
+++ b/compiler-rt/test/msan/Linux/check_memcpy.c
@@ -0,0 +1,8 @@
+// Verify runtime doesn't contain compiler-emitted memcpy/memmove calls.
+//
+// REQUIRES: shared_unwind, x86_64-target-arch
+
+// RUN: %clang_msan -O1 %s -o %t
+// RUN: llvm-objdump -d -l %t | FileCheck --implicit-check-not="{{(callq|jmpq) .*<(__interceptor_.*)?mem(cpy|set|move)>}}" %s
+
+int main() { return 0; }

diff  --git a/compiler-rt/test/tsan/Linux/check_memcpy.c b/compiler-rt/test/tsan/Linux/check_memcpy.c
index 26f99612e50f5..21480564ff6ce 100644
--- a/compiler-rt/test/tsan/Linux/check_memcpy.c
+++ b/compiler-rt/test/tsan/Linux/check_memcpy.c
@@ -5,16 +5,9 @@
 // This could fail if using a static libunwind because that static libunwind
 // could be uninstrumented and contain memcpy/memmove calls not intercepted by
 // tsan.
-// REQUIRES: shared_unwind
+// REQUIRES: shared_unwind, x86_64-target-arch
 
 // RUN: %clang_tsan -O1 %s -o %t
-// RUN: llvm-objdump -d -l %t | FileCheck %s
-
-int main() {
-  return 0;
-}
-
-// CHECK-NOT: callq {{.*<(__interceptor_)?mem(cpy|set)>}}
-// tail calls:
-// CHECK-NOT: jmpq {{.*<(__interceptor_)?mem(cpy|set)>}}
+// RUN: llvm-objdump -d -l %t | FileCheck --implicit-check-not="{{(callq|jmpq) .*<(__interceptor_.*)?mem(cpy|set|move)>}}" %s
 
+int main() { return 0; }


        


More information about the llvm-commits mailing list