[llvm-branch-commits] [llvm] [compiler-rt] [clang] [TySan] A Type Sanitizer (Runtime Library) (PR #76261)

via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Fri Dec 22 11:19:13 PST 2023


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-compiler-rt-sanitizer

@llvm/pr-subscribers-clang

Author: Florian Hahn (fhahn)

<details>
<summary>Changes</summary>

This patch introduces the runtime components of a type sanitizer: a sanitizer for type-based aliasing violations.

C/C++ have type-based aliasing rules, and LLVM's optimizer can exploit these given TBAA metadata added by Clang. Roughly, a pointer of a given type cannot be used to access an object of a different type (with, of course, certain exceptions). Unfortunately, there's a lot of code in the wild that violates these rules (e.g. for type punning), and such code often must be built with -fno-strict-aliasing. Performance is often sacrificed as a result. Part of the problem is the difficulty of finding TBAA violations. Hopefully, this sanitizer will help.
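
As a rough illustration of the kind of type punning described above, the sketch below contrasts a violating access with a well-defined alternative. The function names are hypothetical and not part of this patch, and the expected bit pattern in the usage note assumes IEEE-754 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Violates the aliasing rules: a float object is read through a uint32_t
// lvalue. This is undefined behavior and is the kind of access a type
// sanitizer is meant to flag. Shown for illustration only; never called here.
inline std::uint32_t float_bits_punned(const float *f) {
  return *reinterpret_cast<const std::uint32_t *>(f);
}

// Well-defined alternative: copy the object representation byte-wise.
inline std::uint32_t float_bits_copied(float f) {
  static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}
```

With IEEE-754 floats, `float_bits_copied(1.0f)` yields `0x3f800000`, and the code stays valid under strict aliasing.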

For each TBAA type-access descriptor, encoded in LLVM's IR using metadata, the corresponding instrumentation pass generates descriptor tables. Thus, for each type (and access descriptor), we have a unique pointer representation. Excepting anonymous-namespace types, these tables are comdat, so the pointer values should be unique across the program. The descriptors refer to other descriptors to form a type aliasing tree (just like LLVM's TBAA metadata does). The instrumentation handles the "fast path" (where the types match exactly and no partial-overlaps are detected), and defers to the runtime to handle all of the more-complicated cases. The runtime, of course, is also responsible for reporting errors when those are detected.
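
The idea of descriptors that are identified by their address and that refer to one another can be sketched as follows. This is a simplified, hypothetical layout (`Desc`, `Member`, `root_of`, and `member_at` are illustrative names); the actual descriptor structs are the ones declared in `tysan.h` in the diff, and the member offsets used here are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

struct Desc;

// A reference from one descriptor to another, at a byte offset.
struct Member {
  const Desc *type;
  std::size_t offset;
};

// Simplified descriptor: a name plus links to other descriptors.
struct Desc {
  const char *name;
  const Member *members;
  std::size_t member_count;
};

// One object per type, so the object's address serves as the type's identity.
static const Desc kChar = {"omnipotent char", nullptr, 0}; // root: no members
static const Member kIntMembers[] = {{&kChar, 0}};
static const Desc kInt = {"int", kIntMembers, 1};
static const Member kFloatMembers[] = {{&kChar, 0}};
static const Desc kFloat = {"float", kFloatMembers, 1};
static const Member kPairMembers[] = {{&kInt, 0}, {&kFloat, 4}};
static const Desc kPair = {"Pair", kPairMembers, 2};

// Follow the first member link until a descriptor without members is reached.
inline const Desc *root_of(const Desc *d) {
  assert(d != nullptr);
  while (d->member_count > 0)
    d = d->members[0].type;
  return d;
}

// Return the descriptor of the member starting exactly at `offset`, if any.
inline const Desc *member_at(const Desc &d, std::size_t offset) {
  for (std::size_t i = 0; i < d.member_count; ++i)
    if (d.members[i].offset == offset)
      return d.members[i].type;
  return nullptr;
}
```

Comparing two types is then a pointer comparison (`&kInt != &kFloat`), and `root_of(&kPair)` walks `Pair -> int -> omnipotent char`.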

The runtime uses essentially the same shadow memory region as tsan, and we use 8 bytes of shadow memory, the size of the pointer to the type descriptor, for every byte of accessed data in the program. The value 0 is used to represent an unknown type. The value -1 is used to represent an interior byte (a byte that is part of a type, but not the first byte). The instrumentation first checks for an exact match between the type of the current access and the type for that address recorded in the shadow memory. If it matches, it then checks the shadow for the remainder of the bytes in the type to make sure that they're all -1. If not, we call the runtime. If the exact match fails, we next check if the value is 0 (i.e. unknown). If it is, then we check the shadow for the remainder of the bytes in the type (to make sure they're all 0). If they're not, we call the runtime. We then set the shadow for the access address and set the shadow for the remaining bytes in the type to -1 (i.e. marking them as interior bytes). If the type indicated by the shadow memory for the access address is neither an exact match nor 0, we call the runtime.
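
The fast-path decision described above can be modelled on a simulated shadow array, with one pointer-sized slot per application byte. This is only a sketch of the described logic, not the code the instrumentation pass emits; `ShadowValue`, `FastPath`, and `check_access` are hypothetical names, and integer constants stand in for descriptor pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a shadow slot; a real slot holds a descriptor pointer.
using ShadowValue = std::intptr_t;
const ShadowValue kUnknown = 0;   // no type recorded yet
const ShadowValue kInterior = -1; // part of an object, but not its first byte

enum class FastPath { Ok, CallRuntime };

// `addr` is an index into the simulated application memory.
inline FastPath check_access(std::vector<ShadowValue> &shadow, std::size_t addr,
                             std::size_t size, ShadowValue type) {
  assert(size > 0 && addr + size <= shadow.size());
  const ShadowValue first = shadow[addr];
  if (first == type) {
    // Exact match: every remaining byte must be marked interior.
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != kInterior)
        return FastPath::CallRuntime;
    return FastPath::Ok;
  }
  if (first == kUnknown) {
    // Unknown: every remaining byte must be unknown as well.
    for (std::size_t i = 1; i < size; ++i)
      if (shadow[addr + i] != kUnknown)
        return FastPath::CallRuntime;
    // Record the type at the first byte and mark the rest as interior.
    shadow[addr] = type;
    for (std::size_t i = 1; i < size; ++i)
      shadow[addr + i] = kInterior;
    return FastPath::Ok;
  }
  // Neither an exact match nor unknown: defer to the runtime.
  return FastPath::CallRuntime;
}
```

In this model, a first 4-byte access to unknown memory records its type, a repeat access with the same type stays on the fast path, and an access with another type or at a misaligned interior offset falls through to the runtime.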

The instrumentation pass inserts calls to the memset intrinsic that reset the shadow memory to the unknown type for memory updated by memset, memcpy, and memmove, as well as for allocas/byval arguments and at lifetime.start/end. The runtime intercepts memset, memcpy, etc. to perform the same function for the library calls.
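
The two shadow updates involved can be sketched on a simulated shadow region. The sketch loosely mirrors `tysan_set_type_unknown` and `tysan_copy_types` from `tysan.cpp` in the diff, but works on array indices instead of real addresses; `SimShadow` and the `sim_*` names are hypothetical. Which update is applied for which library call is decided by the interceptor code, which is not shown in the truncated diff.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using ShadowValue = std::intptr_t;
const std::size_t kSize = 16;

// Simulated shadow region: one pointer-sized slot per application byte.
struct SimShadow {
  ShadowValue slot[kSize];
};

// Mark `n` application bytes starting at `off` as having unknown type (0).
inline void sim_set_type_unknown(SimShadow &s, std::size_t off, std::size_t n) {
  assert(off <= kSize && n <= kSize - off);
  std::memset(s.slot + off, 0, n * sizeof(ShadowValue));
}

// Carry the recorded types of `n` bytes from `src` over to `dst`.
// memmove is used so that overlapping ranges are handled.
inline void sim_copy_types(SimShadow &s, std::size_t dst, std::size_t src,
                           std::size_t n) {
  assert(dst <= kSize && src <= kSize);
  assert(n <= kSize - dst && n <= kSize - src);
  std::memmove(s.slot + dst, s.slot + src, n * sizeof(ShadowValue));
}
```

Note that each application byte corresponds to `sizeof(ShadowValue)` bytes of shadow, which is why both helpers scale the length.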

The runtime essentially repeats these checks, but uses the full TBAA algorithm, just as the compiler does, to determine when two types are permitted to alias. In a situation where access overlap has occurred and aliasing is not permitted, an error is generated.
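
For scalar types, the aliasing decision can be approximated by a walk over a type tree: two types may alias if either is an ancestor of the other, and types from different trees are conservatively allowed to alias. The sketch below covers only that scalar case under those assumptions; it ignores the struct-path offsets that the runtime's `isAliasingLegal` in the diff handles, and `TypeNode` and `may_alias` are hypothetical names.

```cpp
#include <cassert>

// Simplified scalar type node with a link towards the root of its tree.
struct TypeNode {
  const char *name;
  const TypeNode *parent; // nullptr at the root
};

static const TypeNode kRoot = {"root", nullptr};
static const TypeNode kChar = {"omnipotent char", &kRoot};
static const TypeNode kInt = {"int", &kChar};
static const TypeNode kFloat = {"float", &kChar};
static const TypeNode kOtherRoot = {"other root", nullptr};

inline const TypeNode *root_of(const TypeNode *n) {
  while (n->parent)
    n = n->parent;
  return n;
}

// True if `anc` is `node` itself or lies on the path from `node` to the root.
inline bool is_ancestor_or_self(const TypeNode *anc, const TypeNode *node) {
  for (; node; node = node->parent)
    if (node == anc)
      return true;
  return false;
}

inline bool may_alias(const TypeNode *a, const TypeNode *b) {
  if (!a || !b)
    return true; // an unknown type may alias anything
  if (root_of(a) != root_of(b))
    return true; // unrelated trees: be conservative
  return is_ancestor_or_self(a, b) || is_ancestor_or_self(b, a);
}
```

Here `int` and `float` are siblings under `omnipotent char`, so `may_alias(&kInt, &kFloat)` is false, while either of them may alias `char`.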

Clang's TBAA representation currently has a problem representing unions, as demonstrated by the one XFAIL'd test. We'll update the TBAA representation to fix this, and at the same time, update the sanitizer.

As a note, this implementation does not use the compressed shadow-memory scheme discussed previously (http://lists.llvm.org/pipermail/llvm-dev/2017-April/111766.html). That scheme would not handle the struct-path (i.e. structure offset) information that our TBAA represents. I expect we'll want to further work on compressing the shadow-memory representation, but I think it makes sense to do that as follow-up work.

(This includes build fixes for Linux from Mingjie Xu)

Based on https://reviews.llvm.org/D32197.


---

Patch is 53.06 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/76261.diff


28 Files Affected:

- (modified) clang/runtime/CMakeLists.txt (+1-1) 
- (modified) compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake (+1) 
- (modified) compiler-rt/cmake/config-ix.cmake (+14-1) 
- (added) compiler-rt/lib/tysan/CMakeLists.txt (+64) 
- (added) compiler-rt/lib/tysan/lit.cfg (+35) 
- (added) compiler-rt/lib/tysan/lit.site.cfg.in (+12) 
- (added) compiler-rt/lib/tysan/tysan.cpp (+339) 
- (added) compiler-rt/lib/tysan/tysan.h (+79) 
- (added) compiler-rt/lib/tysan/tysan.syms.extra (+2) 
- (added) compiler-rt/lib/tysan/tysan_flags.inc (+17) 
- (added) compiler-rt/lib/tysan/tysan_interceptors.cpp (+250) 
- (added) compiler-rt/lib/tysan/tysan_platform.h (+93) 
- (added) compiler-rt/lib/tysan/weak_symbols.txt () 
- (added) compiler-rt/test/tysan/CMakeLists.txt (+32) 
- (added) compiler-rt/test/tysan/anon-ns.cpp (+41) 
- (added) compiler-rt/test/tysan/anon-same-struct.c (+26) 
- (added) compiler-rt/test/tysan/anon-struct.c (+27) 
- (added) compiler-rt/test/tysan/basic.c (+65) 
- (added) compiler-rt/test/tysan/char-memcpy.c (+45) 
- (added) compiler-rt/test/tysan/global.c (+31) 
- (added) compiler-rt/test/tysan/int-long.c (+21) 
- (added) compiler-rt/test/tysan/lit.cfg.py (+139) 
- (added) compiler-rt/test/tysan/lit.site.cfg.py.in (+17) 
- (added) compiler-rt/test/tysan/ptr-float.c (+19) 
- (added) compiler-rt/test/tysan/struct-offset-multiple-compilation-units.cpp (+51) 
- (added) compiler-rt/test/tysan/struct-offset.c (+26) 
- (added) compiler-rt/test/tysan/struct.c (+39) 
- (added) compiler-rt/test/tysan/union-wr-wr.c (+18) 


``````````diff
diff --git a/clang/runtime/CMakeLists.txt b/clang/runtime/CMakeLists.txt
index 65fcdc2868f031..ff2605b23d25b0 100644
--- a/clang/runtime/CMakeLists.txt
+++ b/clang/runtime/CMakeLists.txt
@@ -122,7 +122,7 @@ if(LLVM_BUILD_EXTERNAL_COMPILER_RT AND EXISTS ${COMPILER_RT_SRC_ROOT}/)
                            COMPONENT compiler-rt)
 
   # Add top-level targets that build specific compiler-rt runtimes.
-  set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan profile tsan ubsan ubsan-minimal)
+  set(COMPILER_RT_RUNTIMES fuzzer asan builtins dfsan lsan msan profile tsan tysan ubsan ubsan-minimal)
   foreach(runtime ${COMPILER_RT_RUNTIMES})
     get_ext_project_build_command(build_runtime_cmd ${runtime})
     add_custom_target(${runtime}
diff --git a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
index 416777171d2ca7..66ebee4392e1ff 100644
--- a/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+++ b/compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
@@ -67,6 +67,7 @@ set(ALL_PROFILE_SUPPORTED_ARCH ${X86} ${X86_64} ${ARM32} ${ARM64} ${PPC32} ${PPC
     ${RISCV32} ${RISCV64} ${LOONGARCH64})
 set(ALL_TSAN_SUPPORTED_ARCH ${X86_64} ${MIPS64} ${ARM64} ${PPC64} ${S390X}
     ${LOONGARCH64} ${RISCV64})
+set(ALL_TYSAN_SUPPORTED_ARCH ${X86_64} ${ARM64})
 set(ALL_UBSAN_SUPPORTED_ARCH ${X86} ${X86_64} ${ARM32} ${ARM64} ${RISCV64}
     ${MIPS32} ${MIPS64} ${PPC64} ${S390X} ${SPARC} ${SPARCV9} ${HEXAGON}
     ${LOONGARCH64})
diff --git a/compiler-rt/cmake/config-ix.cmake b/compiler-rt/cmake/config-ix.cmake
index 2dccd4954b2537..e448e71e5422a1 100644
--- a/compiler-rt/cmake/config-ix.cmake
+++ b/compiler-rt/cmake/config-ix.cmake
@@ -448,6 +448,7 @@ if(APPLE)
   set(SANITIZER_COMMON_SUPPORTED_OS osx)
   set(PROFILE_SUPPORTED_OS osx)
   set(TSAN_SUPPORTED_OS osx)
+  set(TYSAN_SUPPORTED_OS osx)
   set(XRAY_SUPPORTED_OS osx)
   set(FUZZER_SUPPORTED_OS osx)
   set(ORC_SUPPORTED_OS)
@@ -581,6 +582,7 @@ if(APPLE)
           list(APPEND FUZZER_SUPPORTED_OS ${platform})
           list(APPEND ORC_SUPPORTED_OS ${platform})
           list(APPEND UBSAN_SUPPORTED_OS ${platform})
+          list(APPEND TYSAN_SUPPORTED_OS ${platform})
           list(APPEND LSAN_SUPPORTED_OS ${platform})
           list(APPEND STATS_SUPPORTED_OS ${platform})
         endif()
@@ -630,6 +632,9 @@ if(APPLE)
   list_intersect(PROFILE_SUPPORTED_ARCH
     ALL_PROFILE_SUPPORTED_ARCH
     SANITIZER_COMMON_SUPPORTED_ARCH)
+  list_intersect(TYSAN_SUPPORTED_ARCH
+    ALL_TYSAN_SUPPORTED_ARCH
+    SANITIZER_COMMON_SUPPORTED_ARCH)
   list_intersect(TSAN_SUPPORTED_ARCH
     ALL_TSAN_SUPPORTED_ARCH
     SANITIZER_COMMON_SUPPORTED_ARCH)
@@ -676,6 +681,7 @@ else()
   filter_available_targets(HWASAN_SUPPORTED_ARCH ${ALL_HWASAN_SUPPORTED_ARCH})
   filter_available_targets(MEMPROF_SUPPORTED_ARCH ${ALL_MEMPROF_SUPPORTED_ARCH})
   filter_available_targets(PROFILE_SUPPORTED_ARCH ${ALL_PROFILE_SUPPORTED_ARCH})
+  filter_available_targets(TYSAN_SUPPORTED_ARCH ${ALL_TYSAN_SUPPORTED_ARCH})
   filter_available_targets(TSAN_SUPPORTED_ARCH ${ALL_TSAN_SUPPORTED_ARCH})
   filter_available_targets(UBSAN_SUPPORTED_ARCH ${ALL_UBSAN_SUPPORTED_ARCH})
   filter_available_targets(SAFESTACK_SUPPORTED_ARCH
@@ -720,7 +726,7 @@ if(COMPILER_RT_SUPPORTED_ARCH)
 endif()
 message(STATUS "Compiler-RT supported architectures: ${COMPILER_RT_SUPPORTED_ARCH}")
 
-set(ALL_SANITIZERS asan;dfsan;msan;hwasan;tsan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;asan_abi)
+set(ALL_SANITIZERS asan;dfsan;msan;hwasan;tsan;tysan;safestack;cfi;scudo_standalone;ubsan_minimal;gwp_asan;asan_abi)
 set(COMPILER_RT_SANITIZERS_TO_BUILD all CACHE STRING
     "sanitizers to build if supported on the target (all;${ALL_SANITIZERS})")
 list_replace(COMPILER_RT_SANITIZERS_TO_BUILD all "${ALL_SANITIZERS}")
@@ -801,6 +807,13 @@ else()
   set(COMPILER_RT_HAS_PROFILE FALSE)
 endif()
 
+if (COMPILER_RT_HAS_SANITIZER_COMMON AND TYSAN_SUPPORTED_ARCH AND
+        OS_NAME MATCHES "Linux|Darwin")
+  set(COMPILER_RT_HAS_TYSAN TRUE)
+else()
+  set(COMPILER_RT_HAS_TYSAN FALSE)
+endif()
+
 if (COMPILER_RT_HAS_SANITIZER_COMMON AND TSAN_SUPPORTED_ARCH)
   if (OS_NAME MATCHES "Linux|Darwin|FreeBSD|NetBSD")
     set(COMPILER_RT_HAS_TSAN TRUE)
diff --git a/compiler-rt/lib/tysan/CMakeLists.txt b/compiler-rt/lib/tysan/CMakeLists.txt
new file mode 100644
index 00000000000000..859b67928f004a
--- /dev/null
+++ b/compiler-rt/lib/tysan/CMakeLists.txt
@@ -0,0 +1,64 @@
+include_directories(..)
+
+# Runtime library sources and build flags.
+set(TYSAN_SOURCES
+  tysan.cpp
+  tysan_interceptors.cpp)
+set(TYSAN_COMMON_CFLAGS ${SANITIZER_COMMON_CFLAGS})
+append_rtti_flag(OFF TYSAN_COMMON_CFLAGS)
+# Prevent clang from generating libc calls.
+append_list_if(COMPILER_RT_HAS_FFREESTANDING_FLAG -ffreestanding TYSAN_COMMON_CFLAGS)
+
+add_compiler_rt_object_libraries(RTTysan_dynamic
+  OS ${SANITIZER_COMMON_SUPPORTED_OS}
+  ARCHS ${TYSAN_SUPPORTED_ARCH}
+  SOURCES ${TYSAN_SOURCES}
+  ADDITIONAL_HEADERS ${TYSAN_HEADERS}
+  CFLAGS ${TYSAN_DYNAMIC_CFLAGS}
+  DEFS ${TYSAN_DYNAMIC_DEFINITIONS})
+
+
+# Static runtime library.
+add_compiler_rt_component(tysan)
+
+
+if(APPLE)
+  add_weak_symbols("sanitizer_common" WEAK_SYMBOL_LINK_FLAGS)
+
+  add_compiler_rt_runtime(clang_rt.tysan
+    SHARED
+    OS ${SANITIZER_COMMON_SUPPORTED_OS}
+    ARCHS ${TYSAN_SUPPORTED_ARCH}
+    OBJECT_LIBS RTTysan_dynamic
+                RTInterception
+                RTSanitizerCommon
+                RTSanitizerCommonLibc
+                RTSanitizerCommonSymbolizer
+    CFLAGS ${TYSAN_DYNAMIC_CFLAGS}
+    LINK_FLAGS ${WEAK_SYMBOL_LINK_FLAGS}
+    DEFS ${TYSAN_DYNAMIC_DEFINITIONS}
+    PARENT_TARGET tysan)
+
+  add_compiler_rt_runtime(clang_rt.tysan_static
+    STATIC
+    ARCHS ${TYSAN_SUPPORTED_ARCH}
+    OBJECT_LIBS RTTysan_static
+    CFLAGS ${TYSAN_CFLAGS}
+    DEFS ${TYSAN_COMMON_DEFINITIONS}
+    PARENT_TARGET tysan)
+else()
+  foreach(arch ${TYSAN_SUPPORTED_ARCH})
+    set(TYSAN_CFLAGS ${TYSAN_COMMON_CFLAGS})
+    append_list_if(COMPILER_RT_HAS_FPIE_FLAG -fPIE TYSAN_CFLAGS)
+    add_compiler_rt_runtime(clang_rt.tysan
+      STATIC
+      ARCHS ${arch}
+      SOURCES ${TYSAN_SOURCES}
+              $<TARGET_OBJECTS:RTInterception.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommon.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonLibc.${arch}>
+              $<TARGET_OBJECTS:RTSanitizerCommonSymbolizer.${arch}>
+      CFLAGS ${TYSAN_CFLAGS}
+      PARENT_TARGET tysan)
+  endforeach()
+endif()
diff --git a/compiler-rt/lib/tysan/lit.cfg b/compiler-rt/lib/tysan/lit.cfg
new file mode 100644
index 00000000000000..bd2bbe855529a7
--- /dev/null
+++ b/compiler-rt/lib/tysan/lit.cfg
@@ -0,0 +1,35 @@
+# -*- Python -*-
+
+import os
+
+# Setup config name.
+config.name = 'TypeSanitizer' + getattr(config, 'name_suffix', 'default')
+
+# Setup source root.
+config.test_source_root = os.path.dirname(__file__)
+
+# Setup default compiler flags used with -fsanitize=type option.
+clang_tysan_cflags = (["-fsanitize=type",
+                      "-mno-omit-leaf-frame-pointer",
+                      "-fno-omit-frame-pointer",
+                      "-fno-optimize-sibling-calls"] +
+                      [config.target_cflags] +
+                      config.debug_info_flags)
+clang_tysan_cxxflags = config.cxx_mode_flags + clang_tysan_cflags
+
+def build_invocation(compile_flags):
+  return " " + " ".join([config.clang] + compile_flags) + " "
+
+config.substitutions.append( ("%clang_tysan ", build_invocation(clang_tysan_cflags)) )
+config.substitutions.append( ("%clangxx_tysan ", build_invocation(clang_tysan_cxxflags)) )
+
+# Default test suffixes.
+config.suffixes = ['.c', '.cc', '.cpp']
+
+# TypeSanitizer tests are currently supported on Linux only.
+if config.host_os not in ['Linux']:
+  config.unsupported = True
+
+if config.target_arch != 'aarch64':
+  config.available_features.add('stable-runtime')
+
diff --git a/compiler-rt/lib/tysan/lit.site.cfg.in b/compiler-rt/lib/tysan/lit.site.cfg.in
new file mode 100644
index 00000000000000..673d04e514379b
--- /dev/null
+++ b/compiler-rt/lib/tysan/lit.site.cfg.in
@@ -0,0 +1,12 @@
+@LIT_SITE_CFG_IN_HEADER@
+
+# Tool-specific config options.
+config.name_suffix = "@TYSAN_TEST_CONFIG_SUFFIX@"
+config.target_cflags = "@TYSAN_TEST_TARGET_CFLAGS@"
+config.target_arch = "@TYSAN_TEST_TARGET_ARCH@"
+
+# Load common config for all compiler-rt lit tests.
+lit_config.load_config(config, "@COMPILER_RT_BINARY_DIR@/test/lit.common.configured")
+
+# Load tool-specific config that would do the real work.
+lit_config.load_config(config, "@TYSAN_LIT_SOURCE_DIR@/lit.cfg")
diff --git a/compiler-rt/lib/tysan/tysan.cpp b/compiler-rt/lib/tysan/tysan.cpp
new file mode 100644
index 00000000000000..f627851d049e6a
--- /dev/null
+++ b/compiler-rt/lib/tysan/tysan.cpp
@@ -0,0 +1,339 @@
+//===-- tysan.cpp ---------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file is a part of TypeSanitizer.
+//
+// TypeSanitizer runtime.
+//===----------------------------------------------------------------------===//
+
+#include "sanitizer_common/sanitizer_atomic.h"
+#include "sanitizer_common/sanitizer_common.h"
+#include "sanitizer_common/sanitizer_flag_parser.h"
+#include "sanitizer_common/sanitizer_flags.h"
+#include "sanitizer_common/sanitizer_libc.h"
+#include "sanitizer_common/sanitizer_report_decorator.h"
+#include "sanitizer_common/sanitizer_stacktrace.h"
+#include "sanitizer_common/sanitizer_symbolizer.h"
+
+#include "tysan/tysan.h"
+
+using namespace __sanitizer;
+using namespace __tysan;
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void
+tysan_set_type_unknown(const void *addr, uptr size) {
+  if (tysan_inited)
+    internal_memset(shadow_for(addr), 0, size * sizeof(uptr));
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void
+tysan_copy_types(const void *daddr, const void *saddr, uptr size) {
+  if (tysan_inited)
+    internal_memmove(shadow_for(daddr), shadow_for(saddr), size * sizeof(uptr));
+}
+
+static const char *getDisplayName(const char *Name) {
+  if (Name[0] == '\0')
+    return "<anonymous type>";
+
+  // Clang generates tags for C++ types that demangle as typeinfo. Remove the
+  // prefix from the generated string.
+  const char TIPrefix[] = "typeinfo name for ";
+
+  const char *DName = Symbolizer::GetOrInit()->Demangle(Name);
+  if (!internal_strncmp(DName, TIPrefix, sizeof(TIPrefix) - 1))
+    DName += sizeof(TIPrefix) - 1;
+
+  return DName;
+}
+
+static void printTDName(tysan_type_descriptor *td) {
+  if (((sptr)td) <= 0) {
+    Printf("<unknown type>");
+    return;
+  }
+
+  switch (td->Tag) {
+  default:
+    DCHECK(0);
+    break;
+  case TYSAN_MEMBER_TD:
+    printTDName(td->Member.Access);
+    if (td->Member.Access != td->Member.Base) {
+      Printf(" (in ");
+      printTDName(td->Member.Base);
+      Printf(" at offset %zu)", td->Member.Offset);
+    }
+    break;
+  case TYSAN_STRUCT_TD:
+    Printf("%s", getDisplayName(
+                     (char *)(td->Struct.Members + td->Struct.MemberCount)));
+    break;
+  }
+}
+
+static tysan_type_descriptor *getRootTD(tysan_type_descriptor *TD) {
+  tysan_type_descriptor *RootTD = TD;
+
+  do {
+    RootTD = TD;
+
+    if (TD->Tag == TYSAN_STRUCT_TD) {
+      if (TD->Struct.MemberCount > 0)
+        TD = TD->Struct.Members[0].Type;
+      else
+        TD = nullptr;
+    } else if (TD->Tag == TYSAN_MEMBER_TD) {
+      TD = TD->Member.Access;
+    } else {
+      DCHECK(0);
+      break;
+    }
+  } while (TD);
+
+  return RootTD;
+}
+
+static bool isAliasingLegalUp(tysan_type_descriptor *TDA,
+                              tysan_type_descriptor *TDB) {
+  // Walk up the tree starting with TDA to see if we reach TDB.
+  uptr OffsetA = 0, OffsetB = 0;
+  if (TDB->Tag == TYSAN_MEMBER_TD) {
+    OffsetB = TDB->Member.Offset;
+    TDB = TDB->Member.Base;
+  }
+
+  if (TDA->Tag == TYSAN_MEMBER_TD) {
+    OffsetA = TDA->Member.Offset;
+    TDA = TDA->Member.Base;
+  }
+
+  do {
+    if (TDA == TDB)
+      return OffsetA == OffsetB;
+
+    if (TDA->Tag == TYSAN_STRUCT_TD) {
+      // Reached root type descriptor.
+      if (!TDA->Struct.MemberCount)
+        break;
+
+      uptr Idx = 0;
+      for (; Idx < TDA->Struct.MemberCount - 1; ++Idx) {
+        if (TDA->Struct.Members[Idx].Offset >= OffsetA)
+          break;
+      }
+
+      OffsetA -= TDA->Struct.Members[Idx].Offset;
+      TDA = TDA->Struct.Members[Idx].Type;
+    } else {
+      DCHECK(0);
+      break;
+    }
+  } while (TDA);
+
+  return false;
+}
+
+static bool isAliasingLegal(tysan_type_descriptor *TDA,
+                            tysan_type_descriptor *TDB) {
+  if (TDA == TDB || !TDB || !TDA)
+    return true;
+
+  // Aliasing is legal if the two types have different root nodes.
+  if (getRootTD(TDA) != getRootTD(TDB))
+    return true;
+
+  return isAliasingLegalUp(TDA, TDB) || isAliasingLegalUp(TDB, TDA);
+}
+
+namespace __tysan {
+class Decorator : public __sanitizer::SanitizerCommonDecorator {
+public:
+  Decorator() : SanitizerCommonDecorator() {}
+  const char *Warning() { return Red(); }
+  const char *Name() { return Green(); }
+  const char *End() { return Default(); }
+};
+} // namespace __tysan
+
+ALWAYS_INLINE
+static void reportError(void *Addr, int Size, tysan_type_descriptor *TD,
+                        tysan_type_descriptor *OldTD, const char *AccessStr,
+                        const char *DescStr, int Offset, uptr pc, uptr bp,
+                        uptr sp) {
+  Decorator d;
+  Printf("%s", d.Warning());
+  Report("ERROR: TypeSanitizer: type-aliasing-violation on address %p"
+         " (pc %p bp %p sp %p tid %llu)\n",
+         Addr, (void *)pc, (void *)bp, (void *)sp, GetTid());
+  Printf("%s", d.End());
+  Printf("%s of size %d at %p with type ", AccessStr, Size, Addr);
+
+  Printf("%s", d.Name());
+  printTDName(TD);
+  Printf("%s", d.End());
+
+  Printf(" %s of type ", DescStr);
+
+  Printf("%s", d.Name());
+  printTDName(OldTD);
+  Printf("%s", d.End());
+
+  if (Offset != 0)
+    Printf(" that starts at offset %d\n", Offset);
+  else
+    Printf("\n");
+
+  if (pc) {
+
+    bool request_fast = StackTrace::WillUseFastUnwind(true);
+    BufferedStackTrace ST;
+    ST.Unwind(kStackTraceMax, pc, bp, 0, 0, 0, request_fast);
+    ST.Print();
+  } else {
+    Printf("\n");
+  }
+}
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void
+__tysan_check(void *addr, int size, tysan_type_descriptor *td, int flags) {
+  GET_CALLER_PC_BP_SP;
+
+  bool IsRead = flags & 1;
+  bool IsWrite = flags & 2;
+  const char *AccessStr;
+  if (IsRead && !IsWrite)
+    AccessStr = "READ";
+  else if (!IsRead && IsWrite)
+    AccessStr = "WRITE";
+  else
+    AccessStr = "ATOMIC UPDATE";
+
+  tysan_type_descriptor **OldTDPtr = shadow_for(addr);
+  tysan_type_descriptor *OldTD = *OldTDPtr;
+  if (((sptr)OldTD) < 0) {
+    int i = -((sptr)OldTD);
+    OldTDPtr -= i;
+    OldTD = *OldTDPtr;
+
+    if (!isAliasingLegal(td, OldTD))
+      reportError(addr, size, td, OldTD, AccessStr,
+                  "accesses part of an existing object", -i, pc, bp, sp);
+
+    return;
+  }
+
+  if (!isAliasingLegal(td, OldTD)) {
+    reportError(addr, size, td, OldTD, AccessStr, "accesses an existing object",
+                0, pc, bp, sp);
+    return;
+  }
+
+  // These types are allowed to alias (or the stored type is unknown), report
+  // an error if we find an interior type.
+
+  for (int i = 0; i < size; ++i) {
+    OldTDPtr = shadow_for((void *)(((uptr)addr) + i));
+    OldTD = *OldTDPtr;
+    if (((sptr)OldTD) >= 0 && !isAliasingLegal(td, OldTD))
+      reportError(addr, size, td, OldTD, AccessStr,
+                  "partially accesses an object", i, pc, bp, sp);
+  }
+}
+
+Flags __tysan::flags_data;
+
+SANITIZER_INTERFACE_ATTRIBUTE uptr __tysan_shadow_memory_address;
+SANITIZER_INTERFACE_ATTRIBUTE uptr __tysan_app_memory_mask;
+
+#ifdef TYSAN_RUNTIME_VMA
+// Runtime detected VMA size.
+int __tysan::vmaSize;
+#endif
+
+void Flags::SetDefaults() {
+#define TYSAN_FLAG(Type, Name, DefaultValue, Description) Name = DefaultValue;
+#include "tysan_flags.inc"
+#undef TYSAN_FLAG
+}
+
+static void RegisterTySanFlags(FlagParser *parser, Flags *f) {
+#define TYSAN_FLAG(Type, Name, DefaultValue, Description)                      \
+  RegisterFlag(parser, #Name, Description, &f->Name);
+#include "tysan_flags.inc"
+#undef TYSAN_FLAG
+}
+
+static void InitializeFlags() {
+  SetCommonFlagsDefaults();
+  {
+    CommonFlags cf;
+    cf.CopyFrom(*common_flags());
+    cf.external_symbolizer_path = GetEnv("TYSAN_SYMBOLIZER_PATH");
+    OverrideCommonFlags(cf);
+  }
+
+  flags().SetDefaults();
+
+  FlagParser parser;
+  RegisterCommonFlags(&parser);
+  RegisterTySanFlags(&parser, &flags());
+  parser.ParseString(GetEnv("TYSAN_OPTIONS"));
+  InitializeCommonFlags();
+  if (Verbosity())
+    ReportUnrecognizedFlags();
+  if (common_flags()->help)
+    parser.PrintFlagDescriptions();
+}
+
+static void TySanInitializePlatformEarly() {
+  AvoidCVE_2016_2143();
+#ifdef TYSAN_RUNTIME_VMA
+  vmaSize = (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1);
+#if defined(__aarch64__) && !SANITIZER_APPLE
+  if (vmaSize != 39 && vmaSize != 42 && vmaSize != 48) {
+    Printf("FATAL: TypeSanitizer: unsupported VMA range\n");
+    Printf("FATAL: Found %d - Supported 39, 42 and 48\n", vmaSize);
+    Die();
+  }
+#endif
+#endif
+
+  __sanitizer::InitializePlatformEarly();
+
+  __tysan_shadow_memory_address = ShadowAddr();
+  __tysan_app_memory_mask = AppMask();
+}
+
+namespace __tysan {
+bool tysan_inited = false;
+bool tysan_init_is_running;
+} // namespace __tysan
+
+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __tysan_init() {
+  CHECK(!tysan_init_is_running);
+  if (tysan_inited)
+    return;
+  tysan_init_is_running = true;
+
+  InitializeFlags();
+  TySanInitializePlatformEarly();
+
+  InitializeInterceptors();
+
+  if (!MmapFixedNoReserve(ShadowAddr(), AppAddr() - ShadowAddr()))
+    Die();
+
+  tysan_init_is_running = false;
+  tysan_inited = true;
+}
+
+#if SANITIZER_CAN_USE_PREINIT_ARRAY
+__attribute__((section(".preinit_array"),
+               used)) static void (*tysan_init_ptr)() = __tysan_init;
+#endif
diff --git a/compiler-rt/lib/tysan/tysan.h b/compiler-rt/lib/tysan/tysan.h
new file mode 100644
index 00000000000000..ec6f9587e9ce58
--- /dev/null
+++ b/compiler-rt/lib/tysan/tysan.h
@@ -0,0 +1,79 @@
+//===-- tysan.h -------------------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file is a part of TypeSanitizer.
+//
+// Private TySan header.
+//===----------------------------------------------------------------------===//
+
+#ifndef TYSAN_H
+#define TYSAN_H
+
+#include "sanitizer_common/sanitizer_internal_defs.h"
+
+using __sanitizer::sptr;
+using __sanitizer::u16;
+using __sanitizer::uptr;
+
+#include "tysan_platform.h"
+
+extern "C" {
+void tysan_set_type_unknown(const void *addr, uptr size);
+void tysan_copy_types(const void *daddr, const void *saddr, uptr size);
+}
+
+namespace __tysan {
+extern bool tysan_inited;
+extern bool tysan_init_is_running;
+
+void InitializeInterceptors();
+
+enum { TYSAN_MEMBER_TD = 1, TYSAN_STRUCT_TD = 2 };
+
+struct tysan_member_type_descriptor {
+  struct tysan_type_descriptor *Base;
+  struct tysan_type_descriptor *Access;
+  uptr Offset;
+};
+
+struct tysan_struct_type_descriptor {
+  uptr MemberCount;
+  struct {
+    struct tysan_type_descriptor *Type;
+    uptr Offset;
+  } Members[1]; // Tail allocated.
+  // char Name[]; // Tail allocated.
+};
+
+struct tysan_type_descriptor {
+  uptr Tag;
+  union {
+    tysan_member_type_descriptor Member;
+    tysan_struct_type_descriptor Struct;
+  };
+};
+
+inline tysan_type_descriptor **shadow_for(const void *ptr) {
+  return (tysan_type_descriptor **)((((uptr)ptr) & AppMask()) * sizeof(ptr) +
+                                    ShadowAddr());
+}
+
+struct Flags {
+#define TYSAN_FLAG(Type, Name, Defaul...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/76261

