[compiler-rt] c584c42 - [asan] Speed up ASan ODR indicator-based checking (#100923)

via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 1 13:58:19 PDT 2024


Author: Artem Pianykh
Date: 2024-08-01T13:58:14-07:00
New Revision: c584c4227105c2aae332ee2a2099c9df6c3a3e1c

URL: https://github.com/llvm/llvm-project/commit/c584c4227105c2aae332ee2a2099c9df6c3a3e1c
DIFF: https://github.com/llvm/llvm-project/commit/c584c4227105c2aae332ee2a2099c9df6c3a3e1c.diff

LOG: [asan] Speed up ASan ODR indicator-based checking (#100923)

**Summary**:
When ASan checks for a potential ODR violation on a global it loops over
a linked list of all globals to find those with the matching value of an
indicator. With the default setting 'detect_odr_violation=1', ASan
doesn't report violations on same-size globals but it still has to
traverse the list. For larger binaries with a ton of shared libs and
globals (and a non-trivial volume of same-sized duplicates) this gets
extremely expensive.

This patch adds an indicator indexed (multi-)map of globals to speed up
the search.

> Note: asan used to use a map to store globals a while ago which was
replaced with a list when the codebase [moved off of
STL](https://github.com/llvm/llvm-project/commit/e4bada2c946e5399fc37bd67421de01c0047ad38).

Internally we see many examples where ODR checking takes *seconds* (even
double digits). With this patch it's practically free and
`__asan_register_globals` doesn't show up prominently in the perf
profile anymore.

There are several high-level questions:
1. I understand that the intent is that we hit the slow path rarely,
ideally once before the process dies with an error. But in practice we
hit the slow path a lot. It feels reasonable to keep the amount of work
bounded even in the worst case, even if it requires a bit of extra
memory. But if not, it'd be great to learn about the tradeoffs.
2. Poisoning based ODR checking remains on the slow path. Internally we
build everything with `-fsanitize-address-use-odr-indicator` so I'm not
sure if poisoning-based check would exhibit the same behavior (looking
at the code, the shape looks very similar, so it might?).
3. Globals with an ODR indicator of `-1` need to be skipped for the
purposes of ODR checking (cf.
https://github.com/llvm/llvm-project/commit/a257639a6935a2c63377784c5c9c3b73864a2582).
But they are still getting added to the list of globals and hence take
up space and slow down the iteration over the list of globals. It would
be a good saving if we could avoid adding them to the globals list.
4. Any reason to use a linked list instead of e.g. a vector to store
globals?

**Test Plan**:

* `cmake --build build --target check-asan` looks good
* Perf-wise things look good when linking against this version of
compiler-rt.

---------

Co-authored-by: Vitaly Buka <vitalybuka at google.com>

Added: 
    compiler-rt/test/asan/TestCases/Linux/odr_indicator_unregister.cpp

Modified: 
    compiler-rt/lib/asan/asan_globals.cpp

Removed: 
    


################################################################################
diff  --git a/compiler-rt/lib/asan/asan_globals.cpp b/compiler-rt/lib/asan/asan_globals.cpp
index ef886e3d484ed..0445a1d44f682 100644
--- a/compiler-rt/lib/asan/asan_globals.cpp
+++ b/compiler-rt/lib/asan/asan_globals.cpp
@@ -21,6 +21,7 @@
 #include "asan_suppressions.h"
 #include "asan_thread.h"
 #include "sanitizer_common/sanitizer_common.h"
+#include "sanitizer_common/sanitizer_dense_map.h"
 #include "sanitizer_common/sanitizer_list.h"
 #include "sanitizer_common/sanitizer_mutex.h"
 #include "sanitizer_common/sanitizer_placement_new.h"
@@ -36,9 +37,11 @@ struct GlobalListNode {
   GlobalListNode *next = nullptr;
 };
 typedef IntrusiveList<GlobalListNode> ListOfGlobals;
+typedef DenseMap<uptr, ListOfGlobals> MapOfGlobals;
 
 static Mutex mu_for_globals;
 static ListOfGlobals list_of_all_globals;
+static MapOfGlobals map_of_globals_by_indicator;
 
 static const int kDynamicInitGlobalsInitialCapacity = 512;
 struct DynInitGlobal {
@@ -148,21 +151,25 @@ static void CheckODRViolationViaIndicator(const Global *g) {
   // Instrumentation requests to skip ODR check.
   if (g->odr_indicator == UINTPTR_MAX)
     return;
+
+  ListOfGlobals &relevant_globals =
+      map_of_globals_by_indicator[g->odr_indicator];
+
   u8 *odr_indicator = reinterpret_cast<u8 *>(g->odr_indicator);
-  if (*odr_indicator == UNREGISTERED) {
-    *odr_indicator = REGISTERED;
-    return;
-  }
-  // If *odr_indicator is DEFINED, some module have already registered
-  // externally visible symbol with the same name. This is an ODR violation.
-  for (const auto &l : list_of_all_globals) {
-    if (g->odr_indicator == l.g->odr_indicator &&
-        (flags()->detect_odr_violation >= 2 || g->size != l.g->size) &&
-        !IsODRViolationSuppressed(g->name)) {
-      ReportODRViolation(g, FindRegistrationSite(g), l.g,
-                         FindRegistrationSite(l.g));
+  if (*odr_indicator == REGISTERED) {
+    // If *odr_indicator is REGISTERED, some module have already registered
+    // externally visible symbol with the same name. This is an ODR violation.
+    for (const auto &l : relevant_globals) {
+      if ((flags()->detect_odr_violation >= 2 || g->size != l.g->size) &&
+          !IsODRViolationSuppressed(g->name))
+        ReportODRViolation(g, FindRegistrationSite(g), l.g,
+                           FindRegistrationSite(l.g));
     }
+  } else {  // UNREGISTERED
+    *odr_indicator = REGISTERED;
   }
+
+  AddGlobalToList(relevant_globals, g);
 }
 
 // Check ODR violation for given global G by checking if it's already poisoned.

diff  --git a/compiler-rt/test/asan/TestCases/Linux/odr_indicator_unregister.cpp b/compiler-rt/test/asan/TestCases/Linux/odr_indicator_unregister.cpp
new file mode 100644
index 0000000000000..091ecfae5e5ca
--- /dev/null
+++ b/compiler-rt/test/asan/TestCases/Linux/odr_indicator_unregister.cpp
@@ -0,0 +1,76 @@
+// Test that captures the current behavior of indicator-based ASan ODR checker
+// when globals get unregistered.
+//
+// RUN: %clangxx_asan -g -O0 -DSHARED_LIB -DSIZE=1 %s -fPIC -shared -o %t-so-1.so
+// RUN: %clangxx_asan -g -O0 -DSHARED_LIB -DSIZE=2 %s -fPIC -shared -o %t-so-2.so
+// RUN: %clangxx_asan -g -O0 %s %libdl -Wl,--export-dynamic -o %t
+// RUN: %env_asan_opts=report_globals=2:detect_odr_violation=1 %run %t 2>&1 | FileCheck %s
+
+#include <cstdlib>
+#include <dlfcn.h>
+#include <stdio.h>
+#include <string>
+
+#ifdef SHARED_LIB
+namespace foo {
+char G[SIZE];
+}
+#else // SHARED_LIB
+void *dlopen_or_die(std::string &path) {
+  void *handle = dlopen(path.c_str(), RTLD_NOW);
+  if (handle) {
+    printf("Successfully called dlopen() on %s\n", path.c_str());
+  } else {
+    printf("Error in dlopen(): %s\n", dlerror());
+    std::exit(1);
+  }
+
+  return handle;
+}
+
+void dlclose_or_die(void *handle, std::string &path) {
+  if (!dlclose(handle)) {
+    printf("Successfully called dlclose() on %s\n", path.c_str());
+  } else {
+    printf("Error in dlclose(): %s\n", dlerror());
+    std::exit(1);
+  }
+}
+
+namespace foo {
+char G[1];
+}
+
+// main has its own version of foo::G
+// CHECK: Added Global[[MAIN_G:[^\s]+]] size=1/32 name=foo::G {{.*}}
+int main(int argc, char *argv[]) {
+  std::string base_path = std::string(argv[0]);
+
+  std::string path1 = base_path + "-so-1.so";
+  // dlopen() brings another foo::G but it matches MAIN_G in size so it's not a
+  // violation
+  //
+  //
+  // CHECK: Added Global[[SO1_G:[^\s]+]] size=1/32 name=foo::G {{.*}}
+  // CHECK-NOT: ERROR: AddressSanitizer: odr-violation
+  void *handle1 = dlopen_or_die(path1);
+  // CHECK: Removed Global[[SO1_G]] size=1/32 name=foo::G {{.*}}
+  dlclose_or_die(handle1, path1);
+
+  // At this point the indicator for foo::G is switched to UNREGISTERED for
+  // **both** MAIN_G and SO1_G because the indicator value is shared.
+
+  std::string path2 = base_path + "-so-2.so";
+  // CHECK: Added Global[[SO2_G:[^\s]+]] size=2/32 name=foo::G {{.*}}
+  //
+  // This brings another foo::G but now 
diff erent in size from MAIN_G. We
+  // should've reported a violation, but we actually don't because of what's
+  // described on line60
+  //
+  // CHECK-NOT: ERROR: AddressSanitizer: odr-violation
+  void *handle2 = dlopen_or_die(path2);
+  // CHECK: Removed Global[[MAIN_G]] size=1/32 name=foo::G {{.*}}
+  // CHECK: Removed Global[[SO2_G]] size=2/32 name=foo::G {{.*}}
+}
+
+#endif // SHARED_LIB


        


More information about the llvm-commits mailing list