[compiler-rt] c584c42 - [asan] Speed up ASan ODR indicator-based checking (#100923)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 1 13:58:19 PDT 2024
Author: Artem Pianykh
Date: 2024-08-01T13:58:14-07:00
New Revision: c584c4227105c2aae332ee2a2099c9df6c3a3e1c
URL: https://github.com/llvm/llvm-project/commit/c584c4227105c2aae332ee2a2099c9df6c3a3e1c
DIFF: https://github.com/llvm/llvm-project/commit/c584c4227105c2aae332ee2a2099c9df6c3a3e1c.diff
LOG: [asan] Speed up ASan ODR indicator-based checking (#100923)
**Summary**:
When ASan checks for a potential ODR violation on a global it loops over
a linked list of all globals to find those with the matching value of an
indicator. With the default setting 'detect_odr_violation=1', ASan
doesn't report violations on same-size globals but it still has to
traverse the list. For larger binaries with a ton of shared libs and
globals (and a non-trivial volume of same-sized duplicates) this gets
extremely expensive.
This patch adds an indicator indexed (multi-)map of globals to speed up
the search.
> Note: asan used to use a map to store globals a while ago which was
replaced with a list when the codebase [moved off of
STL](https://github.com/llvm/llvm-project/commit/e4bada2c946e5399fc37bd67421de01c0047ad38).
Internally we see many examples where ODR checking takes *seconds* (even
double digits). With this patch it's practically free and
`__asan_register_globals` doesn't show up prominently in the perf
profile anymore.
There are several high-level questions:
1. I understand that the intent is that we hit the slow path rarely,
ideally once before the process dies with an error. But in practice we
hit the slow path a lot. It feels reasonable to keep the amount of work
bounded even in the worst case, even if it requires a bit of extra
memory. But if not, it'd be great to learn about the tradeoffs.
2. Poisoning based ODR checking remains on the slow path. Internally we
build everything with `-fsanitize-address-use-odr-indicator` so I'm not
sure if poisoning-based check would exhibit the same behavior (looking
at the code, the shape looks very similar, so it might?).
3. Globals with an ODR indicator of `-1` need to be skipped for the
purposes of ODR checking (cf.
https://github.com/llvm/llvm-project/commit/a257639a6935a2c63377784c5c9c3b73864a2582).
But they are still getting added to the list of globals and hence take
up space and slow down the iteration over the list of globals. It would
be a good saving if we could avoid adding them to the globals list.
4. Any reason to use a linked list instead of e.g. a vector to store
globals?
**Test Plan**:
* `cmake --build build --target check-asan` looks good
* Perf-wise things look good when linking against this version of
compiler-rt.
---------
Co-authored-by: Vitaly Buka <vitalybuka at google.com>
Added:
compiler-rt/test/asan/TestCases/Linux/odr_indicator_unregister.cpp
Modified:
compiler-rt/lib/asan/asan_globals.cpp
Removed:
################################################################################
diff --git a/compiler-rt/lib/asan/asan_globals.cpp b/compiler-rt/lib/asan/asan_globals.cpp
index ef886e3d484ed..0445a1d44f682 100644
--- a/compiler-rt/lib/asan/asan_globals.cpp
+++ b/compiler-rt/lib/asan/asan_globals.cpp
@@ -21,6 +21,7 @@
#include "asan_suppressions.h"
#include "asan_thread.h"
#include "sanitizer_common/sanitizer_common.h"
+#include "sanitizer_common/sanitizer_dense_map.h"
#include "sanitizer_common/sanitizer_list.h"
#include "sanitizer_common/sanitizer_mutex.h"
#include "sanitizer_common/sanitizer_placement_new.h"
@@ -36,9 +37,11 @@ struct GlobalListNode {
GlobalListNode *next = nullptr;
};
typedef IntrusiveList<GlobalListNode> ListOfGlobals;
+typedef DenseMap<uptr, ListOfGlobals> MapOfGlobals;
static Mutex mu_for_globals;
static ListOfGlobals list_of_all_globals;
+static MapOfGlobals map_of_globals_by_indicator;
static const int kDynamicInitGlobalsInitialCapacity = 512;
struct DynInitGlobal {
@@ -148,21 +151,25 @@ static void CheckODRViolationViaIndicator(const Global *g) {
// Instrumentation requests to skip ODR check.
if (g->odr_indicator == UINTPTR_MAX)
return;
+
+ ListOfGlobals &relevant_globals =
+ map_of_globals_by_indicator[g->odr_indicator];
+
u8 *odr_indicator = reinterpret_cast<u8 *>(g->odr_indicator);
- if (*odr_indicator == UNREGISTERED) {
- *odr_indicator = REGISTERED;
- return;
- }
- // If *odr_indicator is DEFINED, some module have already registered
- // externally visible symbol with the same name. This is an ODR violation.
- for (const auto &l : list_of_all_globals) {
- if (g->odr_indicator == l.g->odr_indicator &&
- (flags()->detect_odr_violation >= 2 || g->size != l.g->size) &&
- !IsODRViolationSuppressed(g->name)) {
- ReportODRViolation(g, FindRegistrationSite(g), l.g,
- FindRegistrationSite(l.g));
+ if (*odr_indicator == REGISTERED) {
+ // If *odr_indicator is REGISTERED, some module have already registered
+ // externally visible symbol with the same name. This is an ODR violation.
+ for (const auto &l : relevant_globals) {
+ if ((flags()->detect_odr_violation >= 2 || g->size != l.g->size) &&
+ !IsODRViolationSuppressed(g->name))
+ ReportODRViolation(g, FindRegistrationSite(g), l.g,
+ FindRegistrationSite(l.g));
}
+ } else { // UNREGISTERED
+ *odr_indicator = REGISTERED;
}
+
+ AddGlobalToList(relevant_globals, g);
}
// Check ODR violation for given global G by checking if it's already poisoned.
diff --git a/compiler-rt/test/asan/TestCases/Linux/odr_indicator_unregister.cpp b/compiler-rt/test/asan/TestCases/Linux/odr_indicator_unregister.cpp
new file mode 100644
index 0000000000000..091ecfae5e5ca
--- /dev/null
+++ b/compiler-rt/test/asan/TestCases/Linux/odr_indicator_unregister.cpp
@@ -0,0 +1,76 @@
+// Test that captures the current behavior of indicator-based ASan ODR checker
+// when globals get unregistered.
+//
+// RUN: %clangxx_asan -g -O0 -DSHARED_LIB -DSIZE=1 %s -fPIC -shared -o %t-so-1.so
+// RUN: %clangxx_asan -g -O0 -DSHARED_LIB -DSIZE=2 %s -fPIC -shared -o %t-so-2.so
+// RUN: %clangxx_asan -g -O0 %s %libdl -Wl,--export-dynamic -o %t
+// RUN: %env_asan_opts=report_globals=2:detect_odr_violation=1 %run %t 2>&1 | FileCheck %s
+
+#include <cstdlib>
+#include <dlfcn.h>
+#include <stdio.h>
+#include <string>
+
+#ifdef SHARED_LIB
+namespace foo {
+char G[SIZE];
+}
+#else // SHARED_LIB
+void *dlopen_or_die(std::string &path) {
+ void *handle = dlopen(path.c_str(), RTLD_NOW);
+ if (handle) {
+ printf("Successfully called dlopen() on %s\n", path.c_str());
+ } else {
+ printf("Error in dlopen(): %s\n", dlerror());
+ std::exit(1);
+ }
+
+ return handle;
+}
+
+void dlclose_or_die(void *handle, std::string &path) {
+ if (!dlclose(handle)) {
+ printf("Successfully called dlclose() on %s\n", path.c_str());
+ } else {
+ printf("Error in dlclose(): %s\n", dlerror());
+ std::exit(1);
+ }
+}
+
+namespace foo {
+char G[1];
+}
+
+// main has its own version of foo::G
+// CHECK: Added Global[[MAIN_G:[^\s]+]] size=1/32 name=foo::G {{.*}}
+int main(int argc, char *argv[]) {
+ std::string base_path = std::string(argv[0]);
+
+ std::string path1 = base_path + "-so-1.so";
+ // dlopen() brings another foo::G but it matches MAIN_G in size so it's not a
+ // violation
+ //
+ //
+ // CHECK: Added Global[[SO1_G:[^\s]+]] size=1/32 name=foo::G {{.*}}
+ // CHECK-NOT: ERROR: AddressSanitizer: odr-violation
+ void *handle1 = dlopen_or_die(path1);
+ // CHECK: Removed Global[[SO1_G]] size=1/32 name=foo::G {{.*}}
+ dlclose_or_die(handle1, path1);
+
+ // At this point the indicator for foo::G is switched to UNREGISTERED for
+ // **both** MAIN_G and SO1_G because the indicator value is shared.
+
+ std::string path2 = base_path + "-so-2.so";
+ // CHECK: Added Global[[SO2_G:[^\s]+]] size=2/32 name=foo::G {{.*}}
+ //
+ // This brings another foo::G but now
diff erent in size from MAIN_G. We
+ // should've reported a violation, but we actually don't because of what's
+ // described on line60
+ //
+ // CHECK-NOT: ERROR: AddressSanitizer: odr-violation
+ void *handle2 = dlopen_or_die(path2);
+ // CHECK: Removed Global[[MAIN_G]] size=1/32 name=foo::G {{.*}}
+ // CHECK: Removed Global[[SO2_G]] size=2/32 name=foo::G {{.*}}
+}
+
+#endif // SHARED_LIB
More information about the llvm-commits
mailing list