[compiler-rt] [asan] Speed up ASan ODR indicator-based checking (PR #100923)

via llvm-commits llvm-commits at lists.llvm.org
Sun Jul 28 03:11:16 PDT 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Artem Pianykh (artempyanykh)

<details>
<summary>Changes</summary>

**Summary**:
When ASan checks for ODR violation on a global it loops over a linked list of all globals to find those with the matching value of an indicator. For larger binaries with a ton of globals this gets extremely expensive: O(N^2) iterations + potentially poor memory locality from the linked list.

This patch adds an indicator indexed (multi-)map of globals to speed up the search. With a bit of extra memory this massively speeds up indicator based ODR checks.

Poisoning based ODR checking remains on the slow path.

Internally we see many examples where ODR checking takes *seconds* (even double digits). With this
patch, the same binaries where ODR checking used to take seconds, now it's practically free and
`__asan_register_globals` doesn't show up prominently in the perf profile anymore.

*Notes and considerations*:
1. The map maintains a list of globals with the same indicator and it's not clear if we need the whole
    list or just the first instance of a global. With the default 'halt_on_error=1' just the first one
    would be enough, but with 'halt_on_error=0' we may need all (?)
2. For poisoning based ODR check we still loop over all globals. We could do a similar map-based
    lookup in a different map here, but given that it's an older impl of ODR checking it's not clear how
    useful/important it'd be to speed it up. We build everything with `-fsanitize-address-use-odr-indicator`, 
    so not terribly important to us, but perhaps a good follow up.
3. GetGlobalsForAddress needs to loop over all globals. This functionality flows into
    __asan_decribe_address which is part of asan_interface. Perhaps switching from a linked list to a
    vector to store all globals could be a good middle-ground to speed up p.2 and p.3 iterations.
4. Globals with an ODR indicator of `-1` need to be skipped for the purposes of ODR
    checking (cf. https://github.com/llvm/llvm-project/commit/a257639a6935a2c63377784c5c9c3b73864a2582). But     they are still getting added to the list of globals.
        a. We often see that the majority of inserted globals have an ODR indicator of `-1` 
            and hence take up space and slow down the iteration over the list of globals. 
            If we could skip them it would be a good saving, but given p.3 above it's not clear if we can. 
        b. In the context of this patch, it means that the effect of introducing an indicator based map
            is two-fold: firstly, we're now dealing with only a smaller number of globals that *should* be
            checked for ODR, and secondly we can do a lookup in constant time rather than having to do a linear scan.

**Test Plan**:

* `cmake --build build --target check-asan` looks good
* Perf-wise things look good when linking against this version of compiler-rt.

---
Full diff: https://github.com/llvm/llvm-project/pull/100923.diff


1 Files Affected:

- (modified) compiler-rt/lib/asan/asan_globals.cpp (+34-9) 


``````````diff
diff --git a/compiler-rt/lib/asan/asan_globals.cpp b/compiler-rt/lib/asan/asan_globals.cpp
index d413b1ebc9fc0..26518c8ce4b98 100644
--- a/compiler-rt/lib/asan/asan_globals.cpp
+++ b/compiler-rt/lib/asan/asan_globals.cpp
@@ -21,6 +21,7 @@
 #include "asan_suppressions.h"
 #include "asan_thread.h"
 #include "sanitizer_common/sanitizer_common.h"
+#include "sanitizer_common/sanitizer_dense_map.h"
 #include "sanitizer_common/sanitizer_mutex.h"
 #include "sanitizer_common/sanitizer_placement_new.h"
 #include "sanitizer_common/sanitizer_stackdepot.h"
@@ -35,8 +36,11 @@ struct ListOfGlobals {
   ListOfGlobals *next;
 };
 
+typedef DenseMap<uptr, ListOfGlobals *> MapOfGlobals;
+
 static Mutex mu_for_globals;
 static ListOfGlobals *list_of_all_globals;
+static MapOfGlobals map_of_globals_by_indicator;
 
 static const int kDynamicInitGlobalsInitialCapacity = 512;
 struct DynInitGlobal {
@@ -73,6 +77,18 @@ ALWAYS_INLINE void PoisonRedZones(const Global &g) {
 
 const uptr kMinimalDistanceFromAnotherGlobal = 64;
 
+static void AddGlobalToList(ListOfGlobals *&list, const Global *g) {
+  ListOfGlobals *l = new (GetGlobalLowLevelAllocator()) ListOfGlobals;
+  l->g = g;
+  l->next = list;
+  list = l;
+}
+
+static void AddGlobalToMap(MapOfGlobals &map, const Global *g) {
+  ListOfGlobals *&in_map = map[g->odr_indicator];
+  AddGlobalToList(in_map, g);
+}
+
 static bool IsAddressNearGlobal(uptr addr, const __asan_global &g) {
   if (addr <= g.beg - kMinimalDistanceFromAnotherGlobal) return false;
   if (addr >= g.beg + g.size_with_redzone) return false;
@@ -147,14 +163,20 @@ static void CheckODRViolationViaIndicator(const Global *g) {
     *odr_indicator = REGISTERED;
     return;
   }
+  // Fetch globals with the same ODR indicator
+  auto *relevant_globals_lookup =
+      map_of_globals_by_indicator.find(g->odr_indicator);
+  if (!relevant_globals_lookup) {
+    return;
+  }
+  ListOfGlobals *relevant_globals = relevant_globals_lookup->second;
   // If *odr_indicator is DEFINED, some module have already registered
   // externally visible symbol with the same name. This is an ODR violation.
-  for (ListOfGlobals *l = list_of_all_globals; l; l = l->next) {
-    if (g->odr_indicator == l->g->odr_indicator &&
-        (flags()->detect_odr_violation >= 2 || g->size != l->g->size) &&
+  for (ListOfGlobals *l = relevant_globals; l; l = l->next) {
+    if ((flags()->detect_odr_violation >= 2 || g->size != l->g->size) &&
         !IsODRViolationSuppressed(g->name))
-      ReportODRViolation(g, FindRegistrationSite(g),
-                         l->g, FindRegistrationSite(l->g));
+      ReportODRViolation(g, FindRegistrationSite(g), l->g,
+                         FindRegistrationSite(l->g));
   }
 }
 
@@ -225,10 +247,13 @@ static void RegisterGlobal(const Global *g) {
   }
   if (CanPoisonMemory())
     PoisonRedZones(*g);
-  ListOfGlobals *l = new (GetGlobalLowLevelAllocator()) ListOfGlobals;
-  l->g = g;
-  l->next = list_of_all_globals;
-  list_of_all_globals = l;
+
+  AddGlobalToList(list_of_all_globals, g);
+
+  if (UseODRIndicator(g) && g->odr_indicator != UINTPTR_MAX) {
+    AddGlobalToMap(map_of_globals_by_indicator, g);
+  }
+
   if (g->has_dynamic_init) {
     if (!dynamic_init_globals) {
       dynamic_init_globals = new (GetGlobalLowLevelAllocator()) VectorOfGlobals;

``````````

</details>


https://github.com/llvm/llvm-project/pull/100923


More information about the llvm-commits mailing list