[PATCH] D95734: Use alias analysis to remove redundant instrumentation for Asan

Chijin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Jan 30 04:47:23 PST 2021


ChijinZ created this revision.
ChijinZ added a reviewer: kcc.
ChijinZ added a project: Sanitizers.
Herald added a subscriber: hiraditya.
ChijinZ requested review of this revision.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

Motivation
----------

ASan sometimes checks the same pointer multiple times. For example:

    for (int i = 0; i < MAXSIZE; i++) {
      a[i] = i; // check
      for (int j = 0; j < MAXSIZE; j++) {
        a[i] += sin(j); // check
      }
      printf("%d", a[i]); // check
    }

ASan inserts instrumentation before every *load* and *store*. In this case, a[i] is checked at each of the three marked accesses, which is redundant.

Approach
--------

The approach of this commit is simple: maintain a per-function list of already-instrumented pointers together with their access sizes. Before inserting an ASan check for a pointer, we query alias analysis against each previously instrumented pointer. If the current pointer *must alias* one of them, we skip the check; otherwise we instrument it and record it in the list.
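
The bookkeeping described above can be sketched outside of LLVM as follows. This is only an illustration, not the code in the patch: `PointerId`, `toyAlias`, and `CheckFilter` are hypothetical names, and `toyAlias` is a deliberately trivial stand-in for the real alias analysis query (it reports must-alias only for the same id with the same access size).

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for llvm::Value*: a pointer is just an opaque id here.
using PointerId = int;

enum class AliasKind { NoAlias, MayAlias, MustAlias };

// Toy oracle standing in for an alias analysis query. It is intentionally
// conservative: MustAlias only for the same id and the same access size.
AliasKind toyAlias(PointerId a, std::uint64_t sizeA,
                   PointerId b, std::uint64_t sizeB) {
  if (a == b && sizeA == sizeB)
    return AliasKind::MustAlias;
  return AliasKind::MayAlias;
}

// Tracks which (pointer, size) pairs already received a check.
class CheckFilter {
public:
  // Returns true if a check should be emitted for this access, false if it
  // must-aliases an access that was already instrumented.
  bool shouldInstrument(PointerId addr, std::uint64_t size) {
    assert(size > 0 && "access size must be non-zero");
    for (const auto &seen : instrumented_) {
      if (toyAlias(seen.first, seen.second, addr, size) ==
          AliasKind::MustAlias) {
        ++skipped_;
        return false;
      }
    }
    // Not seen before: remember it and let the caller instrument it.
    instrumented_.emplace_back(addr, size);
    return true;
  }

  // Called at the start of each function, mirroring the per-function reset.
  void reset() {
    instrumented_.clear();
    skipped_ = 0;
  }

  std::uint64_t skipped() const { return skipped_; }

private:
  std::vector<std::pair<PointerId, std::uint64_t>> instrumented_;
  std::uint64_t skipped_ = 0;
};
```

Note that the lookup is a linear scan over all recorded pointers, so the cost per function grows quadratically with the number of instrumented accesses in the worst case.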

Correctness
-----------

The correctness argument is simple: if a pointer *must alias* a previously instrumented pointer, then its memory safety has already been checked, so we do not need to check it again. As long as the alias analysis provided by LLVM is correct, this optimization will not introduce false negatives. In addition, I used the Juliet Test Cases for C/C++ <https://samate.nist.gov/SRD/testsuite.php> to see whether it introduces additional false negatives. The results show that the optimized version and the original ASan perform identically on memory-related CWEs (CWE-121, 122, 124, 126, 127, 415, 416).

Evaluation
----------

The optimization removes ~30% of instrumentation points in real-world settings. I chose 6 real-world projects from the Google Fuzzer Test Suite <https://github.com/google/fuzzer-test-suite>. The number of ASan checks for each project is shown below:

Project     |   ASan    |   ASan-opt    |   Removed         |
sqlite      |   53563   |   35154       |   18409 (34.4%)   |
libjpeg     |   37166   |   29481       |   7685 (20.7%)    |
guetzli     |   11993   |   8262        |   3731 (31.1%)    |
lcms        |   10786   |   8176        |   2610 (24.2%)    |
libpng      |   13186   |   9349        |   3837 (29.1%)    |
freetype2   |   28958   |   18131       |   10827 (37.4%)   |
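
As a sanity check on the ~30% figure, the removed share can be recomputed from the check counts in the table. The sketch below is illustrative only: the struct and function names are hypothetical, and the numbers are copied from the Asan and Asan-opt columns above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// One table row: check counts without and with the optimization.
struct ProjectCounts {
  std::string name;
  std::uint64_t asan;
  std::uint64_t asanOpt;
};

// Counts as reported in the evaluation table.
const std::vector<ProjectCounts> &reportedCounts() {
  static const std::vector<ProjectCounts> rows = {
      {"sqlite", 53563, 35154},  {"libjpeg", 37166, 29481},
      {"guetzli", 11993, 8262},  {"lcms", 10786, 8176},
      {"libpng", 13186, 9349},   {"freetype2", 28958, 18131},
  };
  return rows;
}

// Number of checks removed for one project.
std::uint64_t removedChecks(const ProjectCounts &p) {
  return p.asan - p.asanOpt;
}

// Removed checks as a percentage of the original count.
double removedPercent(std::uint64_t asan, std::uint64_t asanOpt) {
  return 100.0 * static_cast<double>(asan - asanOpt) /
         static_cast<double>(asan);
}

// Aggregate over all projects, weighting each by its number of checks.
double aggregateRemovedPercent(const std::vector<ProjectCounts> &rows) {
  std::uint64_t total = 0, totalOpt = 0;
  for (const auto &p : rows) {
    total += p.asan;
    totalOpt += p.asanOpt;
  }
  return removedPercent(total, totalOpt);
}
```

Summed over the six projects this gives 47099 removed checks out of 155652, or about 30.3%, which is consistent with the ~30% stated above.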


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D95734

Files:
  llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp


Index: llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
===================================================================
--- llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
+++ llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
@@ -28,6 +28,7 @@
 #include "llvm/Analysis/MemoryBuiltins.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/ValueTracking.h"
+#include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/BinaryFormat/MachO.h"
 #include "llvm/IR/Argument.h"
 #include "llvm/IR/Attributes.h"
@@ -399,6 +400,8 @@
 
 static cl::opt<int> ClDebugMax("asan-debug-max", cl::desc("Debug max inst"),
                                cl::Hidden, cl::init(-1));
+static cl::opt<bool> AliasOpt("alias-opt-on", cl::desc("alias optimization"),
+                              cl::Hidden, cl::init(false));
 
 STATISTIC(NumInstrumentedReads, "Number of instrumented reads");
 STATISTIC(NumInstrumentedWrites, "Number of instrumented writes");
@@ -409,6 +412,12 @@
 
 namespace {
 
+AAResults *GlobalAA = nullptr;
+// record instrumented pointers
+std::vector<std::pair<Value *, uint64_t>> InstrumentedPointers;
+// record the number of removed redundant instrumentation
+uint64_t AliasCounter = 0;
+
 /// This struct defines the shadow mapping using the rule:
 ///   shadow = (mem >> Scale) ADD-or-OR Offset.
 /// If InGlobal is true, then
@@ -719,9 +728,16 @@
   void getAnalysisUsage(AnalysisUsage &AU) const override {
     AU.addRequired<ASanGlobalsMetadataWrapperPass>();
     AU.addRequired<TargetLibraryInfoWrapperPass>();
+    AU.addRequired<AAResultsWrapperPass>();
   }
 
   bool runOnFunction(Function &F) override {
+    if (AliasOpt) {
+      // initialize variables
+      GlobalAA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
+      InstrumentedPointers.clear();
+      AliasCounter = 0;
+    }
     GlobalsMetadata &GlobalsMD =
         getAnalysis<ASanGlobalsMetadataWrapperPass>().getGlobalsMD();
     const TargetLibraryInfo *TLI =
@@ -1170,6 +1186,12 @@
 
 PreservedAnalyses AddressSanitizerPass::run(Function &F,
                                             AnalysisManager<Function> &AM) {
+  if (AliasOpt) {
+    // initialize variables
+    GlobalAA = &AM.getResult<AAManager>(F);
+    InstrumentedPointers.clear();
+    AliasCounter = 0;
+  }
   auto &MAMProxy = AM.getResult<ModuleAnalysisManagerFunctionProxy>(F);
   Module &M = *F.getParent();
   if (auto *R = MAMProxy.getCachedResult<ASanGlobalsMetadataAnalysis>(M)) {
@@ -1646,6 +1668,27 @@
                                          uint32_t TypeSize, bool IsWrite,
                                          Value *SizeArgument, bool UseCalls,
                                          uint32_t Exp) {
+  // Perform AA optimization to remove redundant instrumentation
+  if (AliasOpt && GlobalAA) {
+    for (auto &OtherAddr : InstrumentedPointers) {
+      if (Addr == nullptr || OtherAddr.first == nullptr) {
+        continue;
+      }
+      // Perform AA to the current pointer and previous pointer
+      AliasResult Results = GlobalAA->alias(
+          MemoryLocation(OtherAddr.first, LocationSize(OtherAddr.second)),
+          MemoryLocation(Addr, LocationSize((uint64_t)TypeSize)));
+      // If the current pointer alias one of instrumented pointers, will not
+      // instrument it
+      if (Results == AliasResult::MustAlias) {
+        AliasCounter += 1;
+        return;
+      }
+    }
+    // The current pointer did not show up before. Instrument it.
+    InstrumentedPointers.push_back(
+        std::pair<Value *, uint64_t>(Addr, TypeSize));
+  }
   bool IsMyriad = TargetTriple.getVendor() == llvm::Triple::Myriad;
 
   IRBuilder<> IRB(InsertBefore);

