[compiler-rt] [Fuzzer] Optimize UpdateFeatureFrequency (PR #65288)

Arseny Kapoulkine via llvm-commits llvm-commits at lists.llvm.org
Mon Sep 4 20:02:10 PDT 2023


https://github.com/zeux created https://github.com/llvm/llvm-project/pull/65288:

Instead of a linear scan, use a bitset to track rarity of features. This improves fuzzer throughput rather dramatically (close to 2x) in early exploratory phases; in steady state this seems to improve fuzzing throughput by ~15% according to perf.
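
For readers who want the idea in isolation, here is a minimal sketch of the technique, not the libFuzzer code itself. `RareFeatureSet`, `kSketchFeatureSetSize`, and the method names are hypothetical and chosen for illustration. A vector keeps the rare features for iteration, and a bitset indexed by feature index mirrors it so that membership is a single bit test rather than a `std::find` over the vector.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the fuzzer's feature-set size (assumed 1 << 21).
constexpr std::size_t kSketchFeatureSetSize = std::size_t{1} << 21;

class RareFeatureSet {
public:
  // Record a feature as rare in both the vector and the bitset.
  void Add(std::uint32_t Idx) {
    assert(Idx < kSketchFeatureSetSize);
    if (IsRare[Idx])
      return;
    Rare.push_back(Idx);
    IsRare[Idx] = true;
  }

  // Swap-and-pop removal by position in the vector. The bit is cleared
  // using the feature index stored at that position, not the position.
  void RemoveAt(std::size_t Pos) {
    assert(Pos < Rare.size());
    IsRare[Rare[Pos]] = false;
    Rare[Pos] = Rare.back();
    Rare.pop_back();
  }

  // O(n) membership test: the linear scan the bitset is meant to replace.
  bool ContainsLinear(std::uint32_t Idx) const {
    return std::find(Rare.begin(), Rare.end(), Idx) != Rare.end();
  }

  // O(1) membership test: a single bit lookup.
  bool ContainsBitset(std::uint32_t Idx) const {
    return Idx < kSketchFeatureSetSize && IsRare[Idx];
  }

  std::size_t Size() const { return Rare.size(); }

private:
  std::vector<std::uint32_t> Rare;           // rare features, for iteration
  std::bitset<kSketchFeatureSetSize> IsRare; // same set, for fast lookup
};
```

Both lookups should always agree. The bitset version matters when the check runs once per observed feature on the hot path, which is the case the description above is about.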

The benchmarks are done on an executable with ~100k features, so the results may change based on the executable that's being fuzzed.

kFeatureSetSize is 2M, so the bitset adds 256 KB to sizeof(InputCorpus), but this should be fine since there are already three arrays indexed by feature index for a total of 200 MB.
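
The 256 KB figure can be checked with a little compile-time arithmetic. This is only a sketch under one assumption: that "2M" means 1 << 21 entries. `kAssumedFeatureSetSize` and the other constant names are hypothetical. The `uint16_t` element type for `GlobalFeatureFreqs` is taken from the patch context.

```cpp
#include <bitset>
#include <climits>
#include <cstddef>
#include <cstdint>

// Assumption: "2M" in the description means 1 << 21 feature indices.
constexpr std::size_t kAssumedFeatureSetSize = std::size_t{1} << 21;

// A bitset needs one bit per feature index.
constexpr std::size_t kBitsetPayloadBytes = kAssumedFeatureSetSize / CHAR_BIT;
static_assert(kBitsetPayloadBytes == 256 * 1024, "2M bits is 256 KiB");

// For comparison, a uint16_t array with one entry per feature index,
// like GlobalFeatureFreqs in the patch context.
constexpr std::size_t kFreqArrayBytes =
    kAssumedFeatureSetSize * sizeof(std::uint16_t);
static_assert(kFreqArrayBytes == 4 * 1024 * 1024, "2M uint16_t is 4 MiB");

// The standard does not fix sizeof(std::bitset<N>), but the object has
// to hold at least N bits.
static_assert(sizeof(std::bitset<kAssumedFeatureSetSize>) >=
                  kBitsetPayloadBytes,
              "bitset object is at least as large as its payload");
```

Under this assumption the bitset is one sixteenth the size of the existing frequency array alone.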

From 6a9c2b153ef2e702d845f03d1f765c4b3dbd07e8 Mon Sep 17 00:00:00 2001
From: Arseny Kapoulkine <arseny.kapoulkine at gmail.com>
Date: Thu, 29 Jun 2023 10:38:48 -0700
Subject: [PATCH] [Fuzzer] Optimize UpdateFeatureFrequency

Instead of a linear scan, use a bitset to track rarity of features. This
improves fuzzer throughput rather dramatically (close to 2x) in early
exploratory phases; in steady state this seems to improve fuzzing
throughput by ~15% according to perf.

The benchmarks are done on an executable with ~100k features, so the
results may change based on the executable that's being fuzzed.

kFeatureSetSize is 2M so the bitset is adding 256 KB to
sizeof(InputCorpus), but this should be fine since there's already three
arrays indexed by feature index for a total of 200 MB.
---
 compiler-rt/lib/fuzzer/FuzzerCorpus.h | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/compiler-rt/lib/fuzzer/FuzzerCorpus.h b/compiler-rt/lib/fuzzer/FuzzerCorpus.h
index 912082be8fbaea..48b5a2cff02e25 100644
--- a/compiler-rt/lib/fuzzer/FuzzerCorpus.h
+++ b/compiler-rt/lib/fuzzer/FuzzerCorpus.h
@@ -18,6 +18,7 @@
 #include "FuzzerSHA1.h"
 #include "FuzzerTracePC.h"
 #include <algorithm>
+#include <bitset>
 #include <chrono>
 #include <numeric>
 #include <random>
@@ -382,6 +383,7 @@ class InputCorpus {
       }
 
       // Remove most abundant rare feature.
+      IsRareFeature[RareFeatures[Delete]] = false;
       RareFeatures[Delete] = RareFeatures.back();
       RareFeatures.pop_back();
 
@@ -397,6 +399,7 @@ class InputCorpus {
 
     // Add rare feature, handle collisions, and update energy.
     RareFeatures.push_back(Idx);
+    IsRareFeature[Idx] = true;
     GlobalFeatureFreqs[Idx] = 0;
     for (auto II : Inputs) {
       II->DeleteFeatureFreq(Idx);
@@ -450,9 +453,7 @@ class InputCorpus {
     uint16_t Freq = GlobalFeatureFreqs[Idx32]++;
 
     // Skip if abundant.
-    if (Freq > FreqOfMostAbundantRareFeature ||
-        std::find(RareFeatures.begin(), RareFeatures.end(), Idx32) ==
-            RareFeatures.end())
+    if (Freq > FreqOfMostAbundantRareFeature || !IsRareFeature[Idx32])
       return;
 
     // Update global frequencies.
@@ -581,6 +582,7 @@ class InputCorpus {
   uint16_t FreqOfMostAbundantRareFeature = 0;
   uint16_t GlobalFeatureFreqs[kFeatureSetSize] = {};
   std::vector<uint32_t> RareFeatures;
+  std::bitset<kFeatureSetSize> IsRareFeature;
 
   std::string OutputCorpus;
 };


