[PATCH] D100161: Redistribute energy for Corpus

Matt Morehouse via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Apr 12 10:37:34 PDT 2021


morehouse added a subscriber: metzman.
morehouse added a comment.

Thanks for sharing your data.  Took a quick look and seems promising.

I would like to try this on FuzzBench before accepting the patch though.  FuzzBench has a very nice experimental framework for evaluating changes like this.

> It seems that FuzzBench does not accept this parallel mode evaluation.

I talked to @metzman who manages FuzzBench.  Sounds like you're correct, FuzzBench uses only one worker process in fork mode.  @metzman said we could probably run a special experiment with more workers to evaluate this patch.

Another approach that might be worth doing, is to make the patch effective even for a single worker.  For example, maybe we randomly pick from a subset of the corpus for that single worker.

Also, I'm curious how the number of fork-mode workers affects efficacy.  I can imagine with lots of workers that this patch could perform much worse.  Specifically if we have a small number of corpus elements per worker, the crossover mutation becomes quite limited.



================
Comment at: compiler-rt/lib/fuzzer/FuzzerFork.cpp:143
       for (size_t i = 0; i < CorpusSubsetSize; i++) {
-        auto &SF = Files[Rand->SkewTowardsLast(Files.size())];
-        Seeds += (Seeds.empty() ? "" : ",") + SF;
-        CollectDFT(SF);
+        size_t j = Rand->SkewTowardsLast(AverageSize);
+        size_t m = j + StartIndex;
----------------
Does keeping this skew within each subset of the corpus help?  Would a uniform-random approach change efficacy?


================
Comment at: compiler-rt/lib/fuzzer/FuzzerFork.cpp:150
+        }
+        else  {
+                auto &SF = Files[Rand->SkewTowardsLast(Files.size())];
----------------
So we would still use the current strategy on the rare occasion when we get an `m` index past the end of `Files`.  I imagine there's always some cases where the original strategy will be beneficial at least for the crossover mutation.  Maybe we could make the probability of using the current strategy more explicit. Something like

```
if (RandInt(100) < 90)
  // Use new strategy
else
  // Use old strategy
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100161/new/

https://reviews.llvm.org/D100161



More information about the cfe-commits mailing list