[PATCH] D73776: Entropic: Boosting LibFuzzer Performance

marcel via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 14 07:27:33 PDT 2020


marcel marked 11 inline comments as done.
marcel added a comment.

Just completed a few tests with the revised patch in FuzzBench. Going to upload the revision soon.



================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:242
 
+  bool DeleteFeatureFreq(InputInfo *II, uint32_t Idx) {
+    if (II->FeatureFreqs.empty())
----------------
vitalybuka wrote:
> DeleteFeatureFreq -> InputInfo::DeleteFeatureFreq
Moved directly into the InputInfo struct.


================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:388
+  void UpdateEnergy(InputInfo *II, size_t GlobalNumberOfFeatures) {
+    long double Energy = 0.0L;
+    size_t SumIncidence = 0;
----------------
vitalybuka wrote:
> "long double" is still there?
> 
Yes. The computation keeps `long double` throughout to minimize the cumulative floating-point error, and the result is downcast to `double` only once the processing is done.


================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:278
+      // Remove most abundant rare feature.
+      RareFeatures.erase(remove(RareFeatures.begin(), RareFeatures.end(),
+                                ST_mostAbundantRareFeatureIdx),
----------------
Dor1s wrote:
> marcel wrote:
> > Dor1s wrote:
> > > assuming this code gets executed quite often, and the order inside `RareFeatures` isn't important, we can avoid erase-remove and do something like:
> > > 
> > > ```
> > > RareFeatures[index_from_the_loop] = RareFeatures.back();
> > > RareFeatures.resize(RareFeatures.size() - 1);
> > > ```
> > > 
> > > but the loop on line 269 would have to use index in the vector (from 1 to `< RareFeatures.size()`) instead of the iterator
> > > 
> > > feel free to ignore though, it's just a suggestion which may or may not be a good one :)
> > With the subsequent push_back (Line 292), do you mean a swap and pop_back here?
> yes, `swap` and `pop_back` would have the same effect
Implemented your swap and pop_back idea. Cheers!
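
For reference, a minimal sketch of the swap-and-pop_back removal discussed above. `SwapAndPop` is a hypothetical helper name used only for illustration, not the code in the patch, and it assumes the order of elements in the vector does not matter:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not from the patch): removes V[Idx] in O(1) by
// swapping it with the last element and dropping the last element.
// The relative order of the remaining elements is NOT preserved.
inline void SwapAndPop(std::vector<uint32_t> &V, size_t Idx) {
  assert(Idx < V.size() && "index out of range");
  std::swap(V[Idx], V.back());
  V.pop_back();
}
```

Compared with erase-remove, which scans the whole vector and shifts elements, this touches only two slots, so a later push_back onto the same vector stays cheap as well.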


================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:380
+  // of the seed. Since we do not know the entropy of a seed that has
+  // never been executed we assign fresh seeds maximum entropy and
+  // let II->Energy approach the true entropy from above.
----------------
Dor1s wrote:
> marcel wrote:
> > Dor1s wrote:
> > > From the code below it seems like `Energy` represents entropy and the max value is 0, which we reduce depending on the actual feature frequencies. Is that correct understanding?
> > Yes, we estimate the entropy over the probabilities of the features in the neighborhood of the seed. Entropy is positive. The maximum entropy is `logl(GlobalNumberOfFeatures)`.
> sorry, I don't understand. Below are the code lines changing `Energy` value:
> 
> ```
>     II->Energy = 0.0;
>     II->SumIncidence = 0;
> 
>     // Apply add-one smoothing to locally discovered features.
>     for (auto F : II->FeatureFreqs) {
>       size_t LocalIncidence = F.second + 1;
>       Energy -= LocalIncidence * logl(LocalIncidence);
>       SumIncidence += LocalIncidence;
>     }
> 
>     <...>
> 
>     // Add a single locally abundant feature apply add-one smoothing.
>     size_t AbdIncidence = II->NumExecutedMutations + 1;
>     Energy -= AbdIncidence * logl(AbdIncidence);
>     <...>
> 
>     // Normalize.
>     if (SumIncidence != 0)
>       Energy = (Energy / SumIncidence) + logl(SumIncidence);
> 
>     II->Energy = (double)Energy;
>   <...>
>   }
> ```
> 
> as I read this, I see that `Energy` should be negative in many cases?
> 
Sorry for the brevity. Here is why `II->Energy` is positive. Entropy is computed as $-\sum_{i=1}^S p_i \log(p_i)$, where $p_i$ is the probability that fuzzing `II` generates an input that exercises feature $i$, and $S$ is the total number of rare features. We can estimate the probability $p_i$ as the proportion of generated inputs that exercise $i$, i.e., $\hat p_i = LocalIncidence_i / SumIncidence$. Plugging this proportion into the formula for entropy gives $[-\sum_{i=1}^S LocalIncidence_i \log(LocalIncidence_i)] / SumIncidence + \log(SumIncidence)$. So while `Energy` is negative (or zero) before `// Normalize.`, it is non-negative after.
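
The identity can be checked numerically. The sketch below is an illustration only: both function names are hypothetical, and it leaves out the add-one smoothing and the extra locally abundant feature from the quoted code. It computes the entropy once directly from $\hat p_i$ and once in the accumulate-then-normalize form, using $\sum_i \hat p_i = 1$ to pull $\log(SumIncidence)$ out of the sum:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: entropy computed directly from the estimated
// probabilities p_i = Incidence_i / SumIncidence.
inline long double EntropyFromProbabilities(const std::vector<size_t> &Incidence) {
  size_t Sum = 0;
  for (size_t N : Incidence)
    Sum += N;
  if (Sum == 0)
    return 0.0L;
  long double H = 0.0L;
  for (size_t N : Incidence) {
    if (N == 0)
      continue; // 0 * log(0) is taken as 0.
    long double P = static_cast<long double>(N) / Sum;
    H -= P * std::log(P);
  }
  return H;
}

// Hypothetical helper: the same quantity in the form used by the quoted
// code. Accumulate -N * log(N), then normalize once at the end.
inline long double EntropyFromIncidences(const std::vector<size_t> &Incidence) {
  long double Energy = 0.0L;
  size_t Sum = 0;
  for (size_t N : Incidence) {
    if (N == 0)
      continue;
    Energy -= N * std::log(static_cast<long double>(N));
    Sum += N;
  }
  // Energy is <= 0 here; the normalization step makes it >= 0.
  if (Sum != 0)
    Energy = Energy / Sum + std::log(static_cast<long double>(Sum));
  return Energy;
}
```

Both functions should agree up to rounding error. For equal incidences over $S$ features, the result is $\log(S)$, the maximum entropy mentioned earlier in the thread.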

Just drop me a DM. I'll send you the write up.


================
Comment at: compiler-rt/lib/fuzzer/FuzzerFlags.def:156
      "will choose the focus functions automatically.")
+FUZZER_FLAG_INT(entropic, 0, "Experimental. Enables entropic power schedule.")
+FUZZER_FLAG_INT(considered_rare, 0xFF, "Experimental. If entropic is enabled, "
----------------
vitalybuka wrote:
> vitalybuka wrote:
> > entropic -> focus_rare_features
> > 
> > 
> > Not sure how, but it would be nice to rename sparse_energy_updates to something meaningful to a libFuzzer user, so that it explains the behavior change rather than implementation details as it does now.
> Many of the comments are marked as "Done" but I see no changes.
> 
Tried to address all comments either inline or in the summary. In this case, I wrote
>>! In D73776#1921184, @marcel wrote:
> * We keep the entropic option, though. Hope this is okay.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73776/new/

https://reviews.llvm.org/D73776




