[PATCH] D73776: Entropic: Boosting LibFuzzer Performance
Max Moroz via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Mar 18 22:15:10 PDT 2020
Dor1s added inline comments.
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:69
delete II;
+ RareFeatures.clear();
}
----------------
since this is a vector, I don't think we need to manually clear it in the destructor
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:120
+ // Assign maximal energy to the new seed.
+ II.Energy = RareFeatures.empty() ? 1.0 : logl(RareFeatures.size());
+ II.SumIncidence = RareFeatures.size();
----------------
nit: seems like `log` should be sufficient, as Energy is `double`, not `long double`
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:269
+
+ for (uint32_t Idx2 : RareFeatures) {
+ if (GlobalFeatureFreqs[Idx2] >=
----------------
nit: I'd rather use `auto` or `decltype(RareFeatures)::value_type` to avoid type mismatch if we ever change `RareFeatures` definition and forget to change the type here
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:278
+ // Remove most abundant rare feature.
+ RareFeatures.erase(remove(RareFeatures.begin(), RareFeatures.end(),
+ ST_mostAbundantRareFeatureIdx),
----------------
assuming this code gets executed quite often, and the order inside `RareFeatures` isn't important, we can avoid erase-remove and do something like:
```
RareFeatures[index_from_the_loop] = RareFeatures.back();
RareFeatures.resize(RareFeatures.size() - 1);
```
but the loop on line 269 would have to use index in the vector (from 1 to `< RareFeatures.size()`) instead of the iterator
feel free to ignore though, it's just a suggestion which may or may not be a good one :)
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:292
+ // Add rare feature and handle collisions.
+ RareFeatures.push_back(Idx);
+ GlobalFeatureFreqs[Idx] = 0;
----------------
since we always do this, `resize()` in my suggestion above might not be even needed, but that's a minor thing
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:299
+ // undiscovered feature.
+ for (auto II : Inputs) {
+ if (II->Energy > 0.0) {
----------------
can this and the loop on the line 294 be combined?
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:300
+ for (auto II : Inputs) {
+ if (II->Energy > 0.0) {
+ II->SumIncidence += 1;
----------------
please consider adding a comment why we skip inputs with zero energy
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:355
+ // Update local frequencies.
+ if (II) {
+ if (II->FeatureFreqs.empty()) {
----------------
invert the condition to reduce indentation:
```
if (!II)
return;
```
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:357
+ if (II->FeatureFreqs.empty()) {
+ II->FeatureFreqs.push_back(std::pair<uint32_t, uint16_t>(Idx32, 1));
+ } else {
----------------
can do return after this line and avoid `else` with extra indentation below
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:357
+ if (II->FeatureFreqs.empty()) {
+ II->FeatureFreqs.push_back(std::pair<uint32_t, uint16_t>(Idx32, 1));
+ } else {
----------------
Dor1s wrote:
> can do return after this line and avoid `else` with extra indentation below
what is `1` here? would it make sense to have a static const variable with a descriptive name, e.g. `kDefaultSomething`?
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:362
+ std::lower_bound(II->FeatureFreqs.begin(), II->FeatureFreqs.end(),
+ std::pair<uint32_t, uint16_t>(Idx32, 0));
+
----------------
same point, what is `0`? a constant with a descriptive name or (less preferred) comment would be really helpful for others who come to read the code in future
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:380
+ // of the seed. Since we do not know the entropy of a seed that has
+ // never been executed we assign fresh seeds maximum entropy and
+ // let II->Energy approach the true entropy from above.
----------------
>From the code below it seems like `Energy` represents entropy and the max value is 0, which we reduce depending on the actual feature frequencies. Is that correct understanding?
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:445
+ // i.e., randomly do not skip.
+ if (!DistributionNeedsUpdate && Rand(kSparseEnergyUpdates))
+ return;
----------------
it seems like the `Rand(kSparseEnergyUpdates)` clause is applicable to the `Entropic` case only, is that correct? Do we really need it in the vanilla case?
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:473
+ for (size_t i = 0; i < N; i++) {
+ Weights[i] =
+ Inputs[i]->NumFeatures == 0 ||
----------------
could you please rewrite this in a more readable if-else form?
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:475
+ Inputs[i]->NumFeatures == 0 ||
+ Inputs[i]->NumExecutedMutations / 20 >
+ NumExecutedMutations / Inputs.size() ||
----------------
why `20`? a constant with a descriptive name or a comment would be appreciated
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:487
+ if (Inputs[i]->HasFocusFunction) {
+ VanillaSchedule = true;
+ break;
----------------
so if there is at least one input that touches focus function, we will be always wasting time in this loop (starting on the line 472) and then falling back to the `VanillaSchedule` case? In such case I think we should just check `Options.FocusFunction` and use vanilla schedule if it's set, just because almost always there will be input(s) touching the focus function
================
Comment at: compiler-rt/lib/fuzzer/FuzzerLoop.cpp:712
}
+ II.NeedsUpdate = true;
}
----------------
does it always need update, even when new coverage wasn't observed?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73776/new/
https://reviews.llvm.org/D73776
More information about the llvm-commits
mailing list