[PATCH] D49621: [libFuzzer] Initial implementation of weighted mutation leveraging during runtime.
Max Moroz via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jul 31 11:26:23 PDT 2018
Dor1s added inline comments.
================
Comment at: lib/fuzzer/FuzzerMutate.cpp:23
+const double kDefaultMutationWeight = 1;
+const double kDefaultMutationStat = 1 / (100 * 1000);
----------------
metzman wrote:
> Dor1s wrote:
> > kodewilliams wrote:
> > > Dor1s wrote:
> > > > metzman wrote:
> > > > > Please add a comment to explain the significance of `100` and `1000` (frankly i don't know what the purpose is of either since we don't actually round anything).
> > > > +1, what is it for?
> > > It is just there to represent a usefulness ratio that is close to useless but not entirely useless, so that such a mutation still gets some weight instead of being calculated as 0.
> > That doesn't seem right to me. Let's say we have a tough target -- one where it's hard to reach any new coverage. In fact, we have a lot of such targets when we use a full corpus.
> >
> > So, after running for a while, it finally finds a couple of useful mutations, all of which get weights of, say, 10^(-6) (again, totally possible), while all "useless" mutations get the default weight of 10^(-5). That would mean "useless" mutations would be chosen much more often than "useful" ones.
> >
> > IMO, the better approach would be something like what we've discussed in the past:
> >
> > 1) make a random decision whether to use weighted mutations or the default selection; 80% vs 20% should be fine. Once we start testing, we can change it to 90 vs 10 or any other proportion
> > 2) if weighted mutations were chosen, call WeightedIndex(); otherwise, use the default case
> >
> > That approach would be universal, as it doesn't use any magic numbers tied to a particular target, whereas your 100*1000 can behave very differently for targets with different speeds.
> I think it will be harder to determine if this technique is useful with the 80/20 strategy.
> If the concern is that the weight is too high, why don't we use the smallest positive double as the default?
> I think it will be harder to determine if this technique is useful with the 80/20 strategy.
Why will it be harder? That percentage would just control how strongly our strategy affects the fuzzing process. We should still be able to see either a negative or a positive impact. Changing the distribution (e.g. using 90/10 after 80/20) would multiply that impact a little bit.
> If the concern is that the weight is too high, why don't we use the smallest positive double as the default?
I can't think of a value that would work well in both cases, when useful mutations have stats like 10^(-3) and when they are around 10^(-6). We should avoid relying on anything that depends on the fuzz target's speed or on the complexity of finding a new input. We already have a magic threshold of 10000; let's not add more of those.
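To make that concern concrete, here is a tiny standalone back-of-the-envelope sketch (not libFuzzer code; the mutator counts and stat values are made up for illustration, not taken from any real target):
```
// Standalone sketch: with a fixed default stat of 1e-5, the share of picks
// going to "useful" mutators flips depending on the target. Two useful
// mutators, eight at the default; all values are invented for illustration.
#include <cstdio>
#include <initializer_list>

int main() {
  const int kUseful = 2, kUseless = 8;
  const double kDefaultStat = 1e-5;
  for (double UsefulStat : {1e-3, 1e-6}) {
    double Total = kUseful * UsefulStat + kUseless * kDefaultStat;
    printf("useful stat %.0e -> useful mutators get %.1f%% of picks\n",
           UsefulStat, 100.0 * kUseful * UsefulStat / Total);
  }
  // Roughly 96.2% for 1e-3 but only 2.4% for 1e-6: the same default value
  // helps one target and hurts another.
}
```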
================
Comment at: lib/fuzzer/FuzzerMutate.cpp:527
for (int Iter = 0; Iter < 100; Iter++) {
- auto M = &Mutators[Rand(Mutators.size())];
+ if (Options.UseWeightedMutations)
+ M = &Mutators[WeightedIndex()];
----------------
With my proposal it would be something like:
```
// Even when weighted mutations are enabled, fall back to the default
// selection in 20% of cases.
if (Options.UseWeightedMutations && Rand(100) < 80)
  M = &Mutators[WeightedIndex()];
else
  M = &Mutators[Rand(Mutators.size())];
```
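For reference, the snippet above assumes WeightedIndex() does a stat-proportional pick. A minimal standalone sketch of that idea (not the code from this patch; the Stats vector and its bookkeeping are hypothetical) could look like:
```
#include <cstdio>
#include <random>
#include <vector>

// Hypothetical per-mutator stats; in the actual patch these would be updated
// as mutations prove useful or not.
static size_t WeightedIndex(const std::vector<double> &Stats,
                            std::mt19937 &Gen) {
  // Stat-proportional pick: a mutator with twice the stat is chosen twice
  // as often.
  std::discrete_distribution<size_t> Dist(Stats.begin(), Stats.end());
  return Dist(Gen);
}

int main() {
  std::mt19937 Gen(std::random_device{}());
  std::vector<double> Stats{1e-3, 1e-5, 1e-5, 1e-5};
  printf("picked mutator #%zu\n", WeightedIndex(Stats, Gen));
}
```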
Repository:
rCRT Compiler Runtime
https://reviews.llvm.org/D49621