[PATCH] D100161: Redistribute energy for Corpus

Mon Apr 12 18:47:44 PDT 2021

gtt1995 added a comment.

In D100161#2683622 <https://reviews.llvm.org/D100161#2683622>, @morehouse wrote:

> Thanks for sharing your data.  Took a quick look and seems promising.
>
> I would like to try this on FuzzBench before accepting the patch though.  FuzzBench has a very nice experimental framework for evaluating changes like this.
>
>> It seems that FuzzBench does not accept this parallel mode evaluation.
>
> I talked to @metzman who manages FuzzBench.  Sounds like you're correct, FuzzBench uses only one worker process in fork mode.  @metzman said we could probably run a special experiment with more workers to evaluate this patch.
>
> Another approach that might be worth doing, is to make the patch effective even for a single worker.  For example, maybe we randomly pick from a subset of the corpus for that single worker.
>
> Also, I'm curious how the number of fork-mode workers affects efficacy.  I can imagine with lots of workers that this patch could perform much worse.  Specifically if we have a small number of corpus elements per worker, the crossover mutation becomes quite limited.

OK, thanks for your work.
Here are a few notes that may help:

1. The effect of grouping corpus energy may be similar to the effect of entropic (which distributes energy to single seeds) in some cases, but there are also differences. So you should set -entropic=0 when you evaluate the grouping on its own. Of course, enabling -entropic and -NumCorpuses at the same time will also have some effect. If you are interested, you can test four configurations:

(1) -entropic=0, -NumCorpuses=1
(2) -entropic=1, -NumCorpuses=1
(3) -entropic=0, -NumCorpuses=N (I set 30; other values are also possible. I think the right value changes with the total number of seeds, so it should change dynamically.)
(4) -entropic=1, -NumCorpuses=30
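
As a rough illustration, the four configurations above could be generated as command lines like this. It is only a sketch: the fuzzer binary `./my_fuzzer` and the directory `corpus` are hypothetical placeholders, and `-NumCorpuses` is the flag proposed in this patch, not an upstream libFuzzer option.

```python
from itertools import product


def build_commands(fuzzer="./my_fuzzer", corpus_dir="corpus", n_groups=30, fork=30):
    """Return the four command lines of the 2x2 comparison, in the order (1)-(4).

    fuzzer and corpus_dir are hypothetical placeholders.
    -NumCorpuses is the flag proposed in this patch (assumption: not upstream).
    """
    commands = []
    # Outer loop over the group count, inner loop over entropic,
    # so the order matches (1), (2), (3), (4) above.
    for groups, entropic in product((1, n_groups), (0, 1)):
        commands.append([
            fuzzer,
            f"-fork={fork}",
            f"-entropic={entropic}",
            f"-NumCorpuses={groups}",
            corpus_dir,
        ])
    return commands


if __name__ == "__main__":
    for cmd in build_commands():
        print(" ".join(cmd))
```

The value 30 for both -fork and the group count simply mirrors the setting I used; any other N can be passed as `n_groups`.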

2. I set -fork=30, -NumCorpuses=30, and -entropic=0 in my evaluation. The -fork value does not have to be equal to the -NumCorpuses value, though, because each job executes each corpus in turn, from small to large.
3. Following from 2, it should also work well in single-core mode, where a single process executes each corpus in turn, from small to large. For in-process libFuzzer, however, frequent interaction with the file system brings additional overhead. Therefore, it is still suitable for energy scheduling in parallel fuzzing, provided each child process keeps the same coverage bitmap up to date.
4. The degree of parallelism depends on how quickly you want results. I think more groups, with -fork equal to -NumCorpuses, will be much better. This can be regarded as traversing all corpora in one loop. If this is not the case, the corpora toward the back will not be fully tested, because the merged results of the earlier jobs are written back to the file system (if a large seed is generated by the small jobs) and are then taken out again, which has some negative effect.
5. I am sorry, I haven't tried it with -workers.
6. Can you share the official results with me?
7. Thanks once again for your work!
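
To make points 2 to 4 a bit more concrete, here is a toy model of how jobs might walk through the corpus groups. It is not the code from the patch: it assumes a plain round-robin assignment (job j of a round takes the next group in order), and the function names are made up for illustration.

```python
def groups_visited(num_groups, fork, rounds):
    """Toy model: list, per round, the group index each of the fork jobs takes.

    Assumption: groups are handed out round-robin, in index order,
    and the next round continues where the previous one stopped.
    """
    schedule = []
    next_group = 0
    for _ in range(rounds):
        batch = [(next_group + j) % num_groups for j in range(fork)]
        schedule.append(batch)
        next_group = (next_group + fork) % num_groups
    return schedule


def rounds_for_full_pass(num_groups, fork):
    """Rounds needed before every group has been fuzzed once (ceiling division)."""
    return -(-num_groups // fork)


if __name__ == "__main__":
    for fork in (30, 10, 1):
        print(f"fork={fork}: {rounds_for_full_pass(30, fork)} round(s) per full pass")
```

Under this model, -fork equal to the number of groups covers all groups in one round, while a smaller -fork only reaches the later groups after earlier rounds have already merged their results back.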

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D100161/new/

https://reviews.llvm.org/D100161