[PATCH] D49621: [libFuzzer] Initial implementation of weighted mutation leveraging during runtime.

Max Moroz via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 27 13:44:04 PDT 2018


Dor1s added a comment.

Haven't looked at the test yet, as the code needs a lot of clean up.



================
Comment at: lib/fuzzer/FuzzerLoop.cpp:41
 static const size_t kMaxUnitSizeToPrint = 256;
+const size_t kRunsBeforeWeightedMutations = 10000;
 
----------------
metzman wrote:
> Dor1s wrote:
> > Dor1s wrote:
> > > why not static?
> > I think it may have a better name, maybe something like `kUpdateMutationWeightRuns`
> +1, the name we originally picked together is incorrect.
Repeating my previous comment:

> why not static?



================
Comment at: lib/fuzzer/FuzzerMutate.cpp:23
+const double kDefaultMutationWeight = 1;
+const double kDefaultMutationStat = 1 / (100 * 1000);
 
----------------
metzman wrote:
> Please add a comment to explain the significance of `100` and `1000` (frankly i don't know what the purpose is of either since we don't actually round anything).
+1, what is it for?


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:66
+
+  // For weighted mutation selection
+  MutationWeights.resize(Mutators.size(), kDefaultMutationWeight);
----------------
nit: end the comment with a period


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:525
   for (int Iter = 0; Iter < 100; Iter++) {
-    auto M = &Mutators[Rand(Mutators.size())];
+    if (Options.UseWeightedMutations)
+      M = &Mutators[WeightedIndex()];
----------------
How will this work during first 10000 runs? Do we have a uniform distribution by default?


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:580
+void MutationDispatcher::SetMutationStats() {
+  if (Options.PrintMutationStats) Printf("\nstat::mutation_usefulness:      ");
+  // Calculate usefulness statistic for each mutation
----------------
Move `Printf` call to the next line please. Also, this `stat::...` syntax is used for final stats. If we decide to print something during runtime, we should probably follow the style of the other log messages.


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:583
+  for (size_t i = 0; i < Mutators.size(); i++) {
+    double UsefulRatio = Mutators[i].TotalCount
+                             ? static_cast<double>(Mutators[i].UsefulCount) /
----------------
metzman wrote:
> I like that you went to fix something based on comments you saw in on the review even if they weren't explicit. 
> But in general it's better not to refactor unrelated code in a CL. I'll accept it though.
> This assignment statement is starting to look pretty bad, please make use a regular if else instead of ternary.
+1


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:583
+  for (size_t i = 0; i < Mutators.size(); i++) {
+    double UsefulRatio = Mutators[i].TotalCount
+                             ? static_cast<double>(Mutators[i].UsefulCount) /
----------------
Dor1s wrote:
> metzman wrote:
> > I like that you went to fix something based on comments you saw in on the review even if they weren't explicit. 
> > But in general it's better not to refactor unrelated code in a CL. I'll accept it though.
> > This assignment statement is starting to look pretty bad, please make use a regular if else instead of ternary.
> +1
Actually, that seems to be unnecessary complicated. Initially, we should have some default values in `Stats`. Probably zero, since none of the stats have been useful at that point. Later on, we should be able to overwrite  default `Stats` value with a new one. That's it, there is no way we'll have to put the default value back.

Also, I don't find it good to check whether `TotalCount` is 0 or not on every run. Let's just initialize it with `1` in the very beginning, so it will never be `0` and it always will be safe to use that value as a divisor.



================
Comment at: lib/fuzzer/FuzzerMutate.cpp:588
+    if (Options.PrintMutationStats) {
+      Printf("%.3f", 100 * UsefulRatio);
+      if (i < Mutators.size() - 1) Printf(",");
----------------
This code is duplicated in `MutationDispatcher::PrintMutationStats`, can you please have it in a one place and call it from different places as needed.


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:598
+  SetMutationStats();
+  for (const double &Stat : Stats) SumOfStats += Stat;
+  // Normalize stats into weights.
----------------
I'd ask you to move `SumOfStats += Stat;` to the next line just to follow the style, but actually you should use `std::accumulate` instead of manual looping through the values.


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:600
+  // Normalize stats into weights.
   for (size_t i = 0; i < Mutators.size(); i++) {
+    // Set to default weight if usefulness is zero so far.
----------------
It would be much safer to use `Stats.size()` here instead of `Mutators.size()` since you're using `i` to access elements in `Stats`, not in `Mutators`.


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:602
+    // Set to default weight if usefulness is zero so far.
+    if (Stats[i] <= kDefaultMutationStat)
+      MutationWeights[i] = kDefaultMutationWeight;
----------------
Again, overcomplication. We should start with some initial weights, and later on only overwrite those.


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:607
+      MutationWeights[i] =
+          (Stats[i] * 1000 / SumOfStats) + kDefaultMutationWeight;
+    }
----------------
I don't understand why we need `+ kDefaultMutationWeight`, as well as why we multiply the value by `1000`.


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:615
+size_t MutationDispatcher::WeightedIndex() {
+  std::random_device rd;
+  std::mt19937 gen(rd());
----------------
It would be very inefficient to create a new random generator every time when we need to select an index. There is `Random Rand(Seed);` declared in the `FuzzerDriver` and I believe all the other code is using it, please use it as well.


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:616
+  std::random_device rd;
+  std::mt19937 gen(rd());
+  std::discrete_distribution<> SelectIndex(MutationWeights.begin(),
----------------
Also, it's not the first time it looks like you've just copied the code from somewhere without even changing variable names to follow project's style. I guess,  https://en.cppreference.com/w/cpp/numeric/random/discrete_distribution this time.

Also, please don't forget to run clang format before uploading a CL.


================
Comment at: lib/fuzzer/FuzzerMutate.cpp:621
+  // Ensure chosen index is in range.
+  assert(Index < Mutators.size());
+  return Index;
----------------
This is an overkill, we never change sizes of `Mutators` and `MutationWeights` from the very beginning of the fuzzing. If you want to assert that their sizes are equal, somewhere after FuzzerMutate.cpp:68 would be the best place, but that wouldn't make any sense since you've just called `resize` there. Just remove the assert and return `SelectIndex(gen)`.




================
Comment at: lib/fuzzer/FuzzerMutate.h:100
 
+  /// Returns usefulness stats on command line if option is enabled.
+  void PrintMutationStats() {
----------------
nit: "Returns" -> "Prints" or "Outputs"


Repository:
  rCRT Compiler Runtime

https://reviews.llvm.org/D49621





More information about the llvm-commits mailing list