[all-commits] [llvm/llvm-project] e6597d: Greedy set cover implementation of `Merger::Merge`

Matt Morehouse via All-commits all-commits at lists.llvm.org
Tue Sep 7 09:43:08 PDT 2021


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: e6597dbae84034ce8ef97802584039b723adb526
      https://github.com/llvm/llvm-project/commit/e6597dbae84034ce8ef97802584039b723adb526
  Author: aristotelis <aristotelis at forallsecure.com>
  Date:   2021-09-07 (Tue, 07 Sep 2021)

  Changed paths:
    M compiler-rt/lib/fuzzer/FuzzerDriver.cpp
    M compiler-rt/lib/fuzzer/FuzzerFlags.def
    M compiler-rt/lib/fuzzer/FuzzerFork.cpp
    M compiler-rt/lib/fuzzer/FuzzerInternal.h
    M compiler-rt/lib/fuzzer/FuzzerMerge.cpp
    M compiler-rt/lib/fuzzer/FuzzerMerge.h
    M compiler-rt/lib/fuzzer/tests/FuzzerUnittest.cpp
    A compiler-rt/test/fuzzer/set_cover_merge.test

  Log Message:
  -----------
  Greedy set cover implementation of `Merger::Merge`

Extend the existing single-pass algorithm for `Merger::Merge` with a greedy set cover algorithm that produces a smaller merged corpus. The new implementation is enabled with the new **set_cover_merge=1** flag.

This greedy set cover implementation gives a substantially smaller final corpus (40%-80% fewer test cases) while preserving the same features/coverage. The execution time penalty is modest (+50% for a corpus of ~1M files and far less for smaller corpora). These results were obtained by comparing several targets with corpora of varying size.

Change `Merger::CrashResistantMergeInternalStep` to collect all features from each file, not just the unique ones. The set cover algorithm needs the full feature set of every file to work correctly. The implementation of the algorithm in `Merger::SetCoverMerge` uses a bitvector to track which features are already covered while performing the pass. Collisions while indexing the bitvector are ignored, as they are in the fuzzer itself.
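
For illustration only, the greedy pass described above might be sketched as follows. This is not the code from `Merger::SetCoverMerge`: the names `FileFeatures`, `GreedySetCover` and `NumBits` are hypothetical, the bitvector size is an arbitrary assumption, and the real implementation may weigh other criteria when choosing a file. The sketch only shows the core idea: repeatedly pick the file that covers the most still-uncovered bits, and treat features that collide on the same bit as one.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one corpus file: every feature it triggers,
// not just the ones that were new when the file was first seen.
struct FileFeatures {
  std::vector<uint32_t> Features;
};

// Greedy set cover sketch. Returns indices of the chosen files in pick order.
// NumBits is an assumed bitvector size; features are mapped with a modulo,
// so two features landing on the same bit are simply treated as one.
std::vector<size_t> GreedySetCover(const std::vector<FileFeatures> &Files,
                                   size_t NumBits = 1 << 16) {
  std::vector<bool> Covered(NumBits, false); // bits covered by chosen files
  std::vector<bool> Seen(NumBits, false);    // scratch, reset per file
  std::vector<bool> Used(Files.size(), false);
  std::vector<size_t> Chosen;

  for (;;) {
    size_t Best = Files.size();
    size_t BestGain = 0;
    for (size_t I = 0; I < Files.size(); ++I) {
      if (Used[I])
        continue;
      // Count the bits this file would newly cover, once per bit.
      std::vector<size_t> Touched;
      for (uint32_t F : Files[I].Features) {
        size_t Idx = F % NumBits;
        if (!Covered[Idx] && !Seen[Idx]) {
          Seen[Idx] = true;
          Touched.push_back(Idx);
        }
      }
      for (size_t Idx : Touched)
        Seen[Idx] = false;
      // Strict comparison: on a tie the earlier file wins.
      if (Touched.size() > BestGain) {
        BestGain = Touched.size();
        Best = I;
      }
    }
    if (BestGain == 0)
      break; // no remaining file adds coverage
    Used[Best] = true;
    Chosen.push_back(Best);
    for (uint32_t F : Files[Best].Features)
      Covered[F % NumBits] = true;
  }
  return Chosen;
}
```

The sketch also shows why each file's complete feature list is required: a file whose features were all seen earlier in a single pass can still be the best single choice for the cover, which a list of only first-seen features would hide.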

Reviewed By: morehouse

Differential Revision: https://reviews.llvm.org/D105284



