[PATCH] D105284: Greedy set cover implementation of `Merger::Merge`

Matt Morehouse via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 7 08:47:53 PDT 2021


morehouse accepted this revision.
morehouse added a comment.
This revision is now accepted and ready to land.

LGTM



================
Comment at: compiler-rt/lib/fuzzer/tests/FuzzerUnittest.cpp:953
+                      "",
+                      true));
+  EXPECT_EQ(M.SetCoverMerge(Features, &NewFeatures, Cov, &NewCov, &NewFiles),
----------------
arisKoutsou wrote:
> morehouse wrote:
> > arisKoutsou wrote:
> > > morehouse wrote:
> > > > IIUC, this test doesn't quite do what's intended.
> > > > 
> > > > We choose input C because it has the most unique features, but after that A only has `{0}` as a unique feature while B has `{4, 5}`.  So we do in fact choose B, but not because it is smaller.
> > > Correct, I will adjust the features correctly so that we can test breaking ties between files with equal number of features.
> > The comment above is confusing me now.  I *think* what's happening is that B is picked first (over A since B is smaller), leaving A with `{0, 3}` unique features and C with `{1, 2, 4}` unique features.  Then C gets picked since it has more features.
> > 
> > We still end up with C and B in the set, but the comment's explanation of how this happens is wrong.
> As far as I can see [[ https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/fuzzer/FuzzerMerge.cpp#L86 | here ]] , the format of the control file is: 
> ```
> FT <current_file_index> <feature_1> <feature_2> ... <feature_n>
> ```
> So, in our case, the feature sets are:
> ```
> A = {3, 5, 6}
> B = {4, 5, 6}
> C = {1, 2, 3, 4}
> D = {1}
> ```
> Since the set cover algorithm chooses the set that covers the maximum number of previously uncovered features, the file that is chosen in the first iteration is C. This is because `Covered` is empty and C has 4 features while all the other files have less features. In the next iteration of the algorithm A's uncovered features are {5, 6} and B's uncovered features are {5, 6} so we break the tie by selecting the smallest which is B. Finally, all features are covered so we can exit. 
Ah, yep you're right.  Thanks for explaining.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105284/new/

https://reviews.llvm.org/D105284



More information about the llvm-commits mailing list