[PATCH] D73776: Entropic: Boosting LibFuzzer Performance
Kostya Serebryany via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 3 16:59:25 PST 2020
kcc added a comment.
Sounds good. Max and I will do the next round(s) of review.
Re RareFeaturesFreqMap, I would consider two more options:
1. a vector of pairs (feature, freq), ordered by feature. Updating the frequency would be log(N) and removing a feature would be N*log(N), but it may still be better than unordered map.
2. same as a above, but just use a fixed-size array of some small size, e.g. 8. My assumption is that most inputs have very few, if any, rare features, and in cases where a given input has more than 8 rare features, it doesn't matter if we drop some of them.
option 2 is an optimization of option 1, so can just go with option 1 and see if it's too slow.
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:53
+public:
+ size_t g_NumExecutedMutations = 0;
InputCorpus(const std::string &OutputCorpus) : OutputCorpus(OutputCorpus) {
----------------
marcel wrote:
> kcc wrote:
> > why is this public?
> > Also, why the g_ prefix?
> `g_NumExecutedMutations` contains the total number of inputs generated. It is updated in FuzzerLoop.c in `MutateAndTestOne`. I can drop the `g_` prefix.
yea, please drop g_.
================
Comment at: compiler-rt/lib/fuzzer/FuzzerCorpus.h:500
+ bool DistributionNeedsUpdate = true;
+ size_t FeaturesFreqMap[kFeatureSetSize];
+ Vector<size_t> RareFeatures;
----------------
marcel wrote:
> kcc wrote:
> > does this have to be 8-byte per feature?
> Each entry is upper-bounded by the total number of generated inputs `g_NumExecutedMutations` which likely won't fit into `uint32_t`. That's why I chose size_t.
>
> However, we really only need abundance information for features with an abundance below `MostAbundant_RareFeature`. If memory footprint is a concern, we can go down to `uint16_t` at the cost of an overflow check in the hot code in `UpdateFeatureFrequency`. See top-level comment for more details.
Yea, I'd prefer uint16_t and a saturated add. Just to save some RAM
================
Comment at: compiler-rt/lib/fuzzer/FuzzerLoop.cpp:607
-void Fuzzer::ReportNewCoverage(InputInfo *II, const Unit &U) {
+void Fuzzer::ReportNewCoverage(bool reduced, InputInfo *II, const Unit &U) {
II->NumSuccessfullMutations++;
----------------
marcel wrote:
> kcc wrote:
> > do you need this change?
> Unrelated. This is just fixing a problem where LibFuzzer prints REDUCE more often than it should.
I'd prefer to not mix unrelated changes in one diff -- makes the code review quadratic.
Please contribute this one separately (I am not 100% sure I understand it)
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D73776/new/
https://reviews.llvm.org/D73776
More information about the llvm-commits
mailing list