[llvm-dev] libfuzzer questions

Tue Aug 11 17:25:49 PDT 2015

On Tue, Aug 11, 2015 at 4:58 PM, Brian Cain <brian.cain at gmail.com> wrote:

>
>
> On Mon, Aug 10, 2015 at 8:08 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>>
>>
>> On Mon, Aug 10, 2015 at 5:53 PM, Brian Cain via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>> First off, thanks -- this is a pretty great library and it feels like
>>> I'm learning a lot.
>>>
>>
>> Thanks!
>>
>>
>>> I'm getting some more experience with libfuzzer and finding that I have
>>> a couple of questions:
>>>
>>
>>
>>>
>>> - How does libfuzzer decide to write a new test file?  What
>>> distinguishes this one from all the other cases for which new test inputs
>>> were not written?  Must be something about the path taken through the code?
>>>
>>
>> Exactly.
>> It uses http://clang.llvm.org/docs/SanitizerCoverage.html to figure out
>> if any new edge in the control flow graph has been discovered with the
>> given input.
>>
>>
>
> So if I'm seeing tens of thousands of distinct test files, that represents
> tens of thousands of distinct edges?
>

In the extreme case -- yes.
However, a single file usually covers more than one unique edge.
Also, if you are running the fuzzer in parallel (-jobs=N), the same edge can
be discovered independently by several jobs, each of which writes its own file.
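
To illustrate the rule, here is a minimal sketch of "keep the input only if it
hit an edge nobody has hit before". It is a toy model, not libFuzzer's actual
code: Edge, Corpus and AddIfNew are names made up for this example, and a real
run gets its coverage from SanitizerCoverage rather than from a hand-built
trace.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model: an edge is a (from, to) pair of basic-block ids.
using Edge = std::pair<int, int>;

struct Corpus {
  std::set<Edge> covered;          // every edge seen so far
  std::vector<std::string> units;  // inputs that were kept

  // Keep `input` only if its trace contains at least one edge that was not
  // covered before. Returns the number of newly discovered edges.
  std::size_t AddIfNew(const std::string &input,
                       const std::vector<Edge> &trace) {
    std::size_t fresh = 0;
    for (const Edge &e : trace)
      if (covered.insert(e).second) ++fresh;
    if (fresh > 0) units.push_back(input);
    return fresh;
  }
};
```

An input whose trace adds two edges is stored once, which is why the number of
files is normally smaller than the number of edges.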

> Does the CFG span functions/methods or are they scoped more sanely?
>

Hm? What do you mean?
A control flow edge is a regular edge between basic blocks in a function.
With -fsanitize-coverage=indirect-calls it will also track indirect call edges
(unique caller-callee pairs).

>
>
>>
>>> - Can I use afl-cmin or is there something similar for libFuzzer?
>>>
>>
>> I've never tried that. I'd expect you can.
>> libFuzzer and afl both use plain files to store the corpus.
>>
>>
> I think afl-cmin uses some afl-specific behavior.
>
>
>> I find that sometimes I get an enormous amount of tests and it becomes
>>> unmanageable.
>>>
>>
>> libFuzzer has an option to minimize the corpus.
>> It's not perfect, but very simple.
>> -------------
>>  save_minimized_corpus               0 If 1, the minimized corpus is
>> saved into the first input directory
>> -------------
>>
>>>
> Ohh, ok.  I think I misunderstood this as trying to minimize the size of
> the test case while still reproducing a crash.  Similar to how afl-tmin
> works, I was thinking.  I'll give this a try.
>
> Should I only use this option periodically or can I run it this way all
> the time?  Do we end up spending more execution time minimizing the
> corpus?  Will it delete redundant test cases, including ones that were
> there before this test run started?
>

You should only use this option if you want to store the minimized corpus
somewhere,
or if the initial stage (between  "#0      READ" and "#1331   INITED")
takes too long.
Otherwise you should not bother, since libFuzzer minimizes the corpus in
memory on every run.
(Minimization is done with a trivial greedy algorithm; the result is not even
close to a truly minimal solution, but it is good enough.)
The output looks like this:

#0      READ   cov 0 bits 0 units 1331 exec/s 0
...
#1024   pulse  cov 8043 bits 13474 units 1331 exec/s 256
#1331   INITED cov 8050 bits 13689 units 594 exec/s 221
#2048   pulse  cov 8050 bits 13689 units 594 exec/s 341

This means that the corpus on disk had 1331 units, they were read,
shuffled, executed, and those that added coverage were chosen.
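
The greedy pass described above can be sketched roughly like this. It is an
illustration only, not the code in libFuzzer: Unit and GreedyMinimize are
hypothetical names, edges are modelled as plain ints, and the "execute" step
is replaced by a precomputed edge set per unit.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <set>
#include <utility>
#include <vector>

// Hypothetical unit: an id plus the set of edges it covers when executed.
struct Unit {
  int id;
  std::set<int> edges;
};

// Shuffle the corpus, then walk it once and keep every unit that adds at
// least one edge to the coverage accumulated so far.
std::vector<Unit> GreedyMinimize(std::vector<Unit> corpus, unsigned seed) {
  std::mt19937 rng(seed);
  std::shuffle(corpus.begin(), corpus.end(), rng);
  std::set<int> covered;
  std::vector<Unit> kept;
  for (Unit &u : corpus) {
    std::size_t before = covered.size();
    covered.insert(u.edges.begin(), u.edges.end());
    if (covered.size() > before) kept.push_back(std::move(u));
  }
  return kept;
}
```

The result depends on the shuffle order: with units covering {1}, {2} and
{1,2}, the pass may keep one, two or all three of them. That is the sense in
which the result is not minimal.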

>
>
>>
>>> - sometimes my process being tested appears to deadlock.  A common
>>> feature seems to be that AlarmCallback is allocating memory and as a
>>> consequence the ASan code is pending on a lock.  I'll speculate that this
>>> is because the alarm expired while the lock was already held.  Is this
>>> expected?  I can share specific call stacks if it helps.  I can just extend
>>> the timeout but I think it's probably appropriate.
>>>
>>
>> Yes, please give more details.
>>
>>
>
> Traces attached.  Not sure if the mailing list will preserve the
> attachments, though.
>

Aha, of course.
I run non-async-signal-safe code in the signal handler, bummer.
Let me try to fix this (no promises for a quick fix, I'll be out for a
while).
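
For reference, the usual shape of an async-signal-safe handler is sketched
below. This is not the eventual fix, just the general pattern: the handler
touches nothing but a volatile sig_atomic_t flag, and anything that allocates
or takes locks runs later in normal context. OnAlarm, InstallAlarmHandler and
ConsumeTimeout are names invented for this example.

```cpp
#include <csignal>

// Set by the handler, polled from normal context. volatile sig_atomic_t is
// the type that may be written safely from inside a signal handler.
static volatile std::sig_atomic_t g_timed_out = 0;

// Hypothetical handler: no malloc, no stdio, no locks; just record the event.
extern "C" void OnAlarm(int) { g_timed_out = 1; }

void InstallAlarmHandler() { std::signal(SIGALRM, OnAlarm); }

// Called outside the handler, where allocating and printing are safe.
// Returns true once for each observed timeout and clears the flag.
bool ConsumeTimeout() {
  if (!g_timed_out) return false;
  g_timed_out = 0;
  return true;
}
```

Polling a flag only helps when the target eventually returns. For a target
that is truly hung, the handler itself would have to report and exit using
only async-signal-safe calls such as write() and _exit().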

>
>
>
>>
>>> - AFL has a curses based display where a bunch of different stats are
>>> shown.  I'll be honest, I don't know how to read those yet. ;)  But I'd
>>> like to find some way to determine whether I'm seeing diminishing returns
>>> with libfuzzer.  Is there a good strategy?
>>>
>>
>> libFuzzer just dumps stats to stderr.
>> As long as you periodically see lines like
>> #325 NEW    cov 11985 bits 14108 units 113 exec/s 325 ...
>> you are good.
>>
>> Once you stop getting those, you may start playing with the flags.
>> (e.g. increase the max_len).
>> Unlike AFL which knows it all, libFuzzer still relies on a bit of user
>> help. :)
>>
>>
> Ok, that's good advice.
>
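
If you want to watch for diminishing returns automatically, a small parser for
the status lines is enough. The sketch below assumes the line format shown in
this thread ("#N EVENT cov C bits B units U exec/s E"), which may differ in
other versions; Status, ParseStatus and StallTracker are made-up names, not
part of libFuzzer.

```cpp
#include <cstdio>
#include <string>

// One parsed status line, using the field names printed in the log.
struct Status {
  unsigned long iter = 0, cov = 0, bits = 0, units = 0, exec_per_sec = 0;
  std::string event;  // e.g. "READ", "INITED", "NEW", "pulse"
};

// Returns true and fills *out if `line` matches the assumed format.
bool ParseStatus(const std::string &line, Status *out) {
  char event[16] = {0};
  Status s;
  int n = std::sscanf(line.c_str(),
                      "#%lu %15s cov %lu bits %lu units %lu exec/s %lu",
                      &s.iter, event, &s.cov, &s.bits, &s.units,
                      &s.exec_per_sec);
  if (n != 6) return false;
  s.event = event;
  *out = s;
  return true;
}

// Tracks how many iterations have passed since the last NEW line.
struct StallTracker {
  unsigned long last_new = 0, latest = 0;
  void Feed(const Status &s) {
    latest = s.iter;
    if (s.event == "NEW") last_new = s.iter;
  }
  unsigned long SinceNew() const { return latest - last_new; }
};
```

Feed it every line from stderr; when SinceNew() keeps growing, it is probably
time to try different flags.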
>
>
>
> --
> -Brian
>