[llvm-dev] libfuzzer questions
Kostya Serebryany via llvm-dev
llvm-dev at lists.llvm.org
Tue Aug 11 17:25:49 PDT 2015
On Tue, Aug 11, 2015 at 4:58 PM, Brian Cain <brian.cain at gmail.com> wrote:
>
>
> On Mon, Aug 10, 2015 at 8:08 PM, Kostya Serebryany <kcc at google.com> wrote:
>
>>
>>
>> On Mon, Aug 10, 2015 at 5:53 PM, Brian Cain via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>>
>>> First off, thanks -- this is a pretty great library and it feels like
>>> I'm learning a lot.
>>>
>>
>> Thanks!
>>
>>
>>> I'm getting some more experience with libfuzzer and finding that I have
>>> a couple of questions:
>>>
>>
>>
>>>
>>> - How does libfuzzer decide to write a new test file? What
>>> distinguishes this one from all the other cases for which new test inputs
>>> were not written? Must be something about the path taken through the code?
>>>
>>
>> Exactly.
>> It uses http://clang.llvm.org/docs/SanitizerCoverage.html to figure out
>> if any new edge in the control flow graph has been discovered with the
>> given input.
>>
>>
>
> So if I'm seeing tens of thousands of distinct test files, that represents
> tens of thousands of distinct edges?
>
In the extreme case -- yes.
However, a single file usually covers more than one unique edge.
Also, if you are running the fuzzer in parallel (-jobs=N) some edges can be
discovered many times.
> Does the CFG span functions/methods or are they scoped more sanely?
>
Hm? What do you mean?
A control flow edge is a regular edge between basic blocks in a function.
With -fsanitize-coverage=indirect-calls it will also track indirect call
edges (unique caller-callee pairs).
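
To illustrate the "new edge" rule described above, here is a minimal sketch
in C++. It is not libFuzzer's actual code: the Edge type and the
AddsNewCoverage helper are hypothetical names, and the edges hit by an input
are assumed to be already available as a list, whereas the real coverage data
comes from SanitizerCoverage instrumentation.

```cpp
#include <set>
#include <utility>
#include <vector>

// Hypothetical edge identifier: (source basic block, destination basic block).
using Edge = std::pair<int, int>;

// Records every edge hit by one input in `seen` and returns true if at least
// one of them had not been seen before. Under the rule sketched here, only
// such an input would be written out as a new test file.
bool AddsNewCoverage(std::set<Edge>& seen, const std::vector<Edge>& edges_hit) {
  bool found_new = false;
  for (const Edge& e : edges_hit) {
    // std::set::insert returns {iterator, inserted}; `inserted` is true only
    // for an edge that was not in the set yet.
    if (seen.insert(e).second) found_new = true;
  }
  return found_new;
}
```

An input that hits several new edges at once is still saved only once, which
is why the number of files is an upper bound on nothing more than "at least
one new edge each".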
>
>
>>
>>> - Can I use afl-cmin or is there something similar for libFuzzer?
>>>
>>
>> I've never tried that. I'd expect you can.
>> libFuzzer and afl both use plain files to store the corpus.
>>
>>
> I think afl-cmin uses some afl-specific behavior.
>
>
>>> I find that sometimes I get an enormous amount of tests and it becomes
>>> unmanageable.
>>>
>>
>> libFuzzer has an option to minimize the corpus.
>> It's not perfect, but very simple.
>> -------------
>> save_minimized_corpus 0 If 1, the minimized corpus is
>> saved into the first input directory
>> -------------
>>
>>>
> Ohh, ok. I think I misunderstood this as trying to minimize the size of
> the test case while still reproducing a crash, similar to how afl-tmin
> works. I'll give this a try.
>
> Should I only use this option periodically or can I run it this way all
> the time? Do we end up spending more execution time minimizing the
> corpus? Will it delete redundant test cases, including ones that were
> there before this test run started?
>
You should only use this option if you want to store the minimized corpus
somewhere, or if the initial stage (between "#0 READ" and "#1331 INITED")
takes too long.
Otherwise you should not bother, since libFuzzer minimizes the corpus in
memory on every run.
(Minimization is done with a trivial greedy algorithm; the result is not even
close to a truly minimal solution, but it is good enough.)
The output looks like this:
#0 READ cov 0 bits 0 units 1331 exec/s 0
...
#1024 pulse cov 8043 bits 13474 units 1331 exec/s 256
#1331 INITED cov 8050 bits 13689 units 594 exec/s 221
#2048 pulse cov 8050 bits 13689 units 594 exec/s 341
This means that the corpus on disk had 1331 units; they were read,
shuffled, and executed, and only those that added coverage (594 units) were
kept.
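
The greedy selection described above can be sketched roughly as follows. This
is an illustration under stated assumptions, not libFuzzer's implementation:
Unit and GreedyMinimize are hypothetical names, the shuffle step is omitted,
and each unit carries a precomputed set of edge ids instead of being executed
under coverage instrumentation.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical corpus unit: a name plus the edge ids it covers. In a real
// fuzzer the edge set would be measured by executing the unit.
struct Unit {
  std::string name;
  std::set<int> edges;
};

// One greedy pass: walk the corpus in order and keep a unit only if it covers
// at least one edge that no previously kept unit covered.
std::vector<Unit> GreedyMinimize(const std::vector<Unit>& corpus) {
  std::set<int> covered;
  std::vector<Unit> kept;
  for (const Unit& u : corpus) {
    bool adds_coverage = false;
    for (int e : u.edges) {
      if (covered.insert(e).second) adds_coverage = true;
    }
    if (adds_coverage) kept.push_back(u);
  }
  return kept;
}
```

The result depends on the order in which units are visited, so a single pass
like this can keep more units than a truly minimal cover would need.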
>
>
>>
>>> - sometimes my process being tested appears to deadlock. A common
>>> feature seems to be that AlarmCallback is allocating memory and as a
>>> consequence the ASan code is pending on a lock. I'll speculate that this
>>> is because the alarm expired while the lock was already held. Is this
>>> expected? I can share specific call stacks if it helps. I can just extend
>>> the timeout but I think it's probably appropriate.
>>>
>>
>> Yes, please give more details.
>>
>>
>
> Traces attached. Not sure if the mailing list will preserve the
> attachments, though.
>
Aha, of course.
I run non-async-signal-safe code in the signal handler, bummer.
Let me try to fix this (no promises for a quick fix, I'll be out for a
while).
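
For readers unfamiliar with the problem: a handler that allocates memory can
deadlock if the signal arrives while the allocator lock is already held. One
common way to stay async-signal-safe is to have the handler only set a flag
and do the real work later, outside the handler. The sketch below shows that
pattern; it is not the fix that went into libFuzzer, and OnAlarm,
InstallAlarmHandler, and ConsumeAlarm are hypothetical names. It assumes a
POSIX system where SIGALRM is defined.

```cpp
#include <csignal>

// Written by the handler, read by normal code. volatile sig_atomic_t is the
// type that may safely be written from a signal handler.
volatile std::sig_atomic_t g_alarm_fired = 0;

// The handler does nothing but set the flag: no allocation, no locks, no I/O.
extern "C" void OnAlarm(int) { g_alarm_fired = 1; }

// Installs the handler for SIGALRM; returns true on success.
bool InstallAlarmHandler() {
  return std::signal(SIGALRM, OnAlarm) != SIG_ERR;
}

// Called from the main loop, outside signal context. Returns true once per
// delivered alarm; timeout reporting (which may allocate) belongs here.
bool ConsumeAlarm() {
  if (!g_alarm_fired) return false;
  g_alarm_fired = 0;
  return true;
}
```

The trade-off is that a target stuck in an infinite loop never returns to the
main loop, so a flag alone cannot report that kind of hang.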
>
>
>
>>
>>> - AFL has a curses based display where a bunch of different stats are
>>> shown. I'll be honest, I don't know how to read those yet. ;) But I'd
>>> like to find some way to determine whether I'm seeing diminishing returns
>>> with libfuzzer. Is there a good strategy?
>>>
>>
>> libFuzzer just dumps stats to stderr.
>> As long as you periodically see lines like
>> #325 NEW cov 11985 bits 14108 units 113 exec/s 325 ...
>> you are good.
>>
>> Once you stop getting those, you may start playing with the flags.
>> (e.g. increase the max_len).
>> Unlike AFL which knows it all, libFuzzer still relies on a bit of user
>> help. :)
>>
>>
> Ok, that's good advice.
>
>
>
>
> --
> -Brian
>