<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Aug 11, 2015 at 4:58 PM, Brian Cain <span dir="ltr"><<a href="mailto:brian.cain@gmail.com" target="_blank" onclick="window.open('https://mail.google.com/mail/?view=cm&tf=1&to=brian.cain@gmail.com&cc=&bcc=&su=&body=','_blank');return false;">brian.cain@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span class="">On Mon, Aug 10, 2015 at 8:08 PM, Kostya Serebryany <span dir="ltr"><<a href="mailto:kcc@google.com" target="_blank" onclick="window.open('https://mail.google.com/mail/?view=cm&tf=1&to=kcc@google.com&cc=&bcc=&su=&body=','_blank');return false;">kcc@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span>On Mon, Aug 10, 2015 at 5:53 PM, Brian Cain via llvm-dev <span dir="ltr"><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" onclick="window.open('https://mail.google.com/mail/?view=cm&tf=1&to=llvm-dev@lists.llvm.org&cc=&bcc=&su=&body=','_blank');return false;">llvm-dev@lists.llvm.org</a>></span> wrote:<br></span><span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div>First off, thanks -- this is a pretty great library and it feels like I'm learning a lot. </div></blockquote><div><br></div></span><div>Thanks! 
</div><span><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">I'm getting some more experience with libFuzzer and finding that I have a couple of questions:</div></blockquote><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>- How does libFuzzer decide to write a new test file? What distinguishes this one from all the other cases for which new test inputs were not written? Must be something about the path taken through the code?</div></div></blockquote><div><br></div></span><div>Exactly. </div><div>It uses <a href="http://clang.llvm.org/docs/SanitizerCoverage.html" target="_blank">http://clang.llvm.org/docs/SanitizerCoverage.html</a> to figure out if any new edge in the control flow graph has been discovered with the given input. </div><span><div> </div></span></div></div></div></blockquote><div><br></div></span><div>So if I'm seeing tens of thousands of distinct test files, that represents tens of thousands of distinct edges? </div></div></div></div></blockquote><div><br></div><div>In the extreme case -- yes.</div><div>However, usually a single file covers more than one unique edge. </div><div>Also, if you are running the fuzzer in parallel (-jobs=N), some edges can be discovered many times. </div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Does the CFG span functions/methods or are they scoped more sanely?</div></div></div></div></blockquote><div><br></div><div>Hm? What do you mean? </div><div>A control flow edge is a regular edge between basic blocks in a function. </div><div>With -fsanitize-coverage=indirect-calls it will also track indirect call edges (unique caller-callee pairs). 
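<br><br>To make the relationship between file count and edge count concrete, here is a toy model in Python. It is not libFuzzer code: the function name save_if_new, the input names, and the block numbers are made up for illustration, and parallel jobs are modelled simply as independent coverage sets, which is a simplification. An input is saved only when its trace contains an edge that has not been seen before.
<pre>
```python
def save_if_new(trace, covered, corpus, name):
    """Keep `name` only if its trace adds an edge not yet in `covered`."""
    new_edges = set(trace) - covered
    if new_edges:
        covered |= new_edges
        corpus.append(name)
    return len(new_edges)


# Each edge is a (from_block, to_block) pair inside one function.
traces = {
    "a": [(0, 1), (1, 2)],
    "b": [(0, 1), (1, 3), (3, 4)],
    "c": [(0, 1), (1, 2)],  # same edges as "a"
}

# Single job: one shared coverage set.
covered, corpus = set(), []
for name in sorted(traces):
    save_if_new(traces[name], covered, corpus, name)

# Two jobs that do not share coverage state (assumed for illustration).
job1_cov, job1 = set(), []
job2_cov, job2 = set(), []
save_if_new(traces["a"], job1_cov, job1, "a")
save_if_new(traces["c"], job2_cov, job2, "c")
```
</pre>
In the single-job run, three inputs produce two saved files that together cover four edges, so the number of files is at most the number of edges. In the two-job run, each job saves a file for the same two edges, which is how the file count can grow without any new coverage.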
</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>- Can I use afl-cmin or is there something similar for libFuzzer? </div></div></blockquote><div><br></div></span><div>I've never tried that. I'd expect you can. </div><div>libFuzzer and afl both use plain files to store the corpus. </div><span><div><br></div></span></div></div></div></blockquote><div><br></div></span><div>I think afl-cmin uses some afl-specific behavior.</div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>I find that sometimes I get an enormous amount of tests and it becomes unmanageable.</div></div></blockquote><div><br></div></span><div>libFuzzer has an option to minimize the corpus. </div><div>It's not perfect, but very simple. </div><div>-------------</div><div><div> save_minimized_corpus <span style="white-space:pre-wrap"> </span>0<span style="white-space:pre-wrap"> </span>If 1, the minimized corpus is saved into the first input directory</div></div><div>-------------<br></div><span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div></div></div></blockquote></span></div></div></div></blockquote><div><br></div></span><div>Ohh, ok. 
I think I misunderstood this as trying to minimize the size of the test case while still reproducing a crash. Similar to how afl-tmin works, I was thinking. I'll give this a try. </div><div><br></div><div>Should I only use this option periodically or can I run it this way all the time? Do we end up spending more execution time minimizing the corpus? Will it delete redundant test cases, including ones that were there before this test run started?</div></div></div></div></blockquote><div><br></div><div>You should only use this option if you want to store the minimized corpus somewhere, </div><div>or if the initial stage (between "#0 READ" and "#1331 INITED") takes too long. </div><div>Otherwise you should not bother since libFuzzer minimizes the corpus in memory on every run. </div><div>(minimization is done with a trivial greedy algorithm, not even close to a truly minimal solution, but good enough).</div><div>The output looks like this: </div><div><br></div><div><div>#0 READ cov 0 bits 0 units 1331 exec/s 0 </div></div><div>... </div><div><div>#1024 pulse cov 8043 bits 13474 units 1331 exec/s 256 </div><div>#1331 INITED cov 8050 bits 13689 units 594 exec/s 221 </div><div>#2048 pulse cov 8050 bits 13689 units 594 exec/s 341 </div></div><div><br></div><div>This means that the corpus on disk had 1331 units; they were read, shuffled, and executed, and the 594 that added coverage were kept. 
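<br><br>A minimal sketch of that kind of greedy pass, in Python rather than libFuzzer's C++. The names minimize_corpus and toy_run are hypothetical, and the "coverage" of a unit is faked as its set of distinct bytes; this only mirrors the read, shuffle, execute, keep-if-new-coverage description above and is not the actual implementation.
<pre>
```python
import random


def minimize_corpus(units, run, seed=0):
    """Greedy pass: keep a unit only if it adds coverage not seen so far."""
    order = list(units)
    random.Random(seed).shuffle(order)  # the order decides which units survive
    covered, kept = set(), []
    for unit in order:
        features = run(unit)
        if not features.issubset(covered):
            covered |= features
            kept.append(unit)
    return kept, covered


def toy_run(unit):
    """Stand-in for executing the target: coverage is the set of byte values."""
    return set(unit)


units = [b"a", b"ab", b"abc", b"c", b"abc"]
kept, covered = minimize_corpus(units, toy_run)
print(len(units), "units read,", len(kept), "kept")
```
</pre>
The result depends on the shuffle. If the pass happens to visit b"a", b"ab", b"abc" in that order, all three are kept even though b"abc" alone covers everything, which is why a greedy pass like this is not guaranteed to be minimal.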
</div><div><br></div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div> </div><div>- sometimes my process being tested appears to deadlock. A common feature seems to be that AlarmCallback is allocating memory and as a consequence the ASan code is pending on a lock. I'll speculate that this is because the alarm expired while the lock was already held. Is this expected? I can share specific call stacks if it helps. I can just extend the timeout but I think it's probably appropriate.</div></div></blockquote><div><br></div></span><div>Yes, please give more details. </div><span><div> </div></span></div></div></div></blockquote><div><br></div></span><div>Traces attached. Not sure if the mailing list will preserve the attachments, though.</div></div></div></div></blockquote><div><br></div><div>Aha, of course. </div><div>I run non-async-signal-safe code in the signal handler, bummer.</div><div>Let me try to fix this (no promises for a quick fix, I'll be out for a while). 
</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span class=""><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>- AFL has a curses based display where a bunch of different stats are shown. I'll be honest, I don't know how to read those yet. ;) But I'd like to find some way to determine whether I'm seeing diminishing returns with libfuzzer. Is there a good strategy?</div></div></blockquote><div><br></div></span><div>libFuzzer just dumps stats to stderr. </div><div>As long as you periodically see lines like </div><div>#325<span style="white-space:pre-wrap"> </span>NEW cov 11985 bits 14108 units 113 exec/s 325 ...<br></div><div>you are good. </div><div><br></div><div>Once you stop getting those, you may start playing with the flags. </div><div>(e.g. increase the max_len).</div><div>Unlike AFL which knows it all, libFuzzer still relies on a bit of user help. :) </div><span><div><br></div><div></div></span></div></div></div></blockquote></span></div><div class="gmail_extra"><br></div><div class="gmail_extra">Ok, that's good advice.</div><span class="HOEnZb"><font color="#888888"><div class="gmail_extra"><br></div><div class="gmail_extra"><br></div><br clear="all"><div><br></div>-- <br><div>-Brian</div>
</font></span></div></div>
</blockquote></div><br></div></div>