On 01/15/2015 11:58 PM, Nick Lewycky wrote:
> On 3 January 2015 at 13:55, Sami Liedes <sami.liedes@iki.fi> wrote:
>> Hi,
>>
>> I've set up a bot to test Clang by running svn HEAD of LLVM and Clang
>> against a large corpus of test cases (from afl-fuzz) that exercises a
>> lot of different code paths. Whenever a test case causes clang to
>> crash, it records the output and reduces the test case using CReduce
>> or, if CReduce crashes, a dumber reducer. Crashes (assertion failures,
>> signals) are the only kind of failure detected.
<br>
>>
>> You can see the kind of output it produces here (it's running on my
>> desktop computer for now, so this URL will probably go away at some
>> point):
>>
>> http://sli.dy.fi/~sliedes/clang-triage/triage_report.xhtml
<br>
>>
>> Currently the bot only runs the test cases using clang -std=c++11 -O0;
>> trying different language options would probably require more
>> afl-fuzzing with that option set to really be effective.
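>>
>> Roughly, the shape of the loop is something like this (a simplified
>> sketch; the real code, with the database and reduction machinery, is
>> in the repo linked below):
>>
>> import glob
>> import subprocess
>>
>> for path in sorted(glob.glob('corpus/*.cpp')):  # corpus layout simplified
>>     proc = subprocess.run(
>>         # exact flags besides -std=c++11 -O0 are simplified here
>>         ['clang', '-std=c++11', '-O0', '-fsyntax-only', path],
>>         capture_output=True, text=True)
>>     # subprocess reports death-by-signal as a negative return code;
>>     # assertion failures abort (SIGABRT), so they land here as well.
>>     if proc.returncode < 0:
>>         print(f'{path}: killed by signal {-proc.returncode}')
>>         # ... record proc.stderr, then reduce with creduce (or the
>>         # dumber fallback) ...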
<br>
>>
>> If someone wants to try it with different language parameters (or even
>> for different frontends), or to set it up on some better
>> infrastructure, the code and the instructions are here:
>>
>> https://github.com/sliedes/clang-triage
<br>
>>
>> The number of test cases that cause a given kind of failure also
>> roughly corresponds to how likely afl-fuzz is to hit that failure
>> (though the corpus is minimized in the sense that every test case in
>> the bot's database should exercise some code path that other test
>> cases do not). By default afl-fuzz stops recording new crashes once it
>> has found 5000 crashing inputs. Once some of the most common ones have
>> been fixed, it would make sense to rerun the fuzzer and add new test
>> cases to the bot.
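>>
>> (The report buckets crashes by failure signature; a minimal sketch of
>> the idea, with the signature extraction simplified:)
>>
>> import re
>> from collections import Counter
>>
>> def signature(stderr):
>>     # Bucket on the assertion message when there is one; otherwise
>>     # use a generic bucket for signal-only crashes. (Illustrative
>>     # only; the real report logic lives in the repo.)
>>     m = re.search(r"Assertion `.*' failed", stderr)
>>     return m.group(0) if m else 'crashed without assertion message'
>>
>> def report(crash_stderrs):
>>     for sig, n in Counter(map(signature, crash_stderrs)).most_common():
>>         print(f'{n:5d}  {sig}')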
>
> Once we start to get clean under afl (from a seed of an empty file), it
> would be great to start testing that new commits don't introduce
> regressions. Here's my idea: afl-fuzz already requires a starting input
> to act as a seed. If a commit changes a test file, use that test as a
> seed for afl-fuzz to check that revision. What do you think?
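>
> Concretely, something like this (a rough sketch; a git mirror of the
> repo, the paths, and an afl-instrumented clang are all assumptions on
> my part):
>
> import os
> import subprocess
> import sys
>
> rev = sys.argv[1]  # revision to check
>
> # The test files touched by the commit become the afl seed corpus.
> changed = subprocess.check_output(
>     ['git', 'diff-tree', '--no-commit-id', '--name-only', '-r', rev],
>     text=True).splitlines()
> os.makedirs('seeds', exist_ok=True)
> for f in changed:
>     if '/test/' in f and f.endswith(('.c', '.cpp')):
>         blob = subprocess.check_output(['git', 'show', f'{rev}:{f}'])
>         with open(os.path.join('seeds', os.path.basename(f)), 'wb') as out:
>             out.write(blob)
>
> # then: afl-fuzz -i seeds -o findings clang -std=c++11 -O0 -fsyntax-only @@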

I've been playing with something like this in my spare time. I've gotten
most of the infrastructure built to do this, but my initial couple of
runs haven't been very productive. The problem is that most modified
test cases are *massive*, and afl-fuzz has a really hard time with that.
I've got a couple of ideas on how to address that, but I haven't gotten
back to it yet. If others are interested, I can push the code I've got
up on GitHub. (It's just some basic Python scripting.)

Philip