<div dir="ltr">Thanks for the replies, David and Philip. I am still finding my way in this area, so I am starting with some background reading.<div><br></div><div>The first thing I will do is go through llvm-stress and see broadly how it works. I will then go through Philip's bulleted list and try to follow his suggestions.</div><div><br></div><div>Cheers,</div><div>Saurabh</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Jul 19, 2021 at 9:31 PM Philip Reames <<a href="mailto:listmail@philipreames.com">listmail@philipreames.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

  
  <div>

    <p>A bit of prior work to be aware of:</p>

    <p>There's something running under OSS-Fuzz already.  I'm not super
      clear on what it is or how it works operationally, but it's definitely
      something to be aware of.</p>

    <p>llvm-stress is an in-tree tool for generating random IR.  I'm not
      sure it has been actively maintained at all, though.</p>

    <p>If you're going to use a coverage guided fuzzer, you want to give

      some thought to your corpus choice.  Will your corpus be IR? 

      Bitcode?  A random seed for llvm-stress?  A random buffer

      replacing llvm-stress' RNG?  Each has tradeoffs and will exercise

      different parts of the infrastructure.</p>
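    <p>To make the "random buffer replacing llvm-stress' RNG" option a bit
      more concrete, here is a minimal sketch in Python.  It is not how
      llvm-stress is implemented; the class <code>BufferRandom</code>, the
      helper <code>generate_ops</code>, and the tiny opcode list are all
      hypothetical names used only for illustration.  The idea is that every
      "random" decision the generator makes consumes a byte chosen by the
      fuzzer, so a coverage-guided fuzzer can steer generation by mutating
      bytes.</p>
<pre>
```python
class BufferRandom:
    """Random source that replays fuzzer-supplied bytes instead of a PRNG."""

    def __init__(self, data: bytes):
        self._data = data
        self._pos = 0

    def next_byte(self) -> int:
        # Once the buffer is exhausted, keep returning 0 so that
        # generation still terminates deterministically.
        if self._pos == len(self._data):
            return 0
        value = self._data[self._pos]
        self._pos += 1
        return value

    def below(self, bound: int) -> int:
        """Return a value in range(bound); consumes one byte (bound up to 256)."""
        return self.next_byte() % bound


# Illustrative subset only; a real generator would know far more opcodes.
OPCODES = ["add", "sub", "mul", "xor"]


def generate_ops(data: bytes, count: int) -> list:
    """Pick `count` opcodes, one per consumed byte of `data`."""
    rng = BufferRandom(data)
    return [OPCODES[rng.below(len(OPCODES))] for _ in range(count)]
```
</pre>
    <p>For example, <code>generate_ops(bytes([0, 1, 2, 3, 5]), 6)</code>
      picks one opcode per byte and falls back to the first opcode once the
      buffer runs out.  Padding with zeros is just one possible policy;
      stopping generation when the buffer is exhausted would be another.</p>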

    <p>It's also worth commenting that bugpoint's reduction strategy

      tends to be a very effective mutation fuzzer in practice.  <br>

    </p>

    <p>Personally, I'd approach it with something like the following:</p>

    <ul>

      <li>Start with a corpus of random seeds to llvm-stress + a pass
        identifier.  This should be easy to stand up and run with any fuzz
        driver; make sure it works and fix the obvious problems to get a
        reasonable fuzz rate.</li>

      <li>Then extend your llvm-stress seed corpus into a random buffer
        corpus.  Extract llvm-stress into a library which consumes a
        string of random bytes.  Have the first byte of the buffer map
        to the pass under test and the rest serve as the llvm-stress input. <br>

      </li>

      <li>Once that is running successfully, extend it.  There's lots
        of room to improve llvm-stress' generator.  <br>

      </li>

      <li>Another extension would be to add mutation transforms after
        generation but before the pass of interest.  (Extracting the
        bugpoint/llvm-bisect transforms to use for the mutation would
        work pretty well.)  Basically, you extend your input buffer to
        allow a set of transform identifiers following the buffer passed
        to llvm-stress.  <br>

      </li>

    </ul>
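    <p>The input layout described in the list above could be decoded
      roughly as follows.  This is only a sketch under assumptions of my
      own: the first byte selects the pass, the final byte gives the number
      of transform identifiers, those identifiers sit just before it, and
      everything in between goes to the generator.  The pass list is an
      illustrative selection, and the transform names and
      <code>decode_input</code> itself are hypothetical.</p>
<pre>
```python
# Illustrative selection of pass names to choose from.
PASSES = ["instcombine", "gvn", "sroa", "simplifycfg"]
# Hypothetical mutation transforms, not the names of existing tools.
TRANSFORMS = ["drop-instruction", "drop-block", "simplify-operand"]


def decode_input(data: bytes):
    """Split one fuzz input into (pass name, generator bytes, transform names)."""
    if len(data) == 0:
        return PASSES[0], b"", []
    # Byte 0 selects the pass under test.
    pass_name = PASSES[data[0] % len(PASSES)]
    body = data[1:]
    if len(body) == 0:
        return pass_name, b"", []
    # Assumed layout: the last byte says how many transform IDs precede it,
    # capped so it never claims more bytes than are available.
    count = min(body[-1], len(body) - 1)
    split = len(body) - 1 - count
    generator_bytes = body[:split]
    transforms = [TRANSFORMS[b % len(TRANSFORMS)] for b in body[split:-1]]
    return pass_name, generator_bytes, transforms
```
</pre>
    <p>Mapping every byte through a modulo means any buffer decodes to a
      valid configuration, which keeps the fuzzer from wasting time on
      rejected inputs.  A length prefix instead of a trailing count would
      work equally well.</p>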

    <p>The preceding is not super well thought out, just what occurred

      to me in the moment.  <br>

    </p>

    <p>Philip<br>

    </p>

    <p><br>

    </p>

    <div>On 7/19/21 12:12 PM, David Blaikie via

      llvm-dev wrote:<br>

    </div>

    <blockquote type="cite">

      
      <div dir="ltr">Seems viable (+Kostya, maybe he can +anyone else on

        his team/he's worked with who might be interested in

        collaborating on this use of fuzzing, or provide other general

        pointers, etc)</div>

      <br>

      <div class="gmail_quote">

        <div dir="ltr" class="gmail_attr">On Mon, Jul 19, 2021 at 12:06

          PM Saurabh Jha via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>

          wrote:<br>

        </div>

        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

          <div dir="ltr">Hi llvm people,

            <div><br>

            </div>

            <div>I have been contributing to clang for a while. I am now

              looking for something to work on in llvm-core.</div>

            <div><br>

            </div>

            <div>In the list of open projects, I found <a href="https://llvm.org/OpenProjects.html#llvm_ir_fuzzing" target="_blank">llvm IR fuzzing</a>

              to be interesting. I saw the <a href="https://summerofcode.withgoogle.com/organizations/5767011616948224/?sp-page=2" target="_blank">gsoc page</a> for

              llvm and browsed through the mailing list and it seems to

              me that no one else is actively working on it at the

              moment.</div>

            <div><br>

            </div>

            <div>Is anyone else working on it right now? I am planning

              to start on the prerequisite readings once I get a better

              view on what's going on in this area or whether I should

              pursue something else.</div>

            <div><br>

            </div>

            <div>Many thanks,</div>

            <div>Saurabh</div>

          </div>

          _______________________________________________<br>

          LLVM Developers mailing list<br>

          <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>

          <a href="https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

        </blockquote>

      </div>

      <br>


    </blockquote>

  </div>


</blockquote></div>