[llvm-dev] Regarding fuzzing llvm-ir passes

Tue Jul 20 00:57:54 PDT 2021

Thanks for the replies David and Philip. I am still finding my way in this
area so I am starting with some background reading.

The first thing I will do is go through llvm-stress and see how it broadly
works. I will then go through Philip's bulleted list and try to follow his
suggestions.
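
For concreteness, here is a rough sketch of the input layout described in the
second bullet of Philip's list (quoted below): the first byte of the fuzz
buffer picks the pass under test, and the remaining bytes stand in for
llvm-stress' RNG. Everything named here (kPasses, FuzzInput, splitInput,
ByteSource) is hypothetical and not part of llvm-stress or any existing LLVM
API; the pass list is only illustrative. The one real interface is libFuzzer's
LLVMFuzzerTestOneInput entry point. The actual IR generation and pass
invocation are left as a comment, since llvm-stress would first have to be
extracted into a library.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative pass table (assumption): a real harness would choose its own set.
static const std::vector<std::string> kPasses = {"instcombine", "gvn", "sroa",
                                                 "licm"};

// Hypothetical view of one fuzz input after splitting.
struct FuzzInput {
  std::string PassName;    // pass selected by the first byte
  const uint8_t *GenBytes; // bytes that drive the IR generator
  size_t GenSize;
};

// First byte selects the pass; the rest is the generator's "randomness".
bool splitInput(const uint8_t *Data, size_t Size, FuzzInput &Out) {
  if (Size < 1)
    return false; // nothing to select a pass with
  Out.PassName = kPasses[Data[0] % kPasses.size()];
  Out.GenBytes = Data + 1;
  Out.GenSize = Size - 1;
  return true;
}

// Hypothetical RNG replacement: hands out bytes from the fuzz buffer and
// returns zeros once the buffer is exhausted, so generation always terminates
// deterministically for a given input.
class ByteSource {
  const uint8_t *Cur;
  const uint8_t *End;

public:
  ByteSource(const uint8_t *D, size_t N) : Cur(D), End(D + N) {}

  uint8_t next() { return Cur != End ? *Cur++ : 0; }

  // Value in [0, Bound), built from the next four bytes (big-endian).
  uint32_t below(uint32_t Bound) {
    if (Bound == 0)
      return 0;
    uint32_t V = 0;
    for (int I = 0; I < 4; ++I)
      V = (V << 8) | next();
    return V % Bound;
  }
};

// libFuzzer entry point. The fuzz driver supplies main().
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  FuzzInput In;
  if (!splitInput(Data, Size, In))
    return 0;
  ByteSource Src(In.GenBytes, In.GenSize);
  // A real harness would generate a module from Src here (using a
  // library-ified llvm-stress) and run In.PassName over it.
  (void)Src;
  return 0;
}
```

Returning zeros after the buffer runs out is one possible design choice: it
keeps every input valid, at the cost of making short inputs produce similar
IR. Rejecting short inputs instead would be another option.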

Cheers,
Saurabh

On Mon, Jul 19, 2021 at 9:31 PM Philip Reames <listmail at philipreames.com>
wrote:

> A bit of prior work to be aware of:
>
> There's something running under OSS-Fuzz already.  I'm not super clear on
> what it is or how it works operationally, but it's definitely something to
> be aware of.
>
> llvm-stress is an in-tree tool for generating random IR.  Not sure this
> has been actively maintained at all though.
>
> If you're going to use a coverage guided fuzzer, you want to give some
> thought to your corpus choice.  Will your corpus be IR?  Bitcode?  A random
> seed for llvm-stress?  A random buffer replacing llvm-stress' RNG?  Each
> has tradeoffs and will exercise different parts of the infrastructure.
>
> It's also worth commenting that bugpoint's reduction strategy tends to be
> a very effective mutation fuzzer in practice.
>
> Personally, I'd approach it with something like the following:
>
>    - Start with a corpus of random seeds to llvm-stress + a pass
>    identifier.  Should be easy to stand up and run with any fuzz driver, make
>    sure it works and fix the obvious problems to get a reasonable fuzz rate.
>    - Then extend your llvm-stress seed corpus into a random buffer
>    corpus.  Extract llvm-stress into a library which consumes a string of
>    random bytes.  Have the first byte of the buffer select the pass under
>    test and the rest serve as the llvm-stress input.
>    - Once that is running successfully, extend it.  There's lots of
>    room to improve llvm-stress' generator.
>    - Another extension would be to add in mutation transforms after
>    generation but before pass of interest.  (Extracting out
>    bugpoint/llvm-bisect transforms to use for the mutation would work pretty
>    well.)  Basically, you extend your input buffer to allow a set of transform
>    identifiers following the buffer passed to llvm-stress.
>
> The preceding is not super well thought out, just what occurred to me in
> the moment.
>
> Philip
>
>
> On 7/19/21 12:12 PM, David Blaikie via llvm-dev wrote:
>
> Seems viable (+Kostya, maybe he can +anyone else on his team/he's worked
> with who might be interested in collaborating on this use of fuzzing, or
> provide other general pointers, etc)
>
> On Mon, Jul 19, 2021 at 12:06 PM Saurabh Jha via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi llvm people,
>>
>> I have been contributing to clang for a while. I am now looking for
>> something to work on in llvm-core.
>>
>> In the list of open projects, I found llvm IR fuzzing
>> <https://llvm.org/OpenProjects.html#llvm_ir_fuzzing> to be interesting.
>> I saw the gsoc page
>> <https://summerofcode.withgoogle.com/organizations/5767011616948224/?sp-page=2>
>> for llvm and browsed through the mailing list and it seems to me that no
>> one else is actively working on it at the moment.
>>
>> Is anyone else working on it right now? I am planning to start on the
>> prerequisite readings once I get a better view on what's going on in this
>> area or whether I should pursue something else.
>>
>> Many thanks,
>> Saurabh
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>