[PATCH] D30156: llvm-mc-fuzzer: add support for assembly
Brian Cain via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 21 12:54:31 PST 2017
bcain added inline comments.
================
Comment at: tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp:244
if (Action == AC_Assemble)
- errs() << "error: -assemble is not implemented\n";
+ return AssembleOneInput(Data, Size);
else if (Action == AC_Disassemble)
----------------
kcc wrote:
> dsanders wrote:
> > kcc wrote:
> > > bcain wrote:
> > > > dsanders wrote:
> > > > > kcc wrote:
> > > > > > I strongly suggest to make this a separate fuzz target instead of using flags.
> > > > > > Otherwise it'll be harder to automate running this target.
> > > > > I'm not sure what you mean here. What difficulties are you thinking of?
> > > > >
> > > > > FWIW, this is in line with my original intent which was to mimic llvm-mc's interface.
> > > > > I strongly suggest to make this a separate fuzz target instead of using flags.
> > > >
> > > > I've preserved the original design for llvm-mc-fuzzer, apparently to imitate llvm-mc.
> > > >
> > > > Pros/cons of the current design:
> > > > - pro: matches llvm-mc
> > > > - pro: changing focus to probe different paths only requires different command line args
> > > > - con: reproducing fuzzer configuration more difficult because it depends on those args
> > > > - con: libFuzzer might see the uncovered feature set as a goal for coverage (that we already know statically it cannot cover).
> > > >
> > > > For that last one, it's speculation on my part.
> > > >
> > > > Kostya, would you be satisfied with this as-is or should I decompose it into two fuzzers? "Harder to automate" consists of "I must make sure that I can deliver the right command line args to the automation feature"? Or "won't fit well in oss-fuzz" or something else?
> > > > I'm not sure what you mean here. What difficulties are you thinking of?
> > >
> > > Imagine an automated system that runs continuous fuzzing (e.g. https://github.com/google/oss-fuzz).
> > > How are you going to tell it to run the same binary with two different flags and to treat those
> > > as two independent entities?
> > > Of course, it's possible to implement support for something like this, but OSS-Fuzz does not and will not support it.
> > > (because of KISS: https://en.wikipedia.org/wiki/KISS_principle)
> > >
> > > When analyzing the code coverage (manually, or automatically) there will be a huge lump of code that is never reached in one mode, i.e. this 2-in-1 bundle will confuse the analysis.
> > >
> > > Finally, at least in libFuzzer, part of the algorithm is linear by the size of the binary (more precisely: number of instrumented blocks) and so this bundled fuzzer will just be burning CPUs with no reason.
> > >
> > >
> > > > FWIW, this is in line with my original intent which was to mimic llvm-mc's interface.
> > > Yes, and I objected back then :)
> > > > I'm not sure what you mean here. What difficulties are you thinking of?
> > > Imagine an automated system that runs continuous fuzzing (e.g.
> > > https://github.com/google/oss-fuzz).
> > > How are you going to tell it to run the same binary with two different flags and to treat those
> > > as two independent entities?
> >
> > I'm not familiar with oss-fuzz but based on an initial glance through I'm not sure how this is different from oss-fuzz/projects/curl/. That project is using pre-processor macros to select between different fuzzers.
> >
> > To answer the question though, if I wanted to fuzz everything (assembler/disassembler, all arches, subarches, and feature combinations) in this kind of system and the curl/llvm-mc-fuzzer way had been ruled out. I'd probably use the first few bytes of the data as the configuration and do a full setup/teardown in LLVMFuzzerTestOneInput().
> >
> > That said, I think that's a different kind of fuzzer to llvm-mc-fuzzer. It would aim to improve the quality of the LLVM project as a whole whereas llvm-mc-fuzzer was meant to help backend developers improve the quality of their particular targets and subtargets.
> >
> > > Of course, it's possible to implement support for something like this, but OSS-Fuzz does not and
> > > will not support it.
> > > (because of KISS: https://en.wikipedia.org/wiki/KISS_principle)
> >
> > This principle is the reason this tool uses command line arguments for the action/triple/arch/subarch/features. Command line arguments were the simplest way to configure a particular target without having to re-compile for each combination. I included support for other archs/subarches/features because it made the original goal easier and also made the tool more useful to others.
> >
> > > When analyzing the code coverage (manually, or automatically) there will be a huge lump of code
> > > that is never reached in one mode, i.e. this 2-in-1 bundle will confuse the analysis.
> >
> > FWIW, this is also the case between arches/subarches/features. For example, on an X86 host using default options, the AArch64/ARM/Mips/etc. disassemblers are not tested.
> >
> > > Finally, at least in libFuzzer, part of the algorithm is linear by the size of the binary (more precisely:
> > > number of instrumented blocks) and so this bundled fuzzer will just be burning CPUs with no
> > > reason.
> >
> > That's a fair point.
> >
> > > > FWIW, this is in line with my original intent which was to mimic llvm-mc's interface.
> > > Yes, and I objected back then :)
> >
> > I remember you objected to having a custom main function that mangled the arguments before passing them on to libFuzzer and I fixed that. I didn't think there was an objection to command line arguments in general though.
> >
> > If the objection was to command line arguments in general, Is there a way to test an architecture in isolation from the others that's more in keeping with libFuzzer's style?
> > using pre-processor macros to select between different fuzzers.
>
> That's fine, since it creates different binaries, where unused code has a good chance to not even be linked in.
>
> > I'd probably use the first few bytes of the data as the configuration
> That's an option but it has 2 problems:
> 1. now the inputs are from some new artificial data format
> 2. fuzzing is less efficient due to larger binary
>
> > FWIW, this is also the case between arches/subarches/features.
> Yes.
>
> > you objected to having a custom main function that mangled the arguments
>
> Yes, and that is currently not supported at all (was a mistake).
>
> > Is there a way to test an architecture in isolation from the others that's more in keeping with libFuzzer's style?
> Have separate binary (build target) for every distinct configuration of code we have in mind.
> How many are we talking about here?
> tens or hundreds?
> Having 20-30 binaries like this is totally fine imho, and works great on oss-fuzz (see e.g. the way we've done it for ffmpeg, where we have ~40 binaries).
> https://github.com/google/oss-fuzz/blob/master/projects/ffmpeg/build.sh
>
>
> Have separate binary (build target) for every distinct configuration of code we have in mind.
I'd be willing to modify my submission to at least decompose llvm-mc-fuzzer into llvm-mc-assemble-fuzzer and llvm-mc-disassemble-fuzzer. That would solve the problem of having primarily disjoint content under a single binary. Is that a reasonable compromise, or is it necessary to have zero command line args?
Repository:
rL LLVM
https://reviews.llvm.org/D30156
More information about the llvm-commits
mailing list