[PATCH] D30156: llvm-mc-fuzzer: add support for assembly

Tue Feb 21 02:52:16 PST 2017

dsanders added inline comments.

================
Comment at: tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp:244
   if (Action == AC_Assemble)
-    errs() << "error: -assemble is not implemented\n";
+    return AssembleOneInput(Data, Size);
   else if (Action == AC_Disassemble)
----------------
kcc wrote:
> bcain wrote:
> > dsanders wrote:
> > > kcc wrote:
> > > > I strongly suggest to make this a separate fuzz target instead of using flags. 
> > > > Otherwise it'll be harder to automate running this target. 
> > > I'm not sure what you mean here. What difficulties are you thinking of?
> > > 
> > > FWIW, this is in line with my original intent which was to mimic llvm-mc's interface.
> > > I strongly suggest to make this a separate fuzz target instead of using flags. 
> > 
> > I've preserved the original design for llvm-mc-fuzzer, apparently to imitate llvm-mc.
> > 
> > Pros/cons of the current design:
> > - pro: matches llvm-mc
> > - pro: changing focus to probe different paths only requires different command line args
> > - con: reproducing fuzzer configuration more difficult because it depends on those args
> > - con: libFuzzer might see the uncovered feature set as a goal for coverage (that we already know statically it cannot cover).
> > 
> > For that last one, it's speculation on my part.
> > 
> > Kostya, would you be satisfied with this as-is or should I decompose it into two fuzzers?  "Harder to automate" consists of "I must make sure that I can deliver the right command line args to the automation feature"?  Or "won't fit well in oss-fuzz" or something else?
> > I'm not sure what you mean here. What difficulties are you thinking of?
> 
> Imagine an automated system that runs continuous fuzzing (e.g. https://github.com/google/oss-fuzz).
> How are you going to tell it to run the same binary with two different flags and to treat those
> as two independent entities?
> Of course, it's possible to implement support for something like this, but OSS-Fuzz does not and will not support it. 
> (because of KISS: https://en.wikipedia.org/wiki/KISS_principle)
> 
> When analyzing the code coverage (manually, or automatically) there will be a huge lump of code that is never reached in one mode, i.e. this 2-in-1 bundle will confuse the analysis. 
> 
> Finally, at least in libFuzzer, part of the algorithm is linear by the size of the binary (more precisely: number of instrumented blocks) and so this bundled fuzzer will just be burning CPUs with no reason. 
> 
> 
> > FWIW, this is in line with my original intent which was to mimic llvm-mc's interface.
> Yes, and I objected back then :) 
> > I'm not sure what you mean here. What difficulties are you thinking of?
> Imagine an automated system that runs continuous fuzzing (e.g.
> https://github.com/google/oss-fuzz).
> How are you going to tell it to run the same binary with two different flags and to treat those
> as two independent entities?

I'm not familiar with oss-fuzz, but based on an initial glance through it I'm not sure how this differs from oss-fuzz/projects/curl/. That project uses pre-processor macros to select between different fuzzers.
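
For illustration, the general technique of selecting a fuzzer at compile time could look roughly like the sketch below. This is not curl's actual code. The macro FUZZ_DISASSEMBLE and the names ModeName and RunOneInput are hypothetical, and the bodies are placeholders for the real assembler/disassembler calls.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical macro: build once with -DFUZZ_DISASSEMBLE and once without
// to get two separate fuzz target binaries from the same source file.
#if defined(FUZZ_DISASSEMBLE)
static const char *const ModeName = "disassemble";
static int RunOneInput(const uint8_t *Data, size_t Size) {
  (void)Data; (void)Size; // placeholder for the disassembler path
  return 0;
}
#else
static const char *const ModeName = "assemble";
static int RunOneInput(const uint8_t *Data, size_t Size) {
  (void)Data; (void)Size; // placeholder for the assembler path
  return 0;
}
#endif

// libFuzzer entry point: each binary contains only the mode it was built for.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  return RunOneInput(Data, Size);
}
```

Each build is then a distinct binary that needs no flags at run time, at the cost of one compile per mode.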

To answer the question though: if I wanted to fuzz everything (assembler/disassembler, all arches, subarches, and feature combinations) in this kind of system, and the curl/llvm-mc-fuzzer way had been ruled out, I'd probably use the first few bytes of the data as the configuration and do a full setup/teardown in LLVMFuzzerTestOneInput().
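
A minimal sketch of that idea, under stated assumptions: the two-byte header layout, FuzzConfig, DecodeConfig, kNumArches, RunAssembler and RunDisassembler are all hypothetical names invented for illustration, and the two Run* functions are stubs standing in for the real per-input setup, run and teardown.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical configuration carried in the first bytes of each input.
enum class Action : uint8_t { Assemble, Disassemble };

struct FuzzConfig {
  Action Act;
  uint8_t ArchIndex; // index into an assumed table of triples/arches
};

constexpr size_t kHeaderSize = 2; // assumed: byte 0 = action, byte 1 = arch
constexpr uint8_t kNumArches = 4; // assumed size of the arch table

// Decodes the header; returns false if the input is too short to have one.
bool DecodeConfig(const uint8_t *Data, size_t Size, FuzzConfig &Out) {
  if (Size < kHeaderSize)
    return false;
  Out.Act = (Data[0] & 1) ? Action::Disassemble : Action::Assemble;
  Out.ArchIndex = Data[1] % kNumArches;
  return true;
}

// Stubs: a real version would build the MC objects for Cfg, run, and tear down.
static int RunAssembler(const FuzzConfig &, const uint8_t *, size_t) { return 0; }
static int RunDisassembler(const FuzzConfig &, const uint8_t *, size_t) { return 0; }

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  FuzzConfig Cfg;
  if (!DecodeConfig(Data, Size, Cfg))
    return 0; // nothing to do without a full header
  const uint8_t *Payload = Data + kHeaderSize;
  size_t PayloadSize = Size - kHeaderSize;
  if (Cfg.Act == Action::Assemble)
    return RunAssembler(Cfg, Payload, PayloadSize);
  return RunDisassembler(Cfg, Payload, PayloadSize);
}
```

The trade-off is that every corpus entry now carries its own configuration, so a single binary covers all combinations but pays the setup/teardown cost on every input.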

That said, I think that's a different kind of fuzzer to llvm-mc-fuzzer. It would aim to improve the quality of the LLVM project as a whole whereas llvm-mc-fuzzer was meant to help backend developers improve the quality of their particular targets and subtargets.

> Of course, it's possible to implement support for something like this, but OSS-Fuzz does not and
> will not support it. 
> (because of KISS: https://en.wikipedia.org/wiki/KISS_principle)

This principle is the reason this tool uses command line arguments for the action/triple/arch/subarch/features. Command line arguments were the simplest way to configure a particular target without having to re-compile for each combination. I included support for other archs/subarches/features because it made the original goal easier and also made the tool more useful to others.
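
As a rough illustration of run-time configuration without a custom main, the sketch below reads a flag in libFuzzer's optional LLVMFuzzerInitialize() hook. The flag spelling --action=..., the ActionKind enum and SelectedAction are hypothetical and are not the tool's real options. The sketch also assumes that libFuzzer leaves arguments it does not recognise in argv for the hook to inspect.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical action selector, set once at startup.
enum ActionKind { AK_Assemble, AK_Disassemble };
static ActionKind SelectedAction = AK_Assemble;

// Optional libFuzzer hook, called once before any inputs are run.
extern "C" int LLVMFuzzerInitialize(int *argc, char ***argv) {
  for (int I = 1; I < *argc; ++I) {
    const char *Arg = (*argv)[I];
    if (std::strcmp(Arg, "--action=disassemble") == 0)
      SelectedAction = AK_Disassemble;
    else if (std::strcmp(Arg, "--action=assemble") == 0)
      SelectedAction = AK_Assemble;
    // Anything else is left untouched.
  }
  return 0;
}

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
  (void)Data; (void)Size;
  // Placeholder: dispatch on SelectedAction to the real assemble or
  // disassemble routine here.
  return 0;
}
```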

> When analyzing the code coverage (manually, or automatically) there will be a huge lump of code
> that is never reached in one mode, i.e. this 2-in-1 bundle will confuse the analysis.

FWIW, this is also the case between arches/subarches/features. For example, on an X86 host using default options, the AArch64/ARM/Mips/etc. disassemblers are not tested.

> Finally, at least in libFuzzer, part of the algorithm is linear by the size of the binary (more precisely:
> number of instrumented blocks) and so this bundled fuzzer will just be burning CPUs with no
> reason.

That's a fair point.

> > FWIW, this is in line with my original intent which was to mimic llvm-mc's interface.
> Yes, and I objected back then :)

I remember you objected to having a custom main function that mangled the arguments before passing them on to libFuzzer and I fixed that. I didn't think there was an objection to command line arguments in general though.

If the objection was to command line arguments in general, is there a way to test an architecture in isolation from the others that's more in keeping with libFuzzer's style?

Repository:
  rL LLVM

https://reviews.llvm.org/D30156