[llvm] r247786 - llvm-mc-fuzzer: A fuzzing tool for the MC layer.

Kostya Serebryany via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 18 17:44:30 PDT 2015


Daniel,

One question related to llvm-mc-fuzzer.
When running as
   ./bin/llvm-mc-fuzzer -triple x86_64-linux-gnu -disassemble -fuzzer-args CORPUS -max_len=8
I quickly run into this:
==24687==ERROR: AddressSanitizer: SEGV on unknown address 0xf4360000606f (pc 0x7f5ef64a3cc9 bp 0x7ffc1682a750 sp 0x7ffc1682a5c8 T0)
    #0 0x7f5ef64a3cc8 in gsignal /build/buildd/eglibc-2.19/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:56
    #1 0x7f5ef64a70d7 in abort /build/buildd/eglibc-2.19/stdlib/abort.c:89
    #2 0xdd1bd8 in llvm::llvm_unreachable_internal(char const*, char const*, unsigned int) lib/Support/ErrorHandling.cpp:117:3
    #3 0xb12448 in translateImmediate lib/Target/X86/Disassembler/X86Disassembler.cpp:379:16
    #4 0xb12448 in translateOperand(llvm::MCInst&, llvm::X86Disassembler::OperandSpecifier const&, llvm::X86Disassembler::InternalInstruction&, llvm::MCDisassembler const*) lib/Target/X86/Disassembler/X86Disassembler.cpp:922
    #5 0xb0d09b in translateInstruction lib/Target/X86/Disassembler/X86Disassembler.cpp:981:11
    #6 0xb0d09b in llvm::X86Disassembler::X86GenericDisassembler::getInstruction(llvm::MCInst&, unsigned long&, llvm::ArrayRef<unsigned char>, unsigned long, llvm::raw_ostream&, llvm::raw_ostream&) const lib/Target/X86/Disassembler/X86Disassembler.cpp:160
    #7 0xd3055b in LLVMDisasmInstruction lib/MC/MCDisassembler/Disassembler.cpp:253:7
    #8 0x5162a6 in DisassembleOneInput(unsigned char const*, unsigned long) tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp:71:16

But if I try to feed the crashy input into llvm-mc, nothing interesting
happens:

% ./bin/llvm-mc -triple x86_64-linux-gnu -disassemble < crash-e3c8c95134622581ba71de8274406456dafef3b3
.text
<stdin>:1:1: error: invalid input token
b�,X�

So, how do I invoke llvm-mc to make it behave close to what llvm-mc-fuzzer
is doing?
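
One possible explanation, not verified here: the patch describes -disassemble as "Disassemble strings of hex bytes", so llvm-mc presumably expects text such as "0x62 0x2c" on stdin, while the fuzzer passes the raw bytes straight to LLVMDisasmInstruction. A minimal sketch of the conversion under that assumption follows; the helper name BytesToHexText is hypothetical and is not part of llvm-mc or libFuzzer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: render raw bytes as space-separated "0xNN" tokens,
// the textual form that a hex-byte disassembler front end would read.
std::string BytesToHexText(const std::vector<uint8_t> &Bytes) {
  std::string Out;
  char Buf[8]; // "0xff" plus the terminating NUL fits comfortably.
  for (std::size_t I = 0; I < Bytes.size(); ++I) {
    std::snprintf(Buf, sizeof(Buf), "0x%02x", static_cast<unsigned>(Bytes[I]));
    if (I != 0)
      Out += ' ';
    Out += Buf;
  }
  return Out;
}
```

Writing the result for the crash file's contents to a text file and piping that file into llvm-mc with the same -triple and -disassemble options should feed the disassembler the same bytes, although the two tools may still differ in other setup.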




On Thu, Sep 17, 2015 at 5:32 PM, Kostya Serebryany <kcc at google.com> wrote:

>
>
> On Thu, Sep 17, 2015 at 2:38 AM, Daniel Sanders <Daniel.Sanders at imgtec.com
> > wrote:
>
>> > I forgot to ask you to document the fuzzer at
>> http://llvm.org/docs/LibFuzzer.html#fuzzing-components-of-llvm
>>
>>
>>
>> Will do
>>
>>
>>
>> > One problem: with the current structure of flags libFuzzer's -jobs=10
>> does not work...
>>
>> > Thoughts?
>>
>>
>>
>> Hmm. I see why that happens, each spawned thread is calling system() to
>> spawn a subprocess and that system() call is given a command built from the
>> fuzzer config. The resulting command lacks any of the non-fuzzer args and
>> so the child llvm-mc-fuzzer is trying to parse arguments meant for the
>> underlying fuzzer. Why does it spawn a subprocess from the worker thread
>> instead of doing the work directly inside the worker thread? Am I right in
>> thinking that it's to stop a crash in one job from killing everything?
>>
>>
>>
>> I can think of four options:
>>
>> 1.      fork() the new process instead of using system(). After the
>> fork, the child should remove the effects of -jobs by setting it to 0 and
>> reopen its stdout/stderr to achieve the same effect. This removes the need
>> to reconstruct and reparse the command line since fork() will duplicate the
>> result of the parse in the child process. Unfortunately, I don't think
>> there's a direct Windows equivalent to this outside of Cygwin.
>>
>> 2.       Separate fuzzer option parsing from the driver call. I'm
>> thinking something along the lines of this quick sketch:
>>             FlagDescription *Config =
>> FuzzerDriver::ParseFlags(FuzzerArgv);
>>             return FuzzerDriver::FuzzerDriver(argv, Config,
>> DisassembleOneInput);
>> That would allow argv to differ from the options the fuzzer understands
>> which are in FuzzerArgv.
>>
>> 3.       Make it possible to extend the fuzzer option parsing. The
>> CommandLine library can do this nicely but you probably don't want the
>> additional dependency in libFuzzer. llvm-mc-fuzzer could always change to
>> libFuzzer's approach to command line parsing.
>>
>> 4.      Make it possible to modify the command before the system() call.
>> The client of libFuzzer could install a callback that allows it to modify a
>> std::vector containing the desired Argv.
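
Option 4 could look roughly like the sketch below. It is only an illustration: ArgvCallback, BuildChildCommand and AddTargetArgs are hypothetical names, none of them exist in libFuzzer, and the command is joined without any shell quoting.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical hook type: the client may edit the child's argv in place.
using ArgvCallback = std::function<void(std::vector<std::string> &)>;

// Hypothetical stand-in for the code that builds the string handed to
// system(). The callback runs first so that the client can restore options
// the fuzzer does not know about. No shell quoting is attempted.
std::string BuildChildCommand(std::vector<std::string> Argv,
                              const ArgvCallback &Fixup) {
  if (Fixup)
    Fixup(Argv);
  std::string Cmd;
  for (const std::string &Arg : Argv) {
    if (!Cmd.empty())
      Cmd += ' ';
    Cmd += Arg;
  }
  return Cmd;
}

// Example client callback: put the target options back, right after the
// program name and ahead of the fuzzer's own flags.
void AddTargetArgs(std::vector<std::string> &Argv) {
  const std::vector<std::string> Target = {"-triple", "x86_64-linux-gnu",
                                           "-disassemble", "-fuzzer-args"};
  auto Pos = Argv.empty() ? Argv.begin() : Argv.begin() + 1;
  Argv.insert(Pos, Target.begin(), Target.end());
}
```

With this shape the fuzzer library never needs to understand the target options; it only has to call the hook before spawning each job.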
>>
>
> I frankly like none of these, will need to think about it more...
> It's probably not urgent for this particular fuzzer -- llvm-mc has pretty
> small inputs and we can fuzz lots out of it in a single process.
> But will need to figure out for future uses like this.
> Maybe,
>   5. Add a libFuzzer option -target_options=-option1,param,-option2
> and run llvm-mc-fuzzer like "./bin/llvm-mc-fuzzer
>  -target_options=-triple,x86_64-linux-gnu,-disassemble"
>
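
A rough sketch of how the value of such a -target_options flag might be split back into separate arguments. The flag is only a proposal at this point, and SplitTargetOptions is a hypothetical name; this version does not handle options whose values themselves contain commas (such as a -mattr list).

```cpp
#include <string>
#include <vector>

// Hypothetical helper: split "-triple,x86_64-linux-gnu,-disassemble" into
// {"-triple", "x86_64-linux-gnu", "-disassemble"}. Empty fields are dropped.
std::vector<std::string> SplitTargetOptions(const std::string &Value) {
  std::vector<std::string> Args;
  std::string::size_type Start = 0;
  while (Start <= Value.size()) {
    std::string::size_type Comma = Value.find(',', Start);
    if (Comma == std::string::npos)
      Comma = Value.size();
    if (Comma > Start)
      Args.push_back(Value.substr(Start, Comma - Start));
    Start = Comma + 1;
  }
  return Args;
}
```

The resulting vector could then be handed to the target's own option parser, so the fuzzer would only ever see one opaque flag.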
> BTW, I've found one llvm_unreachable with -triple x86_64-linux-gnu
> already... will file a bug.
>
>
>
> --kcc
>
>
>>
>>
>> If all OS's had fork() then I'd favour #1 but Windows rules that out. Out
>> of the rest #2 seems the most flexible but #3/#4 are simpler. What's
>> your opinion?
>>
>>
>>
>> *From:* Kostya Serebryany [mailto:kcc at google.com]
>> *Sent:* 17 September 2015 05:38
>> *To:* Daniel Sanders
>> *Cc:* LLVM Commits
>> *Subject:* Re: [llvm] r247786 - llvm-mc-fuzzer: A fuzzing tool for the
>> MC layer.
>>
>>
>>
>> One problem: with the current structure of flags libFuzzer's -jobs=10
>> does not work...
>>
>> Thoughts?
>>
>>
>>
>> On Wed, Sep 16, 2015 at 9:25 PM, Kostya Serebryany <kcc at google.com>
>> wrote:
>>
>> Cool! I'll add it to the bot when time permits.
>>
>> I forgot to ask you to document the fuzzer
>>
>> at http://llvm.org/docs/LibFuzzer.html#fuzzing-components-of-llvm
>>
>> Feel free to do it w/o prior review.
>>
>>
>>
>> On Wed, Sep 16, 2015 at 4:49 AM, Daniel Sanders via llvm-commits <
>> llvm-commits at lists.llvm.org> wrote:
>>
>> Author: dsanders
>> Date: Wed Sep 16 06:49:49 2015
>> New Revision: 247786
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=247786&view=rev
>> Log:
>> llvm-mc-fuzzer: A fuzzing tool for the MC layer.
>>
>> Summary:
>> Only the disassembler is supported in this patch but it has already found
>> a few
>> issues in the Mips disassembler (mostly invalid instructions being
>> successfully
>> disassembled).
>>
>> Reviewers: kcc
>>
>> Subscribers: russell.gallop, silvas, kcc, llvm-commits
>>
>> Differential Revision: http://reviews.llvm.org/D12723
>>
>> Added:
>>     llvm/trunk/tools/llvm-mc-fuzzer/
>>     llvm/trunk/tools/llvm-mc-fuzzer/CMakeLists.txt
>>     llvm/trunk/tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp
>> Modified:
>>     llvm/trunk/docs/LibFuzzer.rst
>>
>> Modified: llvm/trunk/docs/LibFuzzer.rst
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/LibFuzzer.rst?rev=247786&r1=247785&r2=247786&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/docs/LibFuzzer.rst (original)
>> +++ llvm/trunk/docs/LibFuzzer.rst Wed Sep 16 06:49:49 2015
>> @@ -453,7 +453,14 @@ Trophies
>>
>>    * llvm-as: https://llvm.org/bugs/show_bug.cgi?id=24639
>>
>> -
>> +  * Disassembler:
>> +    * Mips: Discovered a number of untested instructions for the Mips target
>> +      (see valid-mips*.s in http://reviews.llvm.org/rL247405,
>> +      http://reviews.llvm.org/rL247414, http://reviews.llvm.org/rL247416,
>> +      http://reviews.llvm.org/rL247417, http://reviews.llvm.org/rL247420,
>> +      and http://reviews.llvm.org/rL247422) as well some instructions that
>> +      successfully disassembled on ISA's where they were not valid (see
>> +      invalid-xfail.s files in the same commits).
>>
>>  .. _pcre2: http://www.pcre.org/
>>
>>
>> Added: llvm/trunk/tools/llvm-mc-fuzzer/CMakeLists.txt
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-mc-fuzzer/CMakeLists.txt?rev=247786&view=auto
>>
>> ==============================================================================
>> --- llvm/trunk/tools/llvm-mc-fuzzer/CMakeLists.txt (added)
>> +++ llvm/trunk/tools/llvm-mc-fuzzer/CMakeLists.txt Wed Sep 16 06:49:49
>> 2015
>> @@ -0,0 +1,18 @@
>> +if( LLVM_USE_SANITIZE_COVERAGE )
>> +  include_directories(BEFORE
>> +    ${CMAKE_CURRENT_SOURCE_DIR}/../../lib/Fuzzer)
>> +
>> +  set(LLVM_LINK_COMPONENTS
>> +      AllTargetsDescs
>> +      AllTargetsDisassemblers
>> +      AllTargetsInfos
>> +      MC
>> +      MCDisassembler
>> +      Support
>> +      )
>> +  add_llvm_tool(llvm-mc-fuzzer
>> +                llvm-mc-fuzzer.cpp)
>> +  target_link_libraries(llvm-mc-fuzzer
>> +                        LLVMFuzzerNoMain
>> +                        )
>> +endif()
>>
>> Added: llvm/trunk/tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp?rev=247786&view=auto
>>
>> ==============================================================================
>> --- llvm/trunk/tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp (added)
>> +++ llvm/trunk/tools/llvm-mc-fuzzer/llvm-mc-fuzzer.cpp Wed Sep 16
>> 06:49:49 2015
>> @@ -0,0 +1,129 @@
>> +//===--- llvm-mc-fuzzer.cpp - Fuzzer for the MC layer ---------------------===//
>> +//
>> +//                     The LLVM Compiler Infrastructure
>> +//
>> +// This file is distributed under the University of Illinois Open Source
>> +// License. See LICENSE.TXT for details.
>> +//
>>
>> +//===----------------------------------------------------------------------===//
>> +//
>>
>> +//===----------------------------------------------------------------------===//
>> +
>> +#include "llvm-c/Disassembler.h"
>> +#include "llvm-c/Target.h"
>> +#include "llvm/ADT/ArrayRef.h"
>> +#include "llvm/MC/SubtargetFeature.h"
>> +#include "llvm/Support/CommandLine.h"
>> +#include "llvm/Support/raw_ostream.h"
>> +#include "FuzzerInterface.h"
>> +
>> +using namespace llvm;
>> +
>> +const unsigned AssemblyTextBufSize = 80;
>> +
>> +enum ActionType {
>> +  AC_Assemble,
>> +  AC_Disassemble
>> +};
>> +
>> +static cl::opt<ActionType>
>> +Action(cl::desc("Action to perform:"),
>> +       cl::init(AC_Assemble),
>> +       cl::values(clEnumValN(AC_Assemble, "assemble",
>> +                             "Assemble a .s file (default)"),
>> +                  clEnumValN(AC_Disassemble, "disassemble",
>> +                             "Disassemble strings of hex bytes"),
>> +                  clEnumValEnd));
>> +
>> +static cl::opt<std::string>
>> +    TripleName("triple", cl::desc("Target triple to assemble for, "
>> +                                  "see -version for available targets"));
>> +
>> +static cl::opt<std::string>
>> +    MCPU("mcpu",
>> +         cl::desc("Target a specific cpu type (-mcpu=help for details)"),
>> +         cl::value_desc("cpu-name"), cl::init(""));
>> +
>> +static cl::list<std::string>
>> +    MAttrs("mattr", cl::CommaSeparated,
>> +           cl::desc("Target specific attributes (-mattr=help for details)"),
>> +           cl::value_desc("a1,+a2,-a3,..."));
>> +// The feature string derived from -mattr's values.
>> +std::string FeaturesStr;
>> +
>> +static cl::list<std::string>
>> +    FuzzerArgv("fuzzer-args", cl::Positional,
>> +               cl::desc("Options to pass to the fuzzer"), cl::ZeroOrMore,
>> +               cl::PositionalEatsArgs);
>> +
>> +void DisassembleOneInput(const uint8_t *Data, size_t Size) {
>> +  char AssemblyText[AssemblyTextBufSize];
>> +
>> +  std::vector<uint8_t> DataCopy(Data, Data + Size);
>> +
>> +  LLVMDisasmContextRef Ctx = LLVMCreateDisasmCPUFeatures(
>> +      TripleName.c_str(), MCPU.c_str(), FeaturesStr.c_str(), nullptr, 0,
>> +      nullptr, nullptr);
>> +  assert(Ctx);
>> +  uint8_t *p = DataCopy.data();
>> +  unsigned Consumed;
>> +  do {
>> +    Consumed = LLVMDisasmInstruction(Ctx, p, Size, 0, AssemblyText,
>> +                                     AssemblyTextBufSize);
>> +    Size -= Consumed;
>> +    p += Consumed;
>> +  } while (Consumed != 0);
>> +  LLVMDisasmDispose(Ctx);
>> +}
>> +
>> +int main(int argc, char **argv) {
>> +  // The command line is unusual compared to other fuzzers due to the need to
>> +  // specify the target. Options like -triple, -mcpu, and -mattr work like
>> +  // their counterparts in llvm-mc, while -fuzzer-args collects options for the
>> +  // fuzzer itself.
>> +  //
>> +  // Examples:
>> +  //
>> +  // Fuzz the big-endian MIPS32R6 disassembler using 100,000 inputs of up to
>> +  // 4-bytes each and use the contents of ./corpus as the test corpus:
>> +  //   llvm-mc-fuzzer -triple mips-linux-gnu -mcpu=mips32r6 -disassemble \
>> +  //       -fuzzer-args -max_len=4 -runs=100000 ./corpus
>> +  //
>> +  // Infinitely fuzz the little-endian MIPS64R2 disassembler with the MSA
>> +  // feature enabled using up to 64-byte inputs:
>> +  //   llvm-mc-fuzzer -triple mipsel-linux-gnu -mcpu=mips64r2 -mattr=msa \
>> +  //       -disassemble -fuzzer-args ./corpus
>> +  //
>> +  // If your aim is to find instructions that are not tested, then it is
>> +  // advisable to constrain the maximum input size to a single instruction
>> +  // using -max_len as in the first example. This results in a test corpus of
>> +  // individual instructions that test unique paths. Without this constraint,
>> +  // there will be considerable redundancy in the corpus.
>> +
>> +  LLVMInitializeAllTargetInfos();
>> +  LLVMInitializeAllTargetMCs();
>> +  LLVMInitializeAllDisassemblers();
>> +
>> +  cl::ParseCommandLineOptions(argc, argv);
>> +
>> +  // Package up features to be passed to target/subtarget
>> +  // We have to pass it via a global since the callback doesn't
>> +  // permit any user data.
>> +  if (MAttrs.size()) {
>> +    SubtargetFeatures Features;
>> +    for (unsigned i = 0; i != MAttrs.size(); ++i)
>> +      Features.AddFeature(MAttrs[i]);
>> +    FeaturesStr = Features.getString();
>> +  }
>> +
>> +  // Insert the program name into the FuzzerArgv.
>> +  FuzzerArgv.insert(FuzzerArgv.begin(), argv[0]);
>> +
>> +  if (Action == AC_Assemble)
>> +    errs() << "error: -assemble is not implemented\n";
>> +  else if (Action == AC_Disassemble)
>> +    return fuzzer::FuzzerDriver(FuzzerArgv, DisassembleOneInput);
>> +
>> +  llvm_unreachable("Unknown action");
>> +  return 1;
>> +}
>>
>>