[llvm-dev] Building LLVM's fuzzers

Kostya Serebryany via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 15 16:26:36 PDT 2017


Two wrap up this discussion I've filed
https://bugs.llvm.org/show_bug.cgi?id=34636 (for Matt)

On Mon, Sep 11, 2017 at 6:57 PM, Kostya Serebryany <kcc at google.com> wrote:

>
>
> On Mon, Sep 11, 2017 at 11:52 AM, Justin Bogner <mail at justinbogner.com>
> wrote:
>
>> Kostya Serebryany <kcc at google.com> writes:
>> > Justin,
>>
>> > Calling appendToUsed has horrible complexity and if we call it in
>> > every function clang consumes tons of memory (6Gb when compiling one
>> > of the clang's source files).  This killed my machine today :)
>> >
>> > The solution is to call appendToUsed once per module, instead of once
>> > per function.
>>
>> Oh right, updating lists in metadata is O(n), so doing it per function
>> is quadratic. This slipped my mind - sorry!
>>
>
> Yea. Very unexpected (although I've stumbled on it a few times, I still
> don't remember about it)
>
>
>>
>> > Also, since this does not seem to be required for linux, I've put this
>> > under if TargetTriple.isOSBinFormatMachO Submitted r312855, I'll see
>> > if this breaks Mac (there seem to be no llvm tests for this, only
>> > compiler-rt tests) but please also check if this looks ok.
>>
>> This looks fine, though I'd rather if we just did it on all platforms
>> for consistency / clear semantic intent. Running appendToUsed once
>> should be cheap, and we do it unguarded in our other instrumentation
>> (like InstrProfiling.cpp).
>>
>
> It's unclear if it's going to hurt things on linux.
> If I see it doesn't -- I'll remove the if TargetTriple.isOSBinFormatMachO
>
>
>>
>> > But this all still sounds bad on linux at least:
>> >   * with the old bfd linker and -ffunction-sections -Wl,-gc-sections
>> these
>> >     arrays get removed (as discussed here)
>>
>> This is sad, and I don't think we have any particularly good ideas to
>> fix this.
>>
>> >   * with newer linkers the sanitizer coverage essentially disables
>> >     gc-sections
>>
>> I'm not sure I follow. We're making sure the global arrays /
>> instrumentation data doesn't get dead stripped here, but dead stripping
>> of functions should work as normal.
>>
>
> Ah, that's now my confusion.
> inline-8bit-counters and trace-pc-guard seems to work just fine.
> pc-table actually does break gc-sections as there is no a reference to
> every function in the pc-table section.
>
> +eugenis, who dealt with a similar issue in asan (although it's probably a
> separate topic now).
>
> --kcc
>
>
>
>
>>
>> > --kcc
>> >
>> >
>> > On Thu, Aug 24, 2017 at 6:43 PM, Peter Collingbourne <peter at pcc.me.uk>
>> > wrote:
>> >
>> >>
>> >>
>> >> On Thu, Aug 24, 2017 at 6:30 PM, Justin Bogner <mail at justinbogner.com>
>> >> wrote:
>> >>
>> >>> Peter Collingbourne <peter at pcc.me.uk> writes:
>> >>> > On Thu, Aug 24, 2017 at 3:38 PM, Kostya Serebryany <kcc at google.com>
>> >>> wrote:
>> >>> >
>> >>> >>
>> >>> >>
>> >>> >> On Thu, Aug 24, 2017 at 3:35 PM, Peter Collingbourne <
>> peter at pcc.me.uk>
>> >>> >> wrote:
>> >>> >>
>> >>> >>> On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <
>> >>> >>> llvm-dev at lists.llvm.org> wrote:
>> >>> >>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <
>> >>> mail at justinbogner.com>
>> >>> >>>> wrote:
>> >>> >>>>
>> >>> >>>>> I think the simplest fix is something like this:
>> >>> >>>>>
>> >>> >>>>> diff --git a/lib/Transforms/Instrumentati
>> on/SanitizerCoverage.cpp
>> >>> >>>>> b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>> >>> >>>>> index c6f0d17f8fe..e81957ab80a 100644
>> >>> >>>>> --- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>> >>> >>>>> +++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>> >>> >>>>> @@ -256,6 +256,7 @@ SanitizerCoverageModule::Creat
>> >>> eSecStartEnd(Module
>> >>> >>>>> &M, const char *Section,
>> >>> >>>>>        new GlobalVariable(M, Ty, false,
>> >>> GlobalVariable::ExternalLinkag
>> >>> >>>>> e,
>> >>> >>>>>                           nullptr, getSectionEnd(Section));
>> >>> >>>>>    SecEnd->setVisibility(GlobalValue::HiddenVisibility);
>> >>> >>>>> +  appendToUsed(M, {SecStart, SecEnd});
>> >>> >>>>>
>> >>> >>>>>    return std::make_pair(SecStart, SecEnd);
>> >>> >>>>>  }
>> >>> >>>>>
>> >>> >>>>> I'm trying it out now.
>> >>> >>>>>
>> >>> >>>>
>> >>> >>>> LGTM (if this works), thanks!
>> >>> >>>>
>> >>> >>>
>> >>> >>> I wouldn't expect that to work because for ELF targets llvm.used
>> has
>> >>> no
>> >>> >>> effect on the object file (only on the optimizer).
>> >>> >>>
>> >>> >>> Is there a simple way to reproduce the link failure?
>> >>> >>>
>> >>> >>
>> >>> >>
>> >>> >> ninja compiler-rt
>> >>> >> echo 'extern "C" int LLVMFuzzerTestOneInput(const unsigned char *a,
>> >>> >> unsigned long b){return 0; } ' > test.cc
>> >>> >> clang -O3 test.cc   -fsanitize=fuzzer # works
>> >>> >> clang -O3 test.cc  -Wl,-gc-sections -fsanitize=fuzzer # fails
>> >>> >>
>> >>> >
>> >>> > It seems that the issue is that older versions of ld.bfd have a bug
>> >>> which
>> >>> > causes it not to define __start_ and __stop_ symbols if the only
>> >>> reference
>> >>> > to those symbols is from a constructor.
>> >>>
>> >>> It looks like this is a different problem from the one on macOS (and I
>> >>> wasn't able to reproduce it with any bfd ld I had available, they were
>> >>> all too new)
>> >>>
>> >>> I've gone ahead and fixed the issue on macOS in r311742.
>> >>>
>> >>> > If I add an artificial reference to the start symbol from
>> libfuzzer's
>> >>> main
>> >>> > function, the program links correctly.
>> >>> >
>> >>> > diff --git a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
>> >>> > b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
>> >>> > index af8657200be2..c41e28e012db 100644
>> >>> > --- a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
>> >>> > +++ b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
>> >>> > @@ -16,6 +16,10 @@ extern "C" {
>> >>> >  int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);
>> >>> >  }  // extern "C"
>> >>> >
>> >>> > +__attribute__((weak)) void nop(void *p) {}
>> >>> > +extern void *__start___sancov_pcs;
>> >>> > +
>> >>> >  int main(int argc, char **argv) {
>> >>> > +  nop(__start___sancov_pcs);
>> >>> >    return fuzzer::FuzzerDriver(&argc, &argv,
>> LLVMFuzzerTestOneInput);
>> >>> >  }
>> >>>
>> >>> If we were to do this, we'd have to guard it appropriately - not all
>> >>> platforms name the __start symbols like this.
>> >>>
>> >>
>> >> Of course. There's also the issue of how to keep the symbols alive in
>> DSOs.
>> >>
>> >> > The problem also goes away if I use "GNU ld (GNU Binutils)
>> >>> > 2.28.51.20170105".
>> >>>
>> >>> 2.27 also doesn't have the issue. I don't know what our minimum
>> version
>> >>> of binutils is, and I'm under the impression most people use gold or
>> lld
>> >>> to link LLVM these days, so it isn't clear to me how big of a problem
>> >>> this is.
>> >>>
>> >>
>> >> For the record, the problem reproduces under 2.24, which is shipped by
>> >> Ubuntu 14.04 LTS, which isn't that old. My view is that if we can find
>> an
>> >> unintrusive enough workaround, we should deploy it (with a comment to
>> >> remove it after N years).
>> >>
>> >> Peter
>> >>
>> >>
>> >>> > Peter
>> >>> >
>> >>> >
>> >>> >
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >>>
>> >>> >>> Peter
>> >>> >>>
>> >>> >>>
>> >>> >>>>
>> >>> >>>>>
>> >>> >>>>> Kostya Serebryany <kcc at google.com> writes:
>> >>> >>>>> > With -Wl,-gc-sections I get this:
>> >>> >>>>> > SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]
>> >>> +0x1b):
>> >>> >>>>> > undefined reference to `__start___sancov_pcs'
>> >>> >>>>> > SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]
>> >>> +0x20):
>> >>> >>>>> > undefined reference to `__stop___sancov_pcs'
>> >>> >>>>> >
>> >>> >>>>> >
>> >>> >>>>> >
>> >>> >>>>> > On Thu, Aug 24, 2017 at 3:07 PM, George Karpenkov <
>> >>> >>>>> ekarpenkov at apple.com>
>> >>> >>>>> > wrote:
>> >>> >>>>> >
>> >>> >>>>> >>
>> >>> >>>>> >> On Aug 24, 2017, at 2:55 PM, Kostya Serebryany <
>> kcc at google.com>
>> >>> >>>>> wrote:
>> >>> >>>>> >>
>> >>> >>>>> >> Interesting.
>> >>> >>>>> >> This is a relatively new addition
>> (fsanitize-coverage=pc-tables,
>> >>> >>>>> which is
>> >>> >>>>> >> now a part of -fsanitize=fuzzer).
>> >>> >>>>> >> The tests worked (did they? On Mac?) so I thought everything
>> is
>> >>> ok.
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> For tests we never compile the tested target with -O3 (and
>> that
>> >>> >>>>> wouldn’t
>> >>> >>>>> >> be sufficient),
>> >>> >>>>> >> and for testing fuzzers I was always building them in debug
>> >>> >>>>> >>
>> >>> >>>>> >> Yea, we need to make sure the pc-tables are not stripped
>> (this
>> >>> is a
>> >>> >>>>> >> separate section with globals).
>> >>> >>>>> >> (I still haven't documented pc-tables, will do soon)
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> Do you know what's the analog of Wl,-dead_strip on Linux?
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> Apparently -Wl,—gc-sections.
>> >>> >>>>> >> For some reason LLVM does not do it for gold, even though it
>> >>> seems to
>> >>> >>>>> >> support this flag as well.
>> >>> >>>>> >> (that could be another reason why you don’t see the failure
>> on
>> >>> Linux)
>> >>> >>>>> >>
>> >>> >>>>> >>  1 *if*(NOT LLVM_NO_DEAD_STRIP)
>> >>> >>>>> >>  2   *if*(${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
>> >>> >>>>> >>  3     # ld64's implementation of -dead_strip breaks tools
>> that
>> >>> use
>> >>> >>>>> >> plugins.
>> >>> >>>>> >>  4     set_property(TARGET ${target_name} APPEND_STRING
>> PROPERTY
>> >>> >>>>> >>  5                  LINK_FLAGS " -Wl,-dead_strip")
>> >>> >>>>> >>  6   *elseif*(${CMAKE_SYSTEM_NAME} MATCHES "SunOS")
>> >>> >>>>> >>  7     set_property(TARGET ${target_name} APPEND_STRING
>> PROPERTY
>> >>> >>>>> >>  8                  LINK_FLAGS " -Wl,-z
>> >>> -Wl,discard-unused=sections")
>> >>> >>>>> >>  9   *elseif*(NOT WIN32 AND NOT LLVM_LINKER_IS_GOLD)
>> >>> >>>>> >> 10     # Object files are compiled with
>> -ffunction-data-sections.
>> >>> >>>>> >> 11     # Versions of bfd ld < 2.23.1 have a bug in
>> --gc-sections
>> >>> that
>> >>> >>>>> >> breaks
>> >>> >>>>> >> 12     # tools that use plugins. Always pass --gc-sections
>> once
>> >>> we
>> >>> >>>>> require
>> >>> >>>>> >> 13     # a newer linker.
>> >>> >>>>> >> 14     set_property(TARGET ${target_name} APPEND_STRING
>> PROPERTY
>> >>> >>>>> >> 15                  LINK_FLAGS " -Wl,--gc-sections")
>> >>> >>>>> >> 16   *endif*()
>> >>> >>>>> >> 17 *endif*()
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> --kcc
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> On Thu, Aug 24, 2017 at 2:49 PM, Justin Bogner <
>> >>> >>>>> mail at justinbogner.com>
>> >>> >>>>> >> wrote:
>> >>> >>>>> >>
>> >>> >>>>> >>> George Karpenkov <ekarpenkov at apple.com> writes:
>> >>> >>>>> >>> > OK so with Kuba’s help I’ve found the error: with
>> >>> optimization,
>> >>> >>>>> dead
>> >>> >>>>> >>> > stripping of produced libraries is enabled,
>> >>> >>>>> >>> > which removes coverage instrumentation.
>> >>> >>>>> >>> >
>> >>> >>>>> >>> > However, this has nothing to do with the move to
>> compiler-rt,
>> >>> so
>> >>> >>>>> I’m
>> >>> >>>>> >>> > quite skeptical on whether it has worked
>> >>> >>>>> >>> > beforehand.
>> >>> >>>>> >>> >
>> >>> >>>>> >>> > A trivial fix is to do:
>> >>> >>>>> >>> >
>> >>> >>>>> >>> > diff --git a/cmake/modules/HandleLLVMOptions.cmake
>> >>> >>>>> >>> b/cmake/modules/HandleLLVMOptions.cmake
>> >>> >>>>> >>> > index 04596a6ff63..5465d8d95ba 100644
>> >>> >>>>> >>> > --- a/cmake/modules/HandleLLVMOptions.cmake
>> >>> >>>>> >>> > +++ b/cmake/modules/HandleLLVMOptions.cmake
>> >>> >>>>> >>> > @@ -665,6 +665,9 @@ if(LLVM_USE_SANITIZER)
>> >>> >>>>> >>> >    endif()
>> >>> >>>>> >>> >    if (LLVM_USE_SANITIZE_COVERAGE)
>> >>> >>>>> >>> >      append("-fsanitize=fuzzer-no-link" CMAKE_C_FLAGS
>> >>> >>>>> CMAKE_CXX_FLAGS)
>> >>> >>>>> >>> > +
>> >>> >>>>> >>> > +    # Dead stripping messes up coverage instrumentation.
>> >>> >>>>> >>> > +    set(LLVM_NO_DEAD_STRIP ON)
>> >>> >>>>> >>> >    endif()
>> >>> >>>>> >>> >  endif()
>> >>> >>>>> >>> >
>> >>> >>>>> >>> > Any arguments against that?
>> >>> >>>>> >>>
>> >>> >>>>> >>> We shouldn't do this. We really only want to prevent dead
>> >>> stripping
>> >>> >>>>> of
>> >>> >>>>> >>> the counters themselves - disabling it completely isn't very
>> >>> nice.
>> >>> >>>>> >>>
>> >>> >>>>> >>> > Apparently, a better way is to follow ASAN instrumentation
>> >>> pass,
>> >>> >>>>> >>> > which uses some magic to protect against dead-stripping.
>> >>> >>>>> >>>
>> >>> >>>>> >>> I thought this was already being done - how else did it work
>> >>> before?
>> >>> >>>>> >>>
>> >>> >>>>> >>> >> On Aug 24, 2017, at 11:29 AM, Justin Bogner <
>> >>> >>>>> mail at justinbogner.com>
>> >>> >>>>> >>> wrote:
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >> (kcc, george: sorry for the re-send, the first was from a
>> >>> >>>>> non-list
>> >>> >>>>> >>> email
>> >>> >>>>> >>> >> address)
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >> My configuration for building the fuzzers in the LLVM
>> tree
>> >>> >>>>> doesn't
>> >>> >>>>> >>> seem to
>> >>> >>>>> >>> >> work any more (possibly as of moving libFuzzer to
>> >>> compiler-rt,
>> >>> >>>>> but
>> >>> >>>>> >>> there
>> >>> >>>>> >>> >> have been a few other changes in the last week or so that
>> >>> may be
>> >>> >>>>> >>> related).
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >> I'm building with a fresh top-of-tree clang and setting
>> >>> >>>>> >>> >> -DLLVM_USE_SANITIZER=Address and
>> >>> -DLLVM_USE_SANITIZE_COVERAGE=O
>> >>> >>>>> n,
>> >>> >>>>> >>> which
>> >>> >>>>> >>> >> was working before:
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >>  % cmake -GNinja \
>> >>> >>>>> >>> >>          -DCMAKE_BUILD_TYPE=Release
>> >>> -DLLVM_ENABLE_ASSERTIONS=On \
>> >>> >>>>> >>> >>          -DLLVM_ENABLE_WERROR=On \
>> >>> >>>>> >>> >>          -DLLVM_USE_SANITIZER=Address
>> >>> >>>>> -DLLVM_USE_SANITIZE_COVERAGE=On
>> >>> >>>>> >>> \
>> >>> >>>>> >>> >>          -DCMAKE_C_COMPILER=$HOME/llvm-lkgc/bin/clang \
>> >>> >>>>> >>> >>          $HOME/code/llvm-src
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >> But when I run any of the fuzzers, it looks like the
>> >>> sanitizer
>> >>> >>>>> coverage
>> >>> >>>>> >>> >> hasn't been set up correctly:
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >>  % ./bin/llvm-as-fuzzer
>> >>> >>>>> >>>                                    2017-08-24 11:14:33
>> >>> >>>>> >>> >>  INFO: Seed: 4089166883 <(408)%20916-6883>
>> >>> >>>>> >>> >>  INFO: Loaded 1 modules   (50607 guards): 50607
>> [0x10e14ef80,
>> >>> >>>>> >>> 0x10e18063c),
>> >>> >>>>> >>> >>  INFO: Loaded 1 PC tables (0 PCs): 0
>> >>> [0x10e2870a8,0x10e2870a8),
>> >>> >>>>> >>> >>  ERROR: The size of coverage PC tables does not match the
>> >>> number
>> >>> >>>>> of
>> >>> >>>>> >>> instrumented PCs. This might be a bug in the compiler,
>> please
>> >>> >>>>> contact the
>> >>> >>>>> >>> libFuzzer developers.
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >> From the build logs, it looks like we're now building
>> objects
>> >>> >>>>> with
>> >>> >>>>> >>> these
>> >>> >>>>> >>> >> sanitizer flags:
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >>  -fsanitize=address
>> >>> >>>>> >>> >>  -fsanitize-address-use-after-scope
>> >>> >>>>> >>> >>  -fsanitize=fuzzer-no-link
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >> We're then linking the fuzzer binaries with these:
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >>  -fsanitize=address
>> >>> >>>>> >>> >>  -fsanitize-address-use-after-scope
>> >>> >>>>> >>> >>  -fsanitize=fuzzer-no-link
>> >>> >>>>> >>> >>  -fsanitize=fuzzer
>> >>> >>>>> >>> >>
>> >>> >>>>> >>> >> Any idea what's wrong or where to start looking?
>> >>> >>>>> >>>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> _______________________________________________
>> >>> >>>> LLVM Developers mailing list
>> >>> >>>> llvm-dev at lists.llvm.org
>> >>> >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >>> >>>>
>> >>> >>>>
>> >>> >>>
>> >>> >>>
>> >>> >>> --
>> >>> >>> --
>> >>> >>> Peter
>> >>> >>>
>> >>> >>
>> >>> >>
>> >>> >
>> >>> >
>> >>> > --
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> --
>> >> Peter
>> >>
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170915/c54e3eb4/attachment.html>


More information about the llvm-dev mailing list