[llvm-dev] Building LLVM's fuzzers
    Kostya Serebryany via llvm-dev 
    llvm-dev at lists.llvm.org
       
    Fri Sep  8 22:36:35 PDT 2017
    
    
  
Justin,
Calling appendToUsed has horrible complexity and if we call it in every
function clang
consumes tons of memory (6Gb when compiling one of the clang's source
files).
This killed my machine today :)
The solution is to call appendToUsed once per module, instead of once per
function.
Also, since this does not seem to be required for linux, I've put this
under if TargetTriple.isOSBinFormatMachO
Submitted r312855, I'll see if this breaks Mac (there seem to be no llvm
tests for this, only compiler-rt tests)
but please also check if this looks ok.
But this all still sounds bad on linux at least:
  * with the old bfd linker and -ffunction-sections -Wl,-gc-sections these
arrays get removed (as discussed here)
  * with newer linkers the sanitizer coverage essentially disables
gc-sections
--kcc
On Thu, Aug 24, 2017 at 6:43 PM, Peter Collingbourne <peter at pcc.me.uk>
wrote:
>
>
> On Thu, Aug 24, 2017 at 6:30 PM, Justin Bogner <mail at justinbogner.com>
> wrote:
>
>> Peter Collingbourne <peter at pcc.me.uk> writes:
>> > On Thu, Aug 24, 2017 at 3:38 PM, Kostya Serebryany <kcc at google.com>
>> wrote:
>> >
>> >>
>> >>
>> >> On Thu, Aug 24, 2017 at 3:35 PM, Peter Collingbourne <peter at pcc.me.uk>
>> >> wrote:
>> >>
>> >>> On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <
>> >>> llvm-dev at lists.llvm.org> wrote:
>> >>>
>> >>>>
>> >>>>
>> >>>> On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <
>> mail at justinbogner.com>
>> >>>> wrote:
>> >>>>
>> >>>>> I think the simplest fix is something like this:
>> >>>>>
>> >>>>> diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>> >>>>> b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>> >>>>> index c6f0d17f8fe..e81957ab80a 100644
>> >>>>> --- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>> >>>>> +++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp
>> >>>>> @@ -256,6 +256,7 @@ SanitizerCoverageModule::Creat
>> eSecStartEnd(Module
>> >>>>> &M, const char *Section,
>> >>>>>        new GlobalVariable(M, Ty, false,
>> GlobalVariable::ExternalLinkag
>> >>>>> e,
>> >>>>>                           nullptr, getSectionEnd(Section));
>> >>>>>    SecEnd->setVisibility(GlobalValue::HiddenVisibility);
>> >>>>> +  appendToUsed(M, {SecStart, SecEnd});
>> >>>>>
>> >>>>>    return std::make_pair(SecStart, SecEnd);
>> >>>>>  }
>> >>>>>
>> >>>>> I'm trying it out now.
>> >>>>>
>> >>>>
>> >>>> LGTM (if this works), thanks!
>> >>>>
>> >>>
>> >>> I wouldn't expect that to work because for ELF targets llvm.used has
>> no
>> >>> effect on the object file (only on the optimizer).
>> >>>
>> >>> Is there a simple way to reproduce the link failure?
>> >>>
>> >>
>> >>
>> >> ninja compiler-rt
>> >> echo 'extern "C" int LLVMFuzzerTestOneInput(const unsigned char *a,
>> >> unsigned long b){return 0; } ' > test.cc
>> >> clang -O3 test.cc   -fsanitize=fuzzer # works
>> >> clang -O3 test.cc  -Wl,-gc-sections -fsanitize=fuzzer # fails
>> >>
>> >
>> > It seems that the issue is that older versions of ld.bfd have a bug
>> which
>> > causes it not to define __start_ and __stop_ symbols if the only
>> reference
>> > to those symbols is from a constructor.
>>
>> It looks like this is a different problem from the one on macOS (and I
>> wasn't able to reproduce it with any bfd ld I had available, they were
>> all too new)
>>
>> I've gone ahead and fixed the issue on macOS in r311742.
>>
>> > If I add an artificial reference to the start symbol from libfuzzer's
>> main
>> > function, the program links correctly.
>> >
>> > diff --git a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
>> > b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
>> > index af8657200be2..c41e28e012db 100644
>> > --- a/compiler-rt/lib/fuzzer/FuzzerMain.cpp
>> > +++ b/compiler-rt/lib/fuzzer/FuzzerMain.cpp
>> > @@ -16,6 +16,10 @@ extern "C" {
>> >  int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);
>> >  }  // extern "C"
>> >
>> > +__attribute__((weak)) void nop(void *p) {}
>> > +extern void *__start___sancov_pcs;
>> > +
>> >  int main(int argc, char **argv) {
>> > +  nop(__start___sancov_pcs);
>> >    return fuzzer::FuzzerDriver(&argc, &argv, LLVMFuzzerTestOneInput);
>> >  }
>>
>> If we were to do this, we'd have to guard it appropriately - not all
>> platforms name the __start symbols like this.
>>
>
> Of course. There's also the issue of how to keep the symbols alive in DSOs.
>
> > The problem also goes away if I use "GNU ld (GNU Binutils)
>> > 2.28.51.20170105".
>>
>> 2.27 also doesn't have the issue. I don't know what our minimum version
>> of binutils is, and I'm under the impression most people use gold or lld
>> to link LLVM these days, so it isn't clear to me how big of a problem
>> this is.
>>
>
> For the record, the problem reproduces under 2.24, which is shipped by
> Ubuntu 14.04 LTS, which isn't that old. My view is that if we can find an
> unintrusive enough workaround, we should deploy it (with a comment to
> remove it after N years).
>
> Peter
>
>
>> > Peter
>> >
>> >
>> >
>> >>
>> >>
>> >>
>> >>
>> >>
>> >>>
>> >>> Peter
>> >>>
>> >>>
>> >>>>
>> >>>>>
>> >>>>> Kostya Serebryany <kcc at google.com> writes:
>> >>>>> > With -Wl,-gc-sections I get this:
>> >>>>> > SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]
>> +0x1b):
>> >>>>> > undefined reference to `__start___sancov_pcs'
>> >>>>> > SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]
>> +0x20):
>> >>>>> > undefined reference to `__stop___sancov_pcs'
>> >>>>> >
>> >>>>> >
>> >>>>> >
>> >>>>> > On Thu, Aug 24, 2017 at 3:07 PM, George Karpenkov <
>> >>>>> ekarpenkov at apple.com>
>> >>>>> > wrote:
>> >>>>> >
>> >>>>> >>
>> >>>>> >> On Aug 24, 2017, at 2:55 PM, Kostya Serebryany <kcc at google.com>
>> >>>>> wrote:
>> >>>>> >>
>> >>>>> >> Interesting.
>> >>>>> >> This is a relatively new addition (fsanitize-coverage=pc-tables,
>> >>>>> which is
>> >>>>> >> now a part of -fsanitize=fuzzer).
>> >>>>> >> The tests worked (did they? On Mac?) so I thought everything is
>> ok.
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> For tests we never compile the tested target with -O3 (and that
>> >>>>> wouldn’t
>> >>>>> >> be sufficient),
>> >>>>> >> and for testing fuzzers I was always building them in debug
>> >>>>> >>
>> >>>>> >> Yea, we need to make sure the pc-tables are not stripped (this
>> is a
>> >>>>> >> separate section with globals).
>> >>>>> >> (I still haven't documented pc-tables, will do soon)
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> Do you know what's the analog of Wl,-dead_strip on Linux?
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> Apparently -Wl,—gc-sections.
>> >>>>> >> For some reason LLVM does not do it for gold, even though it
>> seems to
>> >>>>> >> support this flag as well.
>> >>>>> >> (that could be another reason why you don’t see the failure on
>> Linux)
>> >>>>> >>
>> >>>>> >>  1 *if*(NOT LLVM_NO_DEAD_STRIP)
>> >>>>> >>  2   *if*(${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
>> >>>>> >>  3     # ld64's implementation of -dead_strip breaks tools that
>> use
>> >>>>> >> plugins.
>> >>>>> >>  4     set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> >>>>> >>  5                  LINK_FLAGS " -Wl,-dead_strip")
>> >>>>> >>  6   *elseif*(${CMAKE_SYSTEM_NAME} MATCHES "SunOS")
>> >>>>> >>  7     set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> >>>>> >>  8                  LINK_FLAGS " -Wl,-z
>> -Wl,discard-unused=sections")
>> >>>>> >>  9   *elseif*(NOT WIN32 AND NOT LLVM_LINKER_IS_GOLD)
>> >>>>> >> 10     # Object files are compiled with -ffunction-data-sections.
>> >>>>> >> 11     # Versions of bfd ld < 2.23.1 have a bug in --gc-sections
>> that
>> >>>>> >> breaks
>> >>>>> >> 12     # tools that use plugins. Always pass --gc-sections once
>> we
>> >>>>> require
>> >>>>> >> 13     # a newer linker.
>> >>>>> >> 14     set_property(TARGET ${target_name} APPEND_STRING PROPERTY
>> >>>>> >> 15                  LINK_FLAGS " -Wl,--gc-sections")
>> >>>>> >> 16   *endif*()
>> >>>>> >> 17 *endif*()
>> >>>>> >>
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> --kcc
>> >>>>> >>
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> On Thu, Aug 24, 2017 at 2:49 PM, Justin Bogner <
>> >>>>> mail at justinbogner.com>
>> >>>>> >> wrote:
>> >>>>> >>
>> >>>>> >>> George Karpenkov <ekarpenkov at apple.com> writes:
>> >>>>> >>> > OK so with Kuba’s help I’ve found the error: with
>> optimization,
>> >>>>> dead
>> >>>>> >>> > stripping of produced libraries is enabled,
>> >>>>> >>> > which removes coverage instrumentation.
>> >>>>> >>> >
>> >>>>> >>> > However, this has nothing to do with the move to compiler-rt,
>> so
>> >>>>> I’m
>> >>>>> >>> > quite skeptical on whether it has worked
>> >>>>> >>> > beforehand.
>> >>>>> >>> >
>> >>>>> >>> > A trivial fix is to do:
>> >>>>> >>> >
>> >>>>> >>> > diff --git a/cmake/modules/HandleLLVMOptions.cmake
>> >>>>> >>> b/cmake/modules/HandleLLVMOptions.cmake
>> >>>>> >>> > index 04596a6ff63..5465d8d95ba 100644
>> >>>>> >>> > --- a/cmake/modules/HandleLLVMOptions.cmake
>> >>>>> >>> > +++ b/cmake/modules/HandleLLVMOptions.cmake
>> >>>>> >>> > @@ -665,6 +665,9 @@ if(LLVM_USE_SANITIZER)
>> >>>>> >>> >    endif()
>> >>>>> >>> >    if (LLVM_USE_SANITIZE_COVERAGE)
>> >>>>> >>> >      append("-fsanitize=fuzzer-no-link" CMAKE_C_FLAGS
>> >>>>> CMAKE_CXX_FLAGS)
>> >>>>> >>> > +
>> >>>>> >>> > +    # Dead stripping messes up coverage instrumentation.
>> >>>>> >>> > +    set(LLVM_NO_DEAD_STRIP ON)
>> >>>>> >>> >    endif()
>> >>>>> >>> >  endif()
>> >>>>> >>> >
>> >>>>> >>> > Any arguments against that?
>> >>>>> >>>
>> >>>>> >>> We shouldn't do this. We really only want to prevent dead
>> stripping
>> >>>>> of
>> >>>>> >>> the counters themselves - disabling it completely isn't very
>> nice.
>> >>>>> >>>
>> >>>>> >>> > Apparently, a better way is to follow ASAN instrumentation
>> pass,
>> >>>>> >>> > which uses some magic to protect against dead-stripping.
>> >>>>> >>>
>> >>>>> >>> I thought this was already being done - how else did it work
>> before?
>> >>>>> >>>
>> >>>>> >>> >> On Aug 24, 2017, at 11:29 AM, Justin Bogner <
>> >>>>> mail at justinbogner.com>
>> >>>>> >>> wrote:
>> >>>>> >>> >>
>> >>>>> >>> >> (kcc, george: sorry for the re-send, the first was from a
>> >>>>> non-list
>> >>>>> >>> email
>> >>>>> >>> >> address)
>> >>>>> >>> >>
>> >>>>> >>> >> My configuration for building the fuzzers in the LLVM tree
>> >>>>> doesn't
>> >>>>> >>> seem to
>> >>>>> >>> >> work any more (possibly as of moving libFuzzer to
>> compiler-rt,
>> >>>>> but
>> >>>>> >>> there
>> >>>>> >>> >> have been a few other changes in the last week or so that
>> may be
>> >>>>> >>> related).
>> >>>>> >>> >>
>> >>>>> >>> >> I'm building with a fresh top-of-tree clang and setting
>> >>>>> >>> >> -DLLVM_USE_SANITIZER=Address and
>> -DLLVM_USE_SANITIZE_COVERAGE=O
>> >>>>> n,
>> >>>>> >>> which
>> >>>>> >>> >> was working before:
>> >>>>> >>> >>
>> >>>>> >>> >>  % cmake -GNinja \
>> >>>>> >>> >>          -DCMAKE_BUILD_TYPE=Release
>> -DLLVM_ENABLE_ASSERTIONS=On \
>> >>>>> >>> >>          -DLLVM_ENABLE_WERROR=On \
>> >>>>> >>> >>          -DLLVM_USE_SANITIZER=Address
>> >>>>> -DLLVM_USE_SANITIZE_COVERAGE=On
>> >>>>> >>> \
>> >>>>> >>> >>          -DCMAKE_C_COMPILER=$HOME/llvm-lkgc/bin/clang \
>> >>>>> >>> >>          $HOME/code/llvm-src
>> >>>>> >>> >>
>> >>>>> >>> >> But when I run any of the fuzzers, it looks like the
>> sanitizer
>> >>>>> coverage
>> >>>>> >>> >> hasn't been set up correctly:
>> >>>>> >>> >>
>> >>>>> >>> >>  % ./bin/llvm-as-fuzzer
>> >>>>> >>>                                    2017-08-24 11:14:33
>> >>>>> >>> >>  INFO: Seed: 4089166883 <(408)%20916-6883>
>> >>>>> >>> >>  INFO: Loaded 1 modules   (50607 guards): 50607 [0x10e14ef80,
>> >>>>> >>> 0x10e18063c),
>> >>>>> >>> >>  INFO: Loaded 1 PC tables (0 PCs): 0
>> [0x10e2870a8,0x10e2870a8),
>> >>>>> >>> >>  ERROR: The size of coverage PC tables does not match the
>> number
>> >>>>> of
>> >>>>> >>> instrumented PCs. This might be a bug in the compiler, please
>> >>>>> contact the
>> >>>>> >>> libFuzzer developers.
>> >>>>> >>> >>
>> >>>>> >>> >> From the build logs, it looks like we're now building objects
>> >>>>> with
>> >>>>> >>> these
>> >>>>> >>> >> sanitizer flags:
>> >>>>> >>> >>
>> >>>>> >>> >>  -fsanitize=address
>> >>>>> >>> >>  -fsanitize-address-use-after-scope
>> >>>>> >>> >>  -fsanitize=fuzzer-no-link
>> >>>>> >>> >>
>> >>>>> >>> >> We're then linking the fuzzer binaries with these:
>> >>>>> >>> >>
>> >>>>> >>> >>  -fsanitize=address
>> >>>>> >>> >>  -fsanitize-address-use-after-scope
>> >>>>> >>> >>  -fsanitize=fuzzer-no-link
>> >>>>> >>> >>  -fsanitize=fuzzer
>> >>>>> >>> >>
>> >>>>> >>> >> Any idea what's wrong or where to start looking?
>> >>>>> >>>
>> >>>>> >>
>> >>>>> >>
>> >>>>> >>
>> >>>>>
>> >>>>
>> >>>>
>> >>>> _______________________________________________
>> >>>> LLVM Developers mailing list
>> >>>> llvm-dev at lists.llvm.org
>> >>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >>>>
>> >>>>
>> >>>
>> >>>
>> >>> --
>> >>> --
>> >>> Peter
>> >>>
>> >>
>> >>
>> >
>> >
>> > --
>>
>
>
>
> --
> --
> Peter
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170908/b5deaf01/attachment.html>
    
    
More information about the llvm-dev
mailing list