<div dir="ltr"><div>Justin, </div>Calling <span style="color:rgb(0,0,0)">appendToUsed has horrible complexity, and if we call it for every function, clang</span><div><span style="color:rgb(0,0,0)">consumes tons of memory (6 GB when compiling one of clang's source files).</span><div><span style="color:rgb(0,0,0)">This killed my machine today :) </span></div></div><div><span style="color:rgb(0,0,0)"><br></span></div><div><span style="color:rgb(0,0,0)">The solution is to call </span><span style="color:rgb(0,0,0)">appendToUsed once per module, instead of once per function. </span></div><div><span style="color:rgb(0,0,0)">Also, since this does not seem to be required for Linux, I've put this under if </span><span style="color:rgb(0,0,0)">TargetTriple.isOSBinFormatMachO</span></div><div><span style="color:rgb(0,0,0)">Submitted as r</span><span style="color:rgb(0,0,0)">312855. I'll see if this breaks Mac (there seem to be no LLVM tests for this, only compiler-rt tests),</span></div><div><span style="color:rgb(0,0,0)">but please also check if this looks ok. 
</span></div><div><span style="color:rgb(0,0,0)"><br></span></div><div><span style="color:rgb(0,0,0)">But this all still sounds bad </span><span style="color:rgb(0,0,0)">on Linux at least:</span></div><div><span style="color:rgb(0,0,0)"> * with the old bfd linker and </span><span style="color:rgb(0,0,0)">-ffunction-sections -Wl,-gc-sections, these arrays get removed (as discussed here) </span><span style="color:rgb(0,0,0)"> </span></div><div><span style="color:rgb(0,0,0)"> * with newer linkers, the sanitizer coverage essentially disables gc-sections </span></div><div><span style="color:rgb(0,0,0)"><br></span></div><div><span style="color:rgb(0,0,0)">--kcc </span></div><div><span style="color:rgb(0,0,0)"><br></span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Aug 24, 2017 at 6:43 PM, Peter Collingbourne <span dir="ltr"><<a href="mailto:peter@pcc.me.uk" target="_blank">peter@pcc.me.uk</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="h5">On Thu, Aug 24, 2017 at 6:30 PM, Justin Bogner <span dir="ltr"><<a href="mailto:mail@justinbogner.com" target="_blank">mail@justinbogner.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="m_5619763847933861810HOEnZb"><div class="m_5619763847933861810h5">Peter Collingbourne <<a href="mailto:peter@pcc.me.uk" target="_blank">peter@pcc.me.uk</a>> writes:<br>
> On Thu, Aug 24, 2017 at 3:38 PM, Kostya Serebryany <<a href="mailto:kcc@google.com" target="_blank">kcc@google.com</a>> wrote:<br>
><br>
>><br>
>><br>
>> On Thu, Aug 24, 2017 at 3:35 PM, Peter Collingbourne <<a href="mailto:peter@pcc.me.uk" target="_blank">peter@pcc.me.uk</a>><br>
>> wrote:<br>
>><br>
>>> On Thu, Aug 24, 2017 at 3:21 PM, Kostya Serebryany via llvm-dev <<br>
>>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:<br>
>>><br>
>>>><br>
>>>><br>
>>>> On Thu, Aug 24, 2017 at 3:20 PM, Justin Bogner <<a href="mailto:mail@justinbogner.com" target="_blank">mail@justinbogner.com</a>><br>
>>>> wrote:<br>
>>>><br>
>>>>> I think the simplest fix is something like this:<br>
>>>>><br>
>>>>> diff --git a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp<br>
>>>>> b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp<br>
>>>>> index c6f0d17f8fe..e81957ab80a 100644<br>
>>>>> --- a/lib/Transforms/Instrumentation/SanitizerCoverage.cpp<br>
>>>>> +++ b/lib/Transforms/Instrumentation/SanitizerCoverage.cpp<br>
>>>>> @@ -256,6 +256,7 @@ SanitizerCoverageModule::CreateSecStartEnd(Module &M, const char *Section,<br>
>>>>> new GlobalVariable(M, Ty, false, GlobalVariable::ExternalLinkage,<br>
>>>>> nullptr, getSectionEnd(Section));<br>
>>>>> SecEnd->setVisibility(GlobalValue::HiddenVisibility);<br>
>>>>> + appendToUsed(M, {SecStart, SecEnd});<br>
>>>>><br>
>>>>> return std::make_pair(SecStart, SecEnd);<br>
>>>>> }<br>
>>>>><br>
>>>>> I'm trying it out now.<br>
>>>>><br>
>>>><br>
>>>> LGTM (if this works), thanks!<br>
>>>><br>
>>><br>
>>> I wouldn't expect that to work because for ELF targets llvm.used has no<br>
>>> effect on the object file (only on the optimizer).<br>
>>><br>
>>> Is there a simple way to reproduce the link failure?<br>
>>><br>
>><br>
>><br>
>> ninja compiler-rt<br>
>> echo 'extern "C" int LLVMFuzzerTestOneInput(const unsigned char *a,<br>
>> unsigned long b){return 0; } ' > test.cc<br>
>> clang -O3 test.cc -fsanitize=fuzzer # works<br>
>> clang -O3 test.cc -Wl,-gc-sections -fsanitize=fuzzer # fails<br>
>><br>
><br>
> It seems that the issue is that older versions of ld.bfd have a bug which<br>
> causes it not to define __start_ and __stop_ symbols if the only reference<br>
> to those symbols is from a constructor.<br>
<br>
</div></div>It looks like this is a different problem from the one on macOS (and I<br>
wasn't able to reproduce it with any bfd ld I had available, they were<br>
all too new)<br>
<br>
I've gone ahead and fixed the issue on macOS in r311742.<br>
<span><br>
> If I add an artificial reference to the start symbol from libfuzzer's main<br>
> function, the program links correctly.<br>
><br>
> diff --git a/compiler-rt/lib/fuzzer/FuzzerMain.cpp b/compiler-rt/lib/fuzzer/FuzzerMain.cpp<br>
> index af8657200be2..c41e28e012db 100644<br>
> --- a/compiler-rt/lib/fuzzer/FuzzerMain.cpp<br>
> +++ b/compiler-rt/lib/fuzzer/FuzzerMain.cpp<br>
> @@ -16,6 +16,10 @@ extern "C" {<br>
> int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);<br>
> } // extern "C"<br>
><br>
> +__attribute__((weak)) void nop(void *p) {}<br>
> +extern void *__start___sancov_pcs;<br>
> +<br>
> int main(int argc, char **argv) {<br>
> + nop(__start___sancov_pcs);<br>
> return fuzzer::FuzzerDriver(&argc, &argv, LLVMFuzzerTestOneInput);<br>
> }<br>
<br>
</span>If we were to do this, we'd have to guard it appropriately - not all<br>
platforms name the __start symbols like this.<br></blockquote><div><br></div></div></div><div>Of course. There's also the issue of how to keep the symbols alive in DSOs.</div><span class=""><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
> The problem also goes away if I use "GNU ld (GNU Binutils)<br>
> 2.28.51.20170105".<br>
<br>
</span>2.27 also doesn't have the issue. I don't know what our minimum version<br>
of binutils is, and I'm under the impression most people use gold or lld<br>
to link LLVM these days, so it isn't clear to me how big of a problem<br>
this is.<br></blockquote><div><br></div></span><div>For the record, the problem reproduces under 2.24, which is shipped by Ubuntu 14.04 LTS, which isn't that old. My view is that if we can find an unintrusive enough workaround, we should deploy it (with a comment to remove it after N years).</div><div><br></div><div>Peter</div><div><div class="h5"><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div class="m_5619763847933861810h5"><br>
> Peter<br>
><br>
><br>
><br>
>><br>
>><br>
>><br>
>><br>
>><br>
>>><br>
>>> Peter<br>
>>><br>
>>><br>
>>>><br>
>>>>><br>
>>>>> Kostya Serebryany <<a href="mailto:kcc@google.com" target="_blank">kcc@google.com</a>> writes:<br>
>>>>> > With -Wl,-gc-sections I get this:<br>
>>>>> > SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x1b):<br>
>>>>> > undefined reference to `__start___sancov_pcs'<br>
>>>>> > SimpleTest.cpp:(.text.sancov.module_ctor[sancov.module_ctor]+0x20):<br>
>>>>> > undefined reference to `__stop___sancov_pcs'<br>
>>>>> ><br>
>>>>> ><br>
>>>>> ><br>
>>>>> > On Thu, Aug 24, 2017 at 3:07 PM, George Karpenkov <<br>
>>>>> <a href="mailto:ekarpenkov@apple.com" target="_blank">ekarpenkov@apple.com</a>><br>
>>>>> > wrote:<br>
>>>>> ><br>
>>>>> >><br>
>>>>> >> On Aug 24, 2017, at 2:55 PM, Kostya Serebryany <<a href="mailto:kcc@google.com" target="_blank">kcc@google.com</a>><br>
>>>>> wrote:<br>
>>>>> >><br>
>>>>> >> Interesting.<br>
>>>>> >> This is a relatively new addition (-fsanitize-coverage=pc-tables, which is<br>
>>>>> >> now a part of -fsanitize=fuzzer).<br>
>>>>> >> The tests worked (did they? On Mac?) so I thought everything is ok.<br>
>>>>> >><br>
>>>>> >><br>
>>>>> >> For tests we never compile the tested target with -O3 (and that<br>
>>>>> wouldn’t<br>
>>>>> >> be sufficient),<br>
>>>>> >> and for testing fuzzers I was always building them in debug<br>
>>>>> >><br>
>>>>> >> Yea, we need to make sure the pc-tables are not stripped (this is a<br>
>>>>> >> separate section with globals).<br>
>>>>> >> (I still haven't documented pc-tables, will do soon)<br>
>>>>> >><br>
>>>>> >><br>
>>>>> >> Do you know what the analog of -Wl,-dead_strip is on Linux?<br>
>>>>> >><br>
>>>>> >><br>
>>>>> >> Apparently -Wl,--gc-sections.<br>
>>>>> >> For some reason LLVM does not do it for gold, even though it seems to<br>
>>>>> >> support this flag as well.<br>
>>>>> >> (that could be another reason why you don’t see the failure on Linux)<br>
>>>>> >><br>
>>>>> >> if(NOT LLVM_NO_DEAD_STRIP)<br>
>>>>> >> if(${CMAKE_SYSTEM_NAME} MATCHES "Darwin")<br>
>>>>> >> # ld64's implementation of -dead_strip breaks tools that use plugins.<br>
>>>>> >> set_property(TARGET ${target_name} APPEND_STRING PROPERTY<br>
>>>>> >> LINK_FLAGS " -Wl,-dead_strip")<br>
>>>>> >> elseif(${CMAKE_SYSTEM_NAME} MATCHES "SunOS")<br>
>>>>> >> set_property(TARGET ${target_name} APPEND_STRING PROPERTY<br>
>>>>> >> LINK_FLAGS " -Wl,-z -Wl,discard-unused=sections")<br>
>>>>> >> elseif(NOT WIN32 AND NOT LLVM_LINKER_IS_GOLD)<br>
>>>>> >> # Object files are compiled with -ffunction-data-sections.<br>
>>>>> >> # Versions of bfd ld < 2.23.1 have a bug in --gc-sections that breaks<br>
>>>>> >> # tools that use plugins. Always pass --gc-sections once we require<br>
>>>>> >> # a newer linker.<br>
>>>>> >> set_property(TARGET ${target_name} APPEND_STRING PROPERTY<br>
>>>>> >> LINK_FLAGS " -Wl,--gc-sections")<br>
>>>>> >> endif()<br>
>>>>> >> endif()<br>
>>>>> >><br>
>>>>> >><br>
>>>>> >><br>
>>>>> >> --kcc<br>
>>>>> >><br>
>>>>> >><br>
>>>>> >><br>
>>>>> >> On Thu, Aug 24, 2017 at 2:49 PM, Justin Bogner <<br>
>>>>> <a href="mailto:mail@justinbogner.com" target="_blank">mail@justinbogner.com</a>><br>
>>>>> >> wrote:<br>
>>>>> >><br>
>>>>> >>> George Karpenkov <<a href="mailto:ekarpenkov@apple.com" target="_blank">ekarpenkov@apple.com</a>> writes:<br>
>>>>> >>> > OK so with Kuba’s help I’ve found the error: with optimization,<br>
>>>>> dead<br>
>>>>> >>> > stripping of produced libraries is enabled,<br>
>>>>> >>> > which removes coverage instrumentation.<br>
>>>>> >>> ><br>
>>>>> >>> > However, this has nothing to do with the move to compiler-rt, so<br>
>>>>> I’m<br>
>>>>> >>> > quite skeptical on whether it has worked<br>
>>>>> >>> > beforehand.<br>
>>>>> >>> ><br>
>>>>> >>> > A trivial fix is to do:<br>
>>>>> >>> ><br>
>>>>> >>> > diff --git a/cmake/modules/HandleLLVMOptions.cmake b/cmake/modules/HandleLLVMOptions.cmake<br>
>>>>> >>> > index 04596a6ff63..5465d8d95ba 100644<br>
>>>>> >>> > --- a/cmake/modules/HandleLLVMOptions.cmake<br>
>>>>> >>> > +++ b/cmake/modules/HandleLLVMOptions.cmake<br>
>>>>> >>> > @@ -665,6 +665,9 @@ if(LLVM_USE_SANITIZER)<br>
>>>>> >>> > endif()<br>
>>>>> >>> > if (LLVM_USE_SANITIZE_COVERAGE)<br>
>>>>> >>> > append("-fsanitize=fuzzer-no-link" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)<br>
>>>>> >>> > +<br>
>>>>> >>> > + # Dead stripping messes up coverage instrumentation.<br>
>>>>> >>> > + set(LLVM_NO_DEAD_STRIP ON)<br>
>>>>> >>> > endif()<br>
>>>>> >>> > endif()<br>
>>>>> >>> ><br>
>>>>> >>> > Any arguments against that?<br>
>>>>> >>><br>
>>>>> >>> We shouldn't do this. We really only want to prevent dead stripping<br>
>>>>> of<br>
>>>>> >>> the counters themselves - disabling it completely isn't very nice.<br>
>>>>> >>><br>
>>>>> >>> > Apparently, a better way is to follow ASAN instrumentation pass,<br>
>>>>> >>> > which uses some magic to protect against dead-stripping.<br>
>>>>> >>><br>
>>>>> >>> I thought this was already being done - how else did it work before?<br>
>>>>> >>><br>
>>>>> >>> >> On Aug 24, 2017, at 11:29 AM, Justin Bogner <<br>
>>>>> <a href="mailto:mail@justinbogner.com" target="_blank">mail@justinbogner.com</a>><br>
>>>>> >>> wrote:<br>
>>>>> >>> >><br>
>>>>> >>> >> (kcc, george: sorry for the re-send, the first was from a<br>
>>>>> non-list<br>
>>>>> >>> email<br>
>>>>> >>> >> address)<br>
>>>>> >>> >><br>
>>>>> >>> >> My configuration for building the fuzzers in the LLVM tree<br>
>>>>> doesn't<br>
>>>>> >>> seem to<br>
>>>>> >>> >> work any more (possibly as of moving libFuzzer to compiler-rt,<br>
>>>>> but<br>
>>>>> >>> there<br>
>>>>> >>> >> have been a few other changes in the last week or so that may be<br>
>>>>> >>> related).<br>
>>>>> >>> >><br>
>>>>> >>> >> I'm building with a fresh top-of-tree clang and setting<br>
>>>>> >>> >> -DLLVM_USE_SANITIZER=Address and -DLLVM_USE_SANITIZE_COVERAGE=On,<br>
>>>>> >>> which<br>
>>>>> >>> >> was working before:<br>
>>>>> >>> >><br>
>>>>> >>> >> % cmake -GNinja \<br>
>>>>> >>> >> -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=On \<br>
>>>>> >>> >> -DLLVM_ENABLE_WERROR=On \<br>
>>>>> >>> >> -DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On \<br>
>>>>> >>> >> -DCMAKE_C_COMPILER=$HOME/llvm-lkgc/bin/clang \<br>
>>>>> >>> >> $HOME/code/llvm-src<br>
>>>>> >>> >><br>
>>>>> >>> >> But when I run any of the fuzzers, it looks like the sanitizer<br>
>>>>> coverage<br>
>>>>> >>> >> hasn't been set up correctly:<br>
>>>>> >>> >><br>
>>>>> >>> >> % ./bin/llvm-as-fuzzer<br>
>>>>> >>> 2017-08-24 11:14:33<br>
</div></div>>>>>> >>> >> INFO: Seed: 4089166883<br>
<div class="m_5619763847933861810HOEnZb"><div class="m_5619763847933861810h5">>>>>> >>> >> INFO: Loaded 1 modules (50607 guards): 50607 [0x10e14ef80, 0x10e18063c),<br>
>>>>> >>> >> INFO: Loaded 1 PC tables (0 PCs): 0 [0x10e2870a8,0x10e2870a8),<br>
>>>>> >>> >> ERROR: The size of coverage PC tables does not match the number of instrumented PCs. This might be a bug in the compiler, please contact the libFuzzer developers.<br>
>>>>> >>> >><br>
>>>>> >>> >> From the build logs, it looks like we're now building objects<br>
>>>>> with<br>
>>>>> >>> these<br>
>>>>> >>> >> sanitizer flags:<br>
>>>>> >>> >><br>
>>>>> >>> >> -fsanitize=address<br>
>>>>> >>> >> -fsanitize-address-use-after-scope<br>
>>>>> >>> >> -fsanitize=fuzzer-no-link<br>
>>>>> >>> >><br>
>>>>> >>> >> We're then linking the fuzzer binaries with these:<br>
>>>>> >>> >><br>
>>>>> >>> >> -fsanitize=address<br>
>>>>> >>> >> -fsanitize-address-use-after-scope<br>
>>>>> >>> >> -fsanitize=fuzzer-no-link<br>
>>>>> >>> >> -fsanitize=fuzzer<br>
>>>>> >>> >><br>
>>>>> >>> >> Any idea what's wrong or where to start looking?<br>
>>>>> >>><br>
>>>>> >><br>
>>>>> >><br>
>>>>> >><br>
>>>>><br>
>>>><br>
>>>><br>
>>>> ______________________________<wbr>_________________<br>
>>>> LLVM Developers mailing list<br>
>>>> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
>>>> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>
>>>><br>
>>>><br>
>>><br>
>>><br>
>>> --<br>
>>> --<br>
>>> Peter<br>
>>><br>
>><br>
>><br>
><br>
><br>
> --<br>
</div></div></blockquote></div></div></div><span class="HOEnZb"><font color="#888888"><br><br clear="all"><div><br></div>-- <br><div class="m_5619763847933861810gmail_signature" data-smartmail="gmail_signature"><div dir="ltr">-- <div>Peter</div></div></div>
</font></span></div></div>
</blockquote></div><br></div>