<div dir="ltr"><div dir="ltr">> I am skeptical that users will want to have this behavior by default.<br>> If this behavior is guarded by an option, it might be fine.<br></div><div dir="ltr"><br></div><div>That's a good point. If the reproducer can exceed a few hundred MiB, it is definitely not suitable to enable by default. I agree it is better to guard it behind an option flag such as `--gen-lld-crash-reproducer`.</div><div dir="ltr"><br></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Apr 14, 2021 at 2:40 PM Fangrui Song <<a href="mailto:maskray@google.com">maskray@google.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br>

On 2021-04-14, Haowei Wu via llvm-dev wrote:<br>

>*Background / Motivation*<br>

><br>

>Both clang and lld have the ability to generate a reproducer (an archive<br>

>with input files and invoker script to reproduce the clang/lld build).<br>

>While clang will generate a reproducer archive when a crash happens, lld<br>

>only generates a reproducer when the '--reproduce' flag is explicitly set (this<br>

>is equivalent to Clang's -gen-reproducer flag). This is not very helpful<br>

>for debugging lld bugs, particularly when the crash happens in building big<br>

>projects, since it will be unrealistic to set reproducer flags to generate<br>

>reproducer archives for every lld invocation. This design also causes<br>

>trouble when the crash happens only on bots, as in most cases, developers<br>

>do not have access to the file system of these bots. It would be great to<br>

>improve the lld reproducer generation for easier debugging in these<br>

>scenarios.<br>

><br>

>*Proposal*<br>

><br>

>Given the use cases and status of clang and lld, I think there are 2<br>

>possible solutions.<br>

><br>

>*Extend Clang driver*<br>

>In most cases, lld is invoked by the clang driver instead of being invoked<br>

>by the build system directly. Therefore, the clang driver can be changed to<br>

>re-invoke lld with the '--reproduce' flag when it detects that the lld subprocess<br>

>has crashed.<br>

><br>

>Advantages:<br>

>    * It probably does not require any changes to the lld and might be<br>

>easier than handling the crash directly in lld.<br>

><br>

>Disadvantages:<br>

>    * If there is a race condition in the build system, the<br>

>input files might have changed between the first lld crash and the rerun with<br>

>the '--reproduce' flag. In this case, the generated lld reproducer archive<br>

>might not trigger the crash, making it less useful.<br>

><br>

>*Improve lld reproducer*<br>

>Another way would be to make lld generate a reproducer archive when it<br>

>crashes, just like what clang is doing.<br>

><br>

>Advantages:<br>

>    * It will work whether lld is invoked from Clang or from the build<br>

>system.<br>

>    * It will catch the input file in case the crash is caused by build<br>

>races.<br>

><br>

>Disadvantages:<br>

>    * It might need a lot of work if lld does not already have a<br>

>sophisticated crash handler. It might still need some plumbing changes in<br>

>clang driver so lld can honor the '-fcrash-diagnostics-dir' flag.<br>

><br>

>*Comments?*<br>

>Which approach do you prefer? Feel free to share your opinions.<br>

<br>

There is a resource difference between clang -gen-reproducer /<br>

environment variable "FORCE_CLANG_DIAGNOSTICS_CRASH" and ld.lld --reproduce.<br>

<br>

clang -gen-reproducer produces a source file and a .sh file for a<br>

single translation unit, so the space consumption is low.<br>

ld.lld --reproduce can potentially pack a large list of files, which may<br>

take hundreds of megabytes or several gigabytes.<br>

<br>

I am skeptical that users will want to have this behavior by default.<br>

If this behavior is guarded by an option, it might be fine.<br>

</blockquote></div></div>