[llvm-dev] [RFC] Generating LLD reproducers on crashes

Fangrui Song via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 14 14:40:31 PDT 2021


On 2021-04-14, Haowei Wu via llvm-dev wrote:
>*Background / Motivation*
>
>Both clang and lld have the ability to generate a reproducer (an archive
>with input files and invoker script to reproduce the clang/lld build).
>While clang generates a reproducer archive when a crash happens, lld only
>generates a reproducer when the '--reproduce' flag is explicitly set (this
>is equivalent to Clang's -gen-reproducer flag). This is not very helpful
>for debugging lld bugs, particularly when the crash happens while building
>large projects, since it is unrealistic to set reproducer flags to generate
>reproducer archives for every lld invocation. This design also causes
>trouble when the crash happens only on bots, as in most cases developers
>do not have access to the file system of these bots. It would be great to
>improve the lld reproducer generation for easier debugging in these
>scenarios.
>
>*Proposal*
>
>Given the use cases and status of clang and lld, I think there are 2
>possible solutions.
>
>*Extend Clang driver*
>In most cases, lld is invoked by the clang driver instead of being invoked
>by the build system directly. Therefore, the clang driver can be changed to
>re-invoke lld with the '--reproduce' flag when it detects that the lld
>subprocess has crashed.
>
>Advantages:
>    * It probably does not require any changes to lld and might be
>easier than handling the crash directly in lld.
>
>Disadvantages:
>    * If there is a race condition in the build system, the input files
>might have changed between the first lld crash and the second lld run with
>the '--reproduce' flag. In this case, the generated lld reproducer archive
>might not be able to trigger the crash, making it less useful.
>
>*Improve lld reproducer*
>Another way would be to make lld generate a reproducer archive when it
>crashes, just like what clang is doing.
>
>Advantages:
>    * It will work regardless of whether lld is invoked from Clang or from
>the build system.
>    * It will capture the input files even when the crash is caused by
>build races.
>
>Disadvantages:
>    * It might need a lot of work if lld does not already have a
>sophisticated crash handler. It might still need some plumbing changes in
>the clang driver so lld can honor the '-fcrash-diagnostic-dir' flag.
>
>*Comments?*
>Which approach do you prefer? Feel free to share your opinions.

There is a difference in resource usage between clang -gen-reproducer (or
the environment variable FORCE_CLANG_DIAGNOSTICS_CRASH) and ld.lld --reproduce.

clang -gen-reproducer produces a source file and a .sh file for a single
translation unit, so the space consumption is low.
ld.lld --reproduce can potentially pack a large list of files, which may
take hundreds of megabytes or several gigabytes.
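
To get a feel for the size concern, here is a rough Python sketch that sums
the sizes of the files named on a link command line. The function name
estimate_input_bytes is made up for illustration. It is only a lower bound:
it does not expand response files or resolve -l libraries, and it assumes
the output is given as a separate "-o <file>" pair.

```python
import os

def estimate_input_bytes(link_args):
    """Sum the sizes of arguments that name existing regular files.

    Hypothetical helper: a lower bound on what a reproducer archive
    would have to pack, not what ld.lld actually computes.
    """
    total = 0
    seen = set()
    skip_next = False
    for arg in link_args:
        if skip_next:
            # This is the output path that followed "-o"; not an input.
            skip_next = False
            continue
        if arg == "-o":
            skip_next = True
            continue
        path = os.path.realpath(arg)
        if path in seen or not os.path.isfile(path):
            # Skip duplicates, flags, and anything that is not a file.
            continue
        seen.add(path)
        total += os.path.getsize(path)
    return total
```

Calling it with the arguments of a large link (everything after the linker
name) gives an idea of how much disk space a reproducer would need at minimum.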

I am skeptical that users will want to have this behavior by default.
If this behavior is guarded by an option, it might be fine.
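
As an illustration of what "guarded by an option" could look like for the
rerun approach, here is a minimal Python sketch of a wrapper around the link
step. It is not how the clang driver works today. The environment variable
LINK_CRASH_REPRO_PATH and the function name are hypothetical; the only real
linker flag used is --reproduce=<path>. It assumes a crash shows up as
termination by a signal (a negative return code from subprocess on POSIX).

```python
import os
import subprocess

def link_with_optional_reproducer(cmd, env=None, run=subprocess.run):
    """Run the link; on a crash, rerun with --reproduce only if opted in."""
    env = os.environ if env is None else env
    result = run(cmd)
    # On POSIX, subprocess reports death by signal N as returncode -N.
    crashed = result.returncode < 0
    # Hypothetical opt-in: where to write the reproducer tar file.
    repro_path = env.get("LINK_CRASH_REPRO_PATH")
    if crashed and repro_path:
        # The rerun may see different inputs if the build is racy.
        run(list(cmd) + ["--reproduce=" + repro_path])
    return result.returncode
```

Without the variable set, nothing extra is written, so the default disk
usage stays the same as today. An ordinary link error (positive exit code)
does not trigger a rerun either.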

