[llvm-dev] [RFC] Generating LLD reproducers on crashes

Fāng-ruì Sòng via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 14 15:39:30 PDT 2021


On Wed, Apr 14, 2021 at 3:27 PM Haowei Wu <haowei at google.com> wrote:
>
> > I am skeptical that users will want to have this behavior by default.
> > If this behavior is guarded by an option, it might be fine.
>
> That's a good point. If the reproducer will be more than a few hundred MiB, it is definitely not suitable to be enabled by default. I agree it's better to guard it with an option flag such as `--gen-lld-crash-reproducer`.
>
> On Wed, Apr 14, 2021 at 2:40 PM Fangrui Song <maskray at google.com> wrote:
>>
>>
>> On 2021-04-14, Haowei Wu via llvm-dev wrote:
>> >*Background / Motivation*
>> >
>> >Both clang and lld have the ability to generate a reproducer (an archive
>> >with input files and invoker script to reproduce the clang/lld build).
>> >While clang will generate a reproducer archive when a crash happens, lld
>> >only generates a reproducer when '--reproduce' flag is explicitly set (this
>> >is equivalent to Clang's -gen-reproducer flag). This is not very helpful
>> >for debugging lld bugs, particularly when the crash happens in building big
>> >projects, since it will be unrealistic to set reproducer flags to generate
>> >reproducer archives for every lld invocation. This design also causes
>> >trouble when the crash happens only on bots, as in most cases, developers
>> >do not have access to the file system of these bots. It would be great to
>> >improve the lld reproducer generation for easier debugging in these
>> >scenarios.
>> >
>> >*Proposal*
>> >
>> >Given the use cases and status of clang and lld, I think there are 2
>> >possible solutions.
>> >
>> >*Extend Clang driver*
>> >In most cases, lld is invoked by the clang driver instead of being invoked
>> >by the build system directly. Therefore, the clang driver can be changed to
>> >re-invoke lld with the '--reproduce' flag when it detects that the lld
>> >subprocess has crashed.
>> >
>> >Advantages:
>> >    * It probably does not require any changes to the lld and might be
>> >easier than handling the crash directly in lld.
>> >
>> >Disadvantages:
>> >    * If there is a race condition in the build system, the input
>> >files might have changed between the 1st lld crash and the 2nd lld rerun
>> >with the '--reproduce' flag. In this case, the generated lld reproducer
>> >archive might not be able to trigger a crash, making it less useful.
>> >
>> >*Improve lld reproducer*
>> >Another way would be to make lld generate a reproducer archive when it
>> >crashes, just like what clang is doing.
>> >
>> >Advantages:
>> >    * It will work no matter if lld is invoked from Clang or from the build
>> >system.
>> >    * It will catch the input file in case the crash is caused by build
>> >races.
>> >
>> >Disadvantages:
>> >    * It might need a lot of work if lld does not already have a
>> >sophisticated crash handler. It might still need some plumbing changes in
>> >clang driver so lld can honor the '-fcrash-diagnostic-dir' flag.
>> >
>> >*Comments?*
>> >Which approach do you prefer? Feel free to share your opinions.
>>
>> There is a resource difference between clang -gen-reproducer /
>> environment variable "FORCE_CLANG_DIAGNOSTICS_CRASH" and ld.lld --reproduce.
>>
>> clang -gen-reproducer produces a source file and a .sh file for one
>> single translation unit, so the space consumption is low.
>> ld.lld --reproduce can potentially pack a large list of files, which may
>> take hundreds of megabytes or several gigabytes.
>>
>> I am skeptical that users will want to have this behavior by default.
>> If this behavior is guarded by an option, it might be fine.

I'll retract my words about an option. This behavior looks like it
needs a fair amount of customization and is build-system dependent.
You can replace the proposed option with a shell script wrapper, which
is more convenient than implementing a restartable action in the
clang driver.

When dealing with linker problems (I doubt there are many nowadays,
and those that remain are mostly LTO problems), I usually change the
compiler/linker options a bit. If you do this, you would only specify
the proposed option after all of that has been done, and at that point
it is only a very small extra step to invoke the link again with
-Wl,--reproduce.
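
To make the wrapper idea concrete, here is a minimal sketch in Python
rather than shell. It is not an existing tool. It assumes that a linker
crash shows up as death by signal, and the environment variable names
LLD_REAL and LLD_REPRO_DIR are hypothetical names invented for this
example. It also inherits the race caveat from the proposal: the inputs
may have changed by the time of the second run.

```python
#!/usr/bin/env python3
"""Linker wrapper sketch: rerun a crashed link with --reproduce."""
import os
import subprocess
import sys
import tempfile


def link_with_reproducer(linker, link_args, repro_dir, run=subprocess.run):
    """Run the link once; if the linker died from a signal, run it again
    with --reproduce so the inputs are packed into a tar archive.

    Returns (return code of the first run, reproducer path or None).
    """
    first = run([linker, *link_args])
    # On POSIX, subprocess reports death by signal N as return code -N.
    if first.returncode >= 0:
        return first.returncode, None

    tar_path = os.path.join(repro_dir, f"lld-repro-{os.getpid()}.tar")
    # The inputs may have changed since the first run (build races), so
    # the archive is not guaranteed to reproduce the crash.
    run([linker, *link_args, f"--reproduce={tar_path}"])
    return first.returncode, tar_path


def exit_status(returncode):
    """Map a signal death (-N) to the shell convention 128 + N."""
    return 128 - returncode if returncode < 0 else returncode


def main(argv, environ=os.environ, run=subprocess.run):
    # LLD_REAL and LLD_REPRO_DIR are hypothetical names for this sketch.
    linker = environ.get("LLD_REAL", "ld.lld")
    repro_dir = environ.get("LLD_REPRO_DIR", tempfile.gettempdir())
    rc, tar_path = link_with_reproducer(linker, argv, repro_dir, run=run)
    if tar_path is not None:
        print(f"linker crashed; reproducer requested at {tar_path}",
              file=sys.stderr)
    return exit_status(rc)
```

Ending the file with `sys.exit(main(sys.argv[1:]))` and placing it
where the build expects to find the linker turns it into a drop-in
wrapper. Ordinary link errors (a positive exit code) pass through
untouched, so only crashes pay for the second link and the archive.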

