[llvm-dev] Switch to ld.bfd tombstone behavior by default

Fangrui Song via llvm-dev llvm-dev at lists.llvm.org
Fri Jul 17 00:03:05 PDT 2020

Thanks for the write-up!

On 2020-07-16, David Blaikie wrote:
>In short: Perhaps we should switch lld to the bfd-style tombstoning
>behavior for a release or two, letting users opt-in to testing with the new
>-1/-2 tombstoning in the interim, before switching to the new tombstone by
>default (while still having the flag to switch back when users find
>surprise places that can't handle the new behavior).
>In long:
>https://reviews.llvm.org/D81784 and follow-on patches modified the behavior
>of lld with regards to resolving relocations from debug sections to dead
>code (either comdat deduplicated, or gc-sections use).
>A very quick summary of the situation:
>Original Behavior:
>   - bfd: 1 for debug_ranges(0 would prematurely terminate the list), 0
>   elsewhere
>   - gold/lld: 0+addend everywhere
>   - bfd/gold/lld
>      -  doesn't support 0 as a valid executable address without ambiguities
>   - gold/lld
>      - ambiguities with large gc'd functions combined with a .text mapping
>      that starts in relative low addresses
>      - premature debug_range termination with zero-length functions (Clang
>      produces these with __builtin_unreachable or non-void return type
>      functions without a return statement)
>New behavior:
>   - -2 for DWARFv4 debug_loc, debug_ranges (-1 is a base address specifier
>   there)
>   - -1 elsewhere
>   - linker flag to customize to other values if desired
>Known issues:
>   - lldb's line table parsing can't handle -1 well at all (essentially
>   unusable)

Pavel Labath will fix this soon https://reviews.llvm.org/D83957
This is an unhandled address-space wraparound problem.
This pattern is potentially common - and other downstream DWARF
consumers might make similar line table handling mistakes.
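
To make the wraparound concrete, here is a minimal sketch. It is an illustration only, not lldb's actual code; it assumes 64-bit addresses, and the helper names are hypothetical. A line table sequence whose start address was resolved to the -1 tombstone wraps to a small address as soon as the line program advances the address register:

```python
MASK64 = (1 << 64) - 1
TOMBSTONE = MASK64  # -1 viewed as an unsigned 64-bit address (assumption: 64-bit target)


def advance(address, delta):
    """Advance a line-table address register, wrapping at 64 bits."""
    return (address + delta) & MASK64


def rows_for_sequence(start, deltas):
    """Return the row addresses of one sequence: the start, then each advance."""
    rows = [start]
    for delta in deltas:
        rows.append(advance(rows[-1], delta))
    return rows


def is_dead_sequence(start):
    """Hypothetical consumer-side check: the sequence describes discarded code."""
    return start == TOMBSTONE


# A discarded function whose line program advances by 4 and then 8 bytes.
rows = rows_for_sequence(TOMBSTONE, [4, 8])
print([hex(r) for r in rows])  # ['0xffffffffffffffff', '0x3', '0xb']
```

The rows are no longer ascending, so a consumer that assumes ascending addresses within a sequence can misbehave. One possible mitigation is to check the sequence start (as in `is_dead_sequence`) and drop the whole sequence before doing any address arithmetic.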

>   - gdb's line table parsing ends up with different handling when breaking
>   on gc'd functions (minor functionality issue)

This is just a behavior difference, not affecting users.
It did break a test if linked with LLD (gdb intrinsically has lots of
failing tests even if built with GCC+GNU ld).

Previous behavior (when an address is zero): a breakpoint on a
--gc-sections discarded function will be redirected to a larger line
number with debug info, even if that line belongs to an unrelated,
different function.
New behavior: the breakpoint is on a wrapped-around small address.

GDB 9.3 will restore the previous behavior.

>I think there's enough risk in this work (even given the small number of
>bugs found so far), given there's a pretty wide array of debug info
>consumers out there, that we should change lld's default to match the
>long-lived bfd strategy. This would address my original motivation for
>raising all this (empty functions prematurely terminating the list), while
>letting users who want to experiment with it, or need it (like Alexey), can
>opt-in to the -1/-2 behavior.

I think we can only confidently say that there is enough risk in using
tombstone value -1 in .debug_line, but I'd not say tombstone value -1 in
other .debug_* can cause problems. Hope others can chime in.

With safety in mind for the upcoming release/11.x, we have two
choices:

a) .debug_ranges&.debug_loc => -2, .debug_line => 0, other .debug_* => -1
b) .debug_ranges&.debug_loc => -2, other .debug_* => 0
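
The two choices can be written down as a small lookup. This is a sketch for comparison only, not lld's implementation: the function name is hypothetical, 64-bit addresses are assumed, and the relocation addend is ignored.

```python
MASK64 = (1 << 64) - 1


def tombstone(section, option):
    """Return the unsigned 64-bit value written for a relocation from a
    .debug_* section to discarded code, under option 'a' or 'b'."""
    if option not in ("a", "b"):
        raise ValueError("option must be 'a' or 'b'")
    if section in (".debug_ranges", ".debug_loc"):
        # Both options use -2 here; -1 is a base address specifier in DWARFv4.
        return (-2) & MASK64
    if option == "a":
        # a) keeps 0 only for .debug_line and uses -1 for the other .debug_*.
        return 0 if section == ".debug_line" else (-1) & MASK64
    # b) uses 0 for every other .debug_* section.
    return 0


for name in (".debug_ranges", ".debug_line", ".debug_info"):
    print(name, hex(tombstone(name, "a")), hex(tombstone(name, "b")))
```

The only difference between a) and b) is what the .debug_* sections other than .debug_ranges, .debug_loc and .debug_line get: -1 under a), 0 under b).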

Delaying .debug_line => -1 for one or two releases sounds good to me.
That way, LLD 11 or 12 linked binaries can still be debugged by LLDB 10,
which is a nice property.

This write-up proposes b), but I'd say a) is likely sufficient. With the
available information, I cannot yet say that a) will have more risk.

>   - chromium/firefox have some tools that were broken:
>   https://bugs.chromium.org/p/chromium/issues/detail?id=1102223#c5

This is potentially related to other .debug_* (not .debug_line).
I hope Chromium developers can chime in here :) The breakage was
unfortunate, but I don't know how we could have avoided it. IMHO this
is no different from "clang started to emit a new DW_FORM_* and a
postprocessing tool of .debug chokes on that". Whether we want to
suppress that particular DW_FORM_* definitely should depend on how
likely it is to cause problems, but we can't yet say we have to hold off
on a feature for a solved (precisely, mitigated) problem.

>I'm not sure how to get the word out to DWARF consumers that they should
>consider this new experimental behavior. Ray's done a good job
>evangelizing/discussing this with gdb and lldb at least - and of course
>having turned it on by default briefly has found some users (like Chromium)
>that we probably wouldn't have found no matter how long we left this as an
>experimental option... so some things are going to break when we switch no
>matter what.

Thank you for following up with some GNU folks on their lists!
If folks want to follow along the thread:

We have informed binutils, elfutils-devel (elfutils has a few debug
tools) and gdb. I don't recall that anyone has thought about problems
with a tombstone value.

>P.S: Sony's already been using the -1 technique with their debugger and
>linker for a while, so they may want to keep this on by default for SCE -
>but I'm not sure how to do that in-tree.
>Clang doesn't know which lld
>version it's running, so whether the flag can be specified, I would think?
>(so it'd be hard to have Clang go "if SCE and LLD, pass the flag to use
>-1", I think) - if there is a way to make that decision in the compiler
>driver+linker, then we'd have a question of "default new behavior except
>when tuning for LLDB and GDB" or "default bfd behavior except when tuning
>for SCE".

I've been involved in another thread on SHF_LINK_ORDER (https://sourceware.org/pipermail/binutils/2020-July/112415.html).
We may need a way to tell codegen about the used linker.

pcc proposed -mbinutils-version= - This is nice in that some MC
decisions related to -fno-integrated-as can use this option as well.
jyknight proposed -mlinker-version= and syntax like -fuse-ld=bfd:2.34

This may get more complex if the generated object file is meant to be
linked with more than one linker. This discussion probably deserves its
own thread.
