[llvm-dev] LLD: Can we make --warn-backrefs the default?

Fāng-ruì Sòng via llvm-dev llvm-dev at lists.llvm.org
Thu Sep 3 16:08:32 PDT 2020


On 2020-09-03, Peter Collingbourne wrote:
>On Thu, Sep 3, 2020 at 2:00 PM Fāng-ruì Sòng <maskray at google.com> wrote:
>
>> On 2020-09-03, Peter Collingbourne wrote:
>> >On Tue, Sep 1, 2020 at 5:35 PM Fāng-ruì Sòng via llvm-dev <
>> >llvm-dev at lists.llvm.org> wrote:
>> >
>> >> On 2020-09-01, Petr Hosek wrote:
>> >> >I see the GNU ld behavior as a limitation, not as a feature, as Peter
>> >> Smith
>> >> >also pointed out in https://reviews.llvm.org/D86762. While it can be
>> >> argued
>> >> >that there are certain cases where it can help detect layering
>> >> >violations as you mentioned in your change, I'm not sure how valuable
>> that
>> >> >is in practice. Every case I've encountered so far either in Chrome or
>> in
>> >> >Fuchsia was a valid use case, most commonly interceptors. The solution
>> >> >has always been the same, wrap all libraries in
>> --start-group/--stop-group
>> >> >and it's what most projects do by default to avoid dealing with these
>> >> >issues, see for example [Chromium](
>> >> >
>> >>
>> https://source.chromium.org/chromium/chromium/src/+/master:build/toolchain/gcc_toolchain.gni;l=409
>> >> ).
>> >> >In our case, compatibility with linkers on other platforms is more
>> >> >important than compatibility with GNU ld, so I'd prefer to keep the
>> >> current
>> >> >behavior. Projects that care about compatibility with GNU ld can use
>> >> >--warn-backrefs.
>> >>
>> >> I totally understand that some users may not want to deal with GNU ld
>> >> compatibility:) I'll then question about Chromium's addition of -z defs:
>> >> https://crrev.com/843583006 :)
>> >>
>> >> -z defs is like a layering checking tool for shared objects while
>> >> --warn-backrefs is for archives. For performance, ABI concerns and ease
>> >> of deployment, many projects tend to build their own components as
>> >> archives instead of shared objects. In this sense --warn-backrefs will
>> >> probably be more useful than -z defs.
>> >>
>> >> (
>> >> TIL lorder and tsort were created to define an order of archives in
>> >> early versions of Unix.
>> >>
>> https://www.gnu.org/software/coreutils/manual/html_node/tsort-background.html
>> >> It seems that the article missed the point that proper library layering
>> is
>> >> still useful
>> >> )
>> >>
>> >
>> >I'm not a fan of this idea of reframing GNU ld behavior as a "layering
>> >checking tool". It is an incomplete layering checking tool because it does
>> >not detect the scenario where, for example, you have the intended
>> >dependency graph:
>> >
>> >A -> B
>> >A -> C
>> >B -> D
>> >C -> D
>> >
>> >(resulting in -la -lb -lc -ld) and you have an unexpected dependency B ->
>> >C.
>>
>> Yes, the GNU ld layering checking behavior is incomplete (yet important
>> and sufficient if we aim for compatibility).
>>
>
>As Petr mentioned, only certain users care about this aspect of GNU ld
>compatibility, and those users can just turn this feature on.

OK, I will consider you and Petr are two users not caring about this aspect of GNU ld.

If the other side turns out to be overwhelming: the users who don't want this
feature can turn it off (--no-warn-backrefs).

>The build system pick two orders:
>> -la -lb -lc -ld
>> -la -lc -lb -ld
>>
>
>Unless you have multiple programs linking against A, you've just introduced
>non-determinism in your build tool, which is generally considered to be a
>Bad Thing.
>
>
>> >There is already a way to detect layering problems, that detects
>> >practical layering problems and not just theoretical ones, which is to
>> link
>> >programs that use subsets of the libraries. For example, linking a program
>> >that depends only on B would result in detecting the invalid B -> C
>> >dependency.
>>
>> This is actually cumbersome and is explicitly described in
>> https://reviews.llvm.org/D86762
>
>
>Right, and as I mention below even that doesn't catch all cases. My point
>is that there is already a way to detect "practical" layering problems,
>defined as "layering problems that cause an undefined symbol error in
>programs that are linked at the same time as the library is built".
>
>- Users who don't care about GNU ld can link their libraries normally.
>- Users who care about GNU ld can pass --warn-backrefs.
>
>Users who care more about "theoretical" layering problems, such as users
>who ship prebuilt archive files to customers, will not be satisfied by
>--warn-backrefs, as this will not catch every possible layering problem
>before shipping the library, as they will not have a copy of every
>customer's program. Instead, they will be better served by a separate tool.
>
>If B -> C is not specified,
>>
>> * If people write B_test ("linking a program that depends only on B"),
>>    they will notice the dependency issue immediately via the "undefined
>> symbol" diagnostic.
>> * If such a B_test does not exist. The user may work on a large
>>    application which depends on B (and transitively on D) but not on A.
>>    OK, they will get an undefined symbol from C. However, it is not
>>    appropriate for them to add a dependency on C because they don't use C
>>    directly. See the "If adding the dependency does not form a cycle:"
>>    paragraph in D86762.
>>
>>    If their application actually depends on A (thus they will get the
>>    dependency on C), their link may or may not succeed with GNU ld,
>>    depending on whether the build system picks
>>    -la -lb -lc -ld or -la -lc -lb -ld, and the diagnostic can be very
>>    confusing "undefined reference to" while C is actually linked.
>>
>> >It's also worth noting that even that would only detect the
>> >layering problem if the program depends on the part of B that depends on
>> C.
>>
>>
>> >A better way to go about achieving layering checking would IMHO be to
>> >implement a separate tool (not part of the linker) that is capable of
>> >a complete layering check. Such a tool would only depend on symbol table
>> >features common to all object formats, so it could probably be implemented
>> >generically.
>> >
>> >Peter
>>
>> A standalone tool will not achieve sufficient ergonomics.
>> It will read all input files and duplicate the symbol resolution work done
>> by the linker,
>> which can be slow. (ld.lld --warn-backrefs only imposes negligible
>> overhead when a lazy object is fetched.)
>>
>
>As I mentioned, it would do additional work that the linker is not
>currently doing, which is necessary to implement a complete layering
>checking tool, such as reading all archive members out of each archive
>(whereas linkers only need to read depended-on archive members), and which
>linkers should not be doing by default exactly for performance reasons.
>Furthermore, the additional tool can be moved out of the critical
>edit-build-run path, unlike a linker based tool, which should improve
>performance as well.

If the build system uses --start-lib more than archives, making the linker do
the additional check may have little additional overhead. Lazy object files
don't have the archive symbol table, which was much useful in the old days but
limited in today's viewpoint. LLD already has quite a bit logic iterating the
symbol table, doing one or two more things may not affect overall performance.

>An additional tool would give the flexibility of allowing the interface to
>specify the actual dependencies instead of just giving us only what we can
>achieve as a result of historical design decisions. It would free build
>tools like cmake or gn from needing to topologically sort libraries in
>order to implement layering checking; instead, they can simply dump their
>dependency graph.
>
>IMHO, selling this feature as a layering checker is worse than having no
>layering checker at all, because it will mislead users into thinking "oh,
>I'm using lld's layering checker, that must mean that my program is
>properly layered".
>
>Peter

The option name is still "--warn-backrefs", not "--check-layering". The selling
point is more about the compatibility checker. While it does incomplete layering
checking, it is an almost full GNU ld compatibility checking tool. I have
personally checked dozens of executables with --warn-backrefs problems. Most
don't link with GNU ld and the few remaining cases are whether symbol resolution
will pick different definitions with LLD and GNU ld.

>>
>> >
>> >> >On Mon, Aug 31, 2020 at 1:44 PM Fāng-ruì Sòng <maskray at google.com>
>> wrote:
>> >> >
>> >> >> On Mon, Aug 31, 2020 at 1:29 PM David Blaikie <dblaikie at gmail.com>
>> >> wrote:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > On Mon, Aug 31, 2020 at 1:24 PM Fāng-ruì Sòng <maskray at google.com>
>> >> >> wrote:
>> >> >> >>
>> >> >> >> On Fri, Aug 28, 2020 at 11:16 AM David Blaikie <
>> dblaikie at gmail.com>
>> >> >> wrote:
>> >> >> >> >
>> >> >> >> > Would you like to conduct the conversation here, or on the
>> review
>> >> >> thread? (I lean towards having them here, but don't mind if folks
>> feel
>> >> like
>> >> >> it keeps the noise down & want to more post a notice saying "hey,
>> here's
>> >> >> this thing, if you're interested, go discuss it over there" - more an
>> >> >> optional opt-in rather than requiring people to opt-out via muting
>> the
>> >> >> thread, etc)
>> >> >> >>
>> >> >> >> Yes, we can conduct the "should we enable --warn-backrefs by
>> default"
>> >> >> >> conversation here. Since the semantics --warn-backrefs of are a
>> bit
>> >> >> >> complex, we need a documentation. https://reviews.llvm.org/D86762
>> is
>> >> >> >> put up to get wording suggestions. Explicitly adding the people to
>> >> the
>> >> >> >> CC list...
>> >> >> >>
>> >> >> >> FWIW for many code bases, --warn-backrefs should produce no
>> warnings
>> >> >> >> (error if --fatal-warnings). For some code bases, GNU ld may error
>> >> >> >> "undefined reference".  --warn-backrefs can catch such problems.
>> >> >> >
>> >> >> >
>> >> >> > One of the questions raised on the thread there was about different
>> >> >> linker semantics. I assume the "--warn-backrefs by default" we're
>> >> >> discussing is only related to the ld.lld frontend? Not the Windows
>> >> linker
>> >> >> lld behavior (or ld64 (old or new) lld behavior)?
>> >> >>
>> >> >> Yes, the ELF port (lld/ELF). No other port has implemented
>> >> >> --warn-backrefs. The intion of --warn-backrefs is to capture behavior
>> >> >> differences with GNU ld's ELF ports. If traditional ELF linkers don't
>> >> >> behave like that, I can replace "traditional ELF linkers" in
>> >> >> https://reviews.llvm.org/D86762 with "GNU linkers".
>> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> > On Thu, Aug 27, 2020 at 10:15 PM Fangrui Song via llvm-dev <
>> >> >> llvm-dev at lists.llvm.org> wrote:
>> >> >> >> >>
>> >> >> >> >> Hi all, LLD's --warn-backrefs is a tool to identify potential
>> >> >> >> >> incompatible archive selection semantics with traditional Unix
>> >> >> linkers.
>> >> >> >> >> I have improved it (via D77522,D77630 and D77512) to a state
>> >> where a
>> >> >> >> >> --warn-backrefs diagnostic almost assuredly means that the link
>> >> will
>> >> >> >> >> fail with GNU ld, or the symbol will get different resolution
>> in
>> >> GNU
>> >> >> ld and LLD.
>> >> >> >> >>
>> >> >> >> >> My conclusion is that --warn-backrefs is a very useful layering
>> >> >> check tool.
>> >> >> >> >> I just wrote a documentation about the advantage (of GNU ld's
>> >> archive
>> >> >> >> >> selection semantics..... But we can do better with
>> >> --warn-backrefs!
>> >> >> >> >> GNU ld just reports "undefined reference" with no actionable
>> >> feedback
>> >> >> >> >> about the offending archive)
>> >> >> >> >>
>> >> >> >> >> https://reviews.llvm.org/D86762
>> >> >> >> >>
>> >> >> >> >> I am wondering whether in the next release we can make
>> >> >> --warn-backrefs
>> >> >> >> >> the default.  I have added many known users to the review.
>> >> >> >> >> (There is no need for --no-warn-backrefs because
>> >> >> --warn-backrefs-exclude='*' does the same job)
>> >> >> >> >> _______________________________________________
>> >> >> >> >> LLVM Developers mailing list
>> >> >> >> >> llvm-dev at lists.llvm.org
>> >> >> >> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> LLVM Developers mailing list
>> >> llvm-dev at lists.llvm.org
>> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >>
>> >
>> >
>> >--
>> >--
>> >Peter
>>
>
>
>-- 
>-- 
>Peter


More information about the llvm-dev mailing list