[llvm-dev] RFC: ELF Autolinking

Rui Ueyama via llvm-dev llvm-dev at lists.llvm.org
Mon Mar 25 10:20:27 PDT 2019


Could you explain what that feature is?

On Mon, Mar 25, 2019 at 10:08 AM James Y Knight via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Are you planning to add support for "-F" and "-framework" to ELF linkers?
>
> On Mon, Mar 25, 2019 at 12:51 AM Saleem Abdulrasool via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Sorry for the late chiming in.
>>
>> Yes, swift does use autolinking, and I would like to use that on all the
>> targets.  The only targets which do not currently support this functionality
>> are ELF based.  That said, I think that `#pragma comment(link,
>> ...)` is insufficient for my needs.  Building Foundation requires framework
>> style linking as well.  The original design that I had in mind was derived
>> from ld64 and link.  Personally, I still strongly favour link's behaviour
>> of parsing "command line" options from the object files when they are
>> loaded.  There was strong opposition to that approach from Rui though.
>> Would we want to have special pragmas for each "feature"?
>>
>> The ELF model doesn't have the simplistic model for processing the
>> command line that PE/COFF does.  Because ordering is relevant to the model,
>> it would be ideal to process them inline, but, since lld already moves far
>> enough away from the traditional Unix model, perhaps we can simplify it to
>> append the command line directives to the end of the command line.
>>
>> The other case that is interesting to think about is the autolinking
>> support in C++ (and clang) modules.
>>
>> On Thu, Mar 21, 2019 at 9:49 AM bd1976 llvm <bd1976llvm at gmail.com> wrote:
>>
>>> On Thu, Mar 21, 2019 at 12:06 AM Rui Ueyama <ruiu at google.com> wrote:
>>>
>>>> Perhaps there's no one clean way to solve this issue, because
>>>> previously all libraries and object files were explicitly given to the
>>>> linker via a command line and the order of files in the command line
>>>> matters. That assumes human intervention to work correctly. Now, the
>>>> autolinking feature will add libraries implicitly. Since it's implicit,
>>>> there will be only one way in which that works, so sometimes that works and
>>>> sometimes doesn't.
>>>>
>>>> It feels to me that we should aim for making it work reasonably well
>>>> for reasonable use cases. By reasonable use cases, I'm thinking of the
>>>> following:
>>>>
>>>>  1. --static option may or may not be given (i.e. we should allow that
>>>> feature for both static linking and dynamic linking.)
>>>>  2. There are no competing defined symbols in a given set of libraries,
>>>> or if they exist, the program owner doesn't care which is linked to their
>>>> program.
>>>>  3. There may be circular dependencies between libraries.
>>>>
>>>> I don't think the above assumptions are too odd. If I had to implement
>>>> the autolinking feature in the GNU linker for the above scenario, I'd probably
>>>> use the following scheme:
>>>>
>>>>  1. While reading object files, memorize libraries that are autolinked
>>>>  2. After linking everything, create a list of files consisting of
>>>> autolinked libraries AND libraries given via the command line
>>>>  3. Visit each file in the list as if they were wrapped in
>>>> --start-group and --end-group.
>>>>
>>>> I'd think the above scheme should work reasonably well. What do you
>>>> think?
>>>>
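A minimal sketch of the three-step scheme above, assuming a toy model in which objects and archive members are just sets of defined and undefined symbol names. The names Obj, Archive, link and the "available" mapping are hypothetical and are not taken from any linker; for brevity the sketch only handles command-line archives inside the group pass.

```python
from dataclasses import dataclass, field

@dataclass
class Obj:
    name: str
    defined: set
    undefined: set
    autolink: list = field(default_factory=list)  # strings from "comment lib" pragmas

@dataclass
class Archive:
    name: str
    members: list

def link(objects, cmdline_archives, available):
    """available maps an autolink string to an Archive (hypothetical lookup)."""
    defined, undefined, loaded = set(), set(), []
    autolinked = []  # step 1: remember autolinked libraries as objects are read

    def add(obj):
        loaded.append(obj.name)
        defined.update(obj.defined)
        undefined.update(obj.undefined)
        for lib in obj.autolink:
            if lib not in autolinked:
                autolinked.append(lib)

    for obj in objects:
        add(obj)

    # steps 2 and 3: treat command-line archives plus autolinked archives as
    # one --start-group/--end-group and rescan until nothing new is pulled in
    changed = True
    while changed:
        changed = False
        group = list(cmdline_archives)
        names = [a.name for a in group]
        for lib in autolinked:
            if lib not in names:
                group.append(available[lib])
                names.append(lib)
        for archive in group:
            for member in archive.members:
                if member.name in loaded:
                    continue
                if member.defined & (undefined - defined):
                    add(member)
                    changed = True
    return loaded, undefined - defined

# pcc's case: foo.o autolinks libbar.a, and libbar.a needs something from libc.a
libc = Archive("c", [Obj("printf.o", {"printf"}, set())])
libbar = Archive("bar", [Obj("bar.o", {"bar"}, {"printf"})])
foo = Obj("foo.o", {"main"}, {"bar"}, autolink=["bar"])
loaded, unresolved = link([foo], [libc], {"bar": libbar})
```

In this model the member of libc.a that libbar.a needs is found on the second pass over the group, which is the case a plain left-to-right scan would miss.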
>>>
>>> Very nice. I agree with your definition of "reasonable" use cases
>>> (actually, as I have said before, I think that restricting autolinking to
>>> this "reasonable" set is actually a feature -  to avoid developers having
>>> source code that only works with a particular linker). I also like the
>>> proposal for a GNU implementation - I think this is enough to show that
>>> GNU-like linkers could implement this.
>>>
>>> At this point I will try to prototype this up so that people have an
>>> implementation to play with.
>>>
>>> I am keen to hear from Saleem (compnerd) on this, as he did the
>>> original .linker-options work.
>>>
>>>
>>>>
>>>> On Tue, Mar 19, 2019 at 11:02 AM bd1976 llvm <bd1976llvm at gmail.com>
>>>> wrote:
>>>>
>>>>> On Mon, Mar 18, 2019 at 8:02 PM Rui Ueyama <ruiu at google.com> wrote:
>>>>>
>>>>>> On Thu, Mar 14, 2019 at 1:05 PM bd1976 llvm via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> On Thu, Mar 14, 2019 at 6:27 PM Peter Collingbourne <peter at pcc.me.uk>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Mar 14, 2019 at 6:08 AM bd1976 llvm via llvm-dev <
>>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>>
>>>>>>>>> At Sony we offer autolinking as a feature in our ELF toolchain. We
>>>>>>>>> would like to see full support for this feature upstream as there is
>>>>>>>>> anecdotal evidence that it would find use beyond Sony.
>>>>>>>>>
>>>>>>>>> In general autolinking (https://en.wikipedia.org/wiki/Auto-linking)
>>>>>>>>> allows developers to specify inputs to the linker in their source code.
>>>>>>>>> LLVM and Clang already have support for autolinking on ELF via embedding
>>>>>>>>> strings, which specify linker behavior, into a .linker-options section in
>>>>>>>>> relocatable object files, see:
>>>>>>>>>
>>>>>>>>> RFC -
>>>>>>>>> http://lists.llvm.org/pipermail/llvm-dev/2018-January/120101.html
>>>>>>>>> LLVM -
>>>>>>>>> https://llvm.org/docs/Extensions.html#linker-options-section-linker-options,
>>>>>>>>> https://reviews.llvm.org/D40849
>>>>>>>>> Clang -
>>>>>>>>> https://clang.llvm.org/docs/LanguageExtensions.html#specifying-linker-options-on-elf-targets,
>>>>>>>>> https://reviews.llvm.org/D42758
>>>>>>>>>
>>>>>>>>> However, although support was added to Clang and LLVM, no support
>>>>>>>>> has been implemented in LLD; and, I get the sense, from reading the
>>>>>>>>> reviews, that there wasn't agreement on the implementation when the changes
>>>>>>>>> landed. The original motivation seems to have been to remove the
>>>>>>>>> "autolink-extract" mechanism used by Swift to work around the lack of
>>>>>>>>> autolinking support for ELF. However, looking at the Swift source code,
>>>>>>>>> Swift still seems to be using the "autolink-extract" method.
>>>>>>>>>
>>>>>>>>> So my first question: Are there any users of the current
>>>>>>>>> implementation for ELF?
>>>>>>>>>
>>>>>>>>> Assuming that no one is using the current code, I would like to
>>>>>>>>> suggest a different mechanism for autolinking.
>>>>>>>>>
>>>>>>>>> For ELF we need limited autolinking support. Specifically, we only
>>>>>>>>> need support for "comment lib" pragmas (
>>>>>>>>> https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017)
>>>>>>>>> in C/C++ e.g. #pragma comment(lib, "foo"). My suggestion is that we keep the
>>>>>>>>> implementation as lean as possible.
>>>>>>>>>
>>>>>>>>> Principles to guide the implementation:
>>>>>>>>> - Developers should be able to easily understand autolinking
>>>>>>>>> behavior.
>>>>>>>>> - Developers should be able to override autolinking from the
>>>>>>>>> linker command line.
>>>>>>>>> - Inputs specified via pragmas should be handled in a general way
>>>>>>>>> to allow the same source code to work in different environments.
>>>>>>>>>
>>>>>>>>> I would like to propose that we focus on autolinking exclusively
>>>>>>>>> and that we divorce the implementation from the idea of "linker options"
>>>>>>>>> which, by nature, would tie source code to the vagaries of particular
>>>>>>>>> linkers. I don't see much value in supporting other linker operations so I
>>>>>>>>> suggest that the binary representation be a mergeable string section
>>>>>>>>> (SHF_MERGE, SHF_STRINGS), called .autolink, with custom type
>>>>>>>>> SHT_LLVM_AUTOLINK (0x6fff4c04), and SHF_EXCLUDE set (to avoid the contents
>>>>>>>>> appearing in the output). The compiler can form this section by
>>>>>>>>> concatenating the arguments of the "comment lib" pragmas in the order they
>>>>>>>>> are encountered. Partial (-r, -Ur) links can be handled by concatenating
>>>>>>>>> .autolink sections with the normal mergeable string section rules. The
>>>>>>>>> current .linker-options can remain (or be removed); but, "comment lib"
>>>>>>>>> pragmas for ELF should be lowered to .autolink not to .linker-options. This
>>>>>>>>> makes sense as there is no linker option that "comment lib" pragmas map
>>>>>>>>> directly to. As an example, #pragma comment(lib, "foo") would result in:
>>>>>>>>>
>>>>>>>>> .section ".autolink","eMS",@llvm_autolink,1
>>>>>>>>>         .asciz "foo"
>>>>>>>>>
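A minimal sketch of how a consumer might read the payload of such a section, assuming it is a sequence of NUL-terminated strings (as .asciz emits) and that names are UTF-8. The function name parse_autolink is hypothetical; dropping repeated names mirrors the "duplicate autolinked inputs are ignored" rule in the proposal.

```python
def parse_autolink(section_bytes):
    """Split a .autolink payload into library names, keeping first-seen order."""
    names = []
    for raw in section_bytes.split(b"\0"):
        if not raw:
            continue  # skip the empty piece after the final terminator
        name = raw.decode("utf-8")  # assumption: names are UTF-8
        if name not in names:       # duplicates are ignored
            names.append(name)
    return names

# Two objects' sections concatenated by a partial link, both naming "foo"
libs = parse_autolink(b"foo\0bar\0foo\0")
```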
>>>>>>>>> For LTO, equivalent information to the contents of the .autolink
>>>>>>>>> section will be written to the IRSymtab so that it is available to the
>>>>>>>>> linker for symbol resolution.
>>>>>>>>>
>>>>>>>>> The linker will process the .autolink strings in the following way:
>>>>>>>>>
>>>>>>>>> 1. Inputs from the .autolink sections of a relocatable object file
>>>>>>>>> are added when the linker decides to include that file (which could itself
>>>>>>>>> be in a library) in the link. Autolinked inputs behave as if they were
>>>>>>>>> appended to the command line as a group after all other options. As a
>>>>>>>>> consequence the set of autolinked libraries are searched last to resolve
>>>>>>>>> symbols.
>>>>>>>>>
>>>>>>>>
>>>>>>>> If we want this to be compatible with GNU linkers, doesn't the
>>>>>>>> autolinked input need to appear at the point immediately after the object
>>>>>>>> file appears in the link? I'm imagining the case where you have a
>>>>>>>> statically linked libc as well as a libbar.a autolinked from a foo.o. The
>>>>>>>> link command line would look like this:
>>>>>>>>
>>>>>>>> ld foo.o -lc
>>>>>>>>
>>>>>>>> Now foo.o autolinks against bar. The command line becomes:
>>>>>>>>
>>>>>>>> ld foo.o -lc -lbar
>>>>>>>>
>>>>>>>
>>>>>>> Actually, I was thinking that on a GNU linker the command line would
>>>>>>> become "ld foo.o -lc -( -lbar )-"; but, this doesn't affect your point.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> If libbar.a requires an additional object file from libc.a, it will
>>>>>>>> not be added to the link.
>>>>>>>>
>>>>>>>>
>>>>>>> As it stands all the dependencies of an autolinked library must
>>>>>>> themselves be autolinked. I had imagined that this is a reasonable
>>>>>>> limitation. If not we need another scheme. I will try to think of some
>>>>>>> motivating examples for this.
>>>>>>>
>>>>>>>
>>>>>>>>> 2. It is an error if a file cannot be found for a given string.
>>>>>>>>> 3. Any command line options in effect at the end of the command
>>>>>>>>> line parsing apply to autolinked inputs, e.g. --whole-archive.
>>>>>>>>> 4. Duplicate autolinked inputs are ignored.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This seems like it would work in GNU linkers, as long as the
>>>>>>>> autolinked file is added to the link immediately after the last mention,
>>>>>>>> rather than the first. Otherwise a command line like:
>>>>>>>>
>>>>>>>> ld foo1.o foo2.o
>>>>>>>>
>>>>>>>> (where foo1.o and foo2.o both autolink bar) could end up looking
>>>>>>>> like:
>>>>>>>>
>>>>>>>> ld foo1.o -lbar foo2.o
>>>>>>>>
>>>>>>>> and you will not link anything from libbar.a that only foo2.o
>>>>>>>> requires. It may end up being simpler to not ignore duplicates.
>>>>>>>>
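A minimal sketch of the traditional single-pass archive behaviour this example relies on, assuming a toy model where each input is a dict of symbol-name sets. The symbols "a" and "b", the member names, and link_left_to_right are hypothetical illustrations, not part of any linker.

```python
def link_left_to_right(command_line):
    """Visit inputs once, in order; an archive only satisfies references seen so far."""
    defined, undefined, loaded = set(), set(), []

    def add(name, defs, refs):
        loaded.append(name)
        defined.update(defs)
        undefined.update(refs)

    for item in command_line:
        if item["kind"] == "object":
            add(item["name"], item["defs"], item["refs"])
            continue
        # archive: rescan its own members until none of them is pulled in
        progress = True
        while progress:
            progress = False
            for m in item["members"]:
                if m["name"] not in loaded and m["defs"] & (undefined - defined):
                    add(m["name"], m["defs"], m["refs"])
                    progress = True
    return loaded, undefined - defined

foo1 = {"kind": "object", "name": "foo1.o", "defs": set(), "refs": {"a"}}
foo2 = {"kind": "object", "name": "foo2.o", "defs": set(), "refs": {"b"}}
libbar = {"kind": "archive", "name": "libbar.a", "members": [
    {"name": "a.o", "defs": {"a"}, "refs": set()},
    {"name": "b.o", "defs": {"b"}, "refs": set()},
]}

# "ld foo1.o -lbar foo2.o": b.o is never pulled in, so "b" stays unresolved
early = link_left_to_right([foo1, libbar, foo2])
# "ld foo1.o foo2.o -lbar": both members are pulled in
late = link_left_to_right([foo1, foo2, libbar])
```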
>>>>>>>
>>>>>>> Correct; but, given that the proposal was to handle the libraries as
>>>>>>> if they are appended to the link line after everything on the command line
>>>>>>> then I think this will work. With deduplication (and the use of SHF_MERGE)
>>>>>>> developers get no ordering guarantees. I claim that this is a feature! My
>>>>>>> rationale is that the order in which libraries are linked affects different
>>>>>>> linkers in different ways (e.g. LLD does not resolve symbols from archives
>>>>>>> in a compatible manner with either the Microsoft linker or the GNU
>>>>>>> linkers.), by not allowing the user to control the order I am essentially
>>>>>>> saying that autolinking is not suitable for libraries that offer competing
>>>>>>> copies of the same symbol. This ties into my argument that "comment lib"
>>>>>>> pragmas should be handled in as "general" a way as possible.
>>>>>>>
>>>>>>
>>>>>> Right. I think if you need a fine control over the link order,
>>>>>> autolinking is not a feature you want to use. Or, in general, if your
>>>>>> program is sensitive to a link order because its source object files have
>>>>>> competing symbols of the same name, it's perhaps unnecessarily fragile.
>>>>>>
>>>>>> That being said, I think you need to address the issue that pcc
>>>>>> pointed out. If you statically link a program `foo` with the following
>>>>>> command line
>>>>>>
>>>>>>   ld -o foo foo.o -lc
>>>>>>
>>>>>> , `foo.o` auto-imports libbar.a, and libbar.a depends on libc.a, can
>>>>>> your proposed feature pull out object files needed for libbar.a?
>>>>>>
>>>>>
>>>>> It won't work on GNU linkers. It will work with LLD as LLD has
>>>>> MSVC-like archive handling. However, I would like to make sure that
>>>>> whatever we come up with can be supported in the GNU toolchain.
>>>>>
>>>>> I had thought that it would be acceptable that all the dependencies of
>>>>> an autolinked library must themselves be autolinked in order to work on GNU
>>>>> style linkers. Having thought more, I don't like this limitation -
>>>>> especially as it doesn't exist for Microsoft style linkers. One possible
>>>>> resolution could be that GNU linkers might have to implement another
>>>>> command line option e.g. --auto-dep=<file> to allow injection into the
>>>>> group of autolinked libraries.
>>>>>
>>>>> i.e. in pcc's example you would need to do: "ld foo.o
>>>>> --auto-dep=libc.a" which would become "ld --start-group libbar.a libc.a
>>>>> --end-group" with autolinking.
>>>>>
>>>>> I wanted to avoid the approach of inserting autolinked libraries after
>>>>> the object that autolinks them. In LLD (and MSVC) it becomes hard to reason
>>>>> about "where" the linker is in the command line and it would also mean that
>>>>> we can't have the nice separation between parsing the command line and
>>>>> doing the rest of the link that we currently have. Also, if you give people
>>>>> a way to have a fine grained control over the link order with autolinking
>>>>> you risk ending up with source code that will link on GNU style linkers but
>>>>> not with LLD (assuming GNU ever implemented support for autolinking).
>>>>>
>>>>> Scenario:
>>>>>
>>>>> libbar.a(bar.o) - defines symbol bar
>>>>> libfoo.a(foo.o) - defines foo and autolinks libbar.a
>>>>> main.o - references foo
>>>>> another.o - does not reference foo
>>>>> No references to bar exist
>>>>>
>>>>> lld -lfoo another.o --whole-archive main.o with autolinking becomes
>>>>> lld -lfoo another.o --whole-archive main.o -lbar result: bar.o gets added
>>>>> to the link.
>>>>> But, if a change is made so that another.o references bar then the
>>>>> link line with autolinking becomes lld -lfoo another.o
>>>>> -lbar --whole-archive main.o result: bar.o is not added to the link.
>>>>>
>>>>> Hopefully the above scenario demonstrates why I think that it becomes
>>>>> too complicated to reason about the effects of autolinking with pcc's
>>>>> proposed insertion scheme.
>>>>>
>>>>>
>>>>>
>>>>>>>>> 5. The linker tries to add a library or relocatable object file from
>>>>>>>>> each of the strings in a .autolink section by; first, handling the string
>>>>>>>>> as if it was specified on the commandline; second, by looking for the
>>>>>>>>> string in each of the library search paths in turn; third, by looking for a
>>>>>>>>> lib<string>.a or lib<string>.so (depending on the current mode of the
>>>>>>>>> linker) in each of the library search paths.
>>>>>>>>>
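A minimal sketch of the three-stage lookup in point 5, under stated assumptions: find_autolink_input and its parameters are hypothetical, the "exists" hook is only there so the search can be exercised without touching the filesystem, and trying .so before .a in dynamic mode is an assumption modelled on how -l usually behaves rather than something the proposal specifies. Raising when nothing is found corresponds to point 2.

```python
import os
import posixpath

def find_autolink_input(name, search_paths, static=False, exists=os.path.isfile):
    # First: treat the string as if it had been given on the command line
    if exists(name):
        return name
    # Second: look for the string itself in each library search path
    for directory in search_paths:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    # Third: look for lib<string>.a or lib<string>.so, depending on the mode
    suffixes = [".a"] if static else [".so", ".a"]  # assumption, see lead-in
    for directory in search_paths:
        for suffix in suffixes:
            candidate = posixpath.join(directory, "lib" + name + suffix)
            if exists(candidate):
                return candidate
    raise FileNotFoundError("autolinked input not found: " + name)

# Fake filesystem for illustration
files = {"/opt/lib/foo", "/usr/lib/libbar.a", "/usr/lib/libbar.so"}
paths = ["/usr/lib", "/opt/lib"]
by_name = find_autolink_input("foo", paths, exists=files.__contains__)
shared = find_autolink_input("bar", paths, exists=files.__contains__)
static = find_autolink_input("bar", paths, static=True, exists=files.__contains__)
```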
>>>>>>>>
>>>>>>>> Is the second part necessary? "-l:foo" causes the linker to search
>>>>>>>> for a file named "foo" in the library search path, so it seems that
>>>>>>>> allowing the autolink string to look like ":foo" would satisfy this use
>>>>>>>> case.
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I worded the proposal to avoid mapping "comment lib" pragmas to
>>>>>>> --library command line options. My reasons:
>>>>>>>
>>>>>>> 1. I find the requirement that the user put ':' in their lib strings
>>>>>>> slightly awkward. It means that the source code is now coupled to a
>>>>>>> GNU-style linker. So then this isn't merely an ELF linking proposal, it's a
>>>>>>> proposal for ELF toolchains with GNU-like linkers (e.g. the arm linker
>>>>>>> doesn't support the colon prefix
>>>>>>> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0474c/Cjahbdei.html
>>>>>>> ).
>>>>>>>
>>>>>>> 2. The syntax is #pragma comment(lib, ...) not #pragma
>>>>>>> linker-option(library, ...) i.e. the only thing this (frankly rather
>>>>>>> bizarre) syntax definitely implies is that the argument is related to
>>>>>>> libraries (and comments ¯\_(ツ)_/¯); it is a bit of a stretch to interpret
>>>>>>> "comment lib" pragmas as mapping directly to "specifying an additional
>>>>>>> --library command line option".
>>>>>>>
>>>>>>> AFAIK all linkers support two ways of specifying inputs; firstly,
>>>>>>> directly on the command line; secondly, with an option with very similar
>>>>>>> semantics to GNU's --library option. I chose a method of finding input
>>>>>>> files that encompasses both methods of specifying a library on the command
>>>>>>> line. I think that this method is actually more intuitive than either the
>>>>>>> method used by the linker script INPUT command or by --library. FWIW, I
>>>>>>> looked into the history of the colon prefix. It was added in
>>>>>>> https://www.sourceware.org/ml/binutils/2007-03/msg00421.html.
>>>>>>> Unfortunately, the rationale given is that it was merely a port of a
>>>>>>> vxworks linker extension. I couldn't trace the history any further than
>>>>>>> that to find the actual design discussion. The linker script command INPUT
>>>>>>> uses a different scheme and the command already had this search order 20
>>>>>>> years ago, which is the earliest version of the GNU linker I have history
>>>>>>> for; again, the rationale is not available.
>>>>>>>
>>>>>>>
>>>>>>>>> 6. A new command line option --no-llvm-autolink will tell LLD to
>>>>>>>>> ignore the .autolink sections.
>>>>>>>>>
>>>>>>>>> Rationale for the above points:
>>>>>>>>>
>>>>>>>>> 1. Adding the autolinked inputs last makes the process simple to
>>>>>>>>> understand from a developer's perspective. All linkers are able to implement
>>>>>>>>> this scheme.
>>>>>>>>> 2. Error-ing for libraries that are not found seems like better
>>>>>>>>> behavior than failing the link during symbol resolution.
>>>>>>>>> 3. It seems useful for the user to be able to apply command line
>>>>>>>>> options which will affect all of the autolinked input files. There is a
>>>>>>>>> potential problem of surprise for developers, who might not realize that
>>>>>>>>> these options would apply to the "invisible" autolinked input files;
>>>>>>>>> however, despite the potential for surprise, this is easy for developers to
>>>>>>>>> reason about and gives developers the control that they may require.
>>>>>>>>> 4. Unlike on the command line it is probably easy to include the
>>>>>>>>> same input file twice via pragmas and might be a pain to fix; think of
>>>>>>>>> third-party libraries supplied as binaries.
>>>>>>>>> 5. This algorithm takes into account all of the different ways
>>>>>>>>> that ELF linkers find input files. The different search methods are tried
>>>>>>>>> by the linker in most obvious to least obvious order.
>>>>>>>>> 6. I considered adding finer grained control over which .autolink
>>>>>>>>> inputs were ignored (e.g. MSVC has /nodefaultlib:<library>); however, I
>>>>>>>>> concluded that this is not necessary: if finer control is required
>>>>>>>>> developers can recreate the same effect autolinking would have had using
>>>>>>>>> command line options.
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> --
>>>>>>>> Peter
>>>>>>>>
>>>>>>>
>>>>>>
>>
>> --
>> Saleem Abdulrasool
>> compnerd (at) compnerd (dot) org
>>
>