[llvm-dev] Linker Option support for ELF

Sat Jan 6 12:05:12 PST 2018

> On Jan 5, 2018, at 4:35 PM, Cary Coutant <ccoutant at gmail.com> wrote:
> 
>>> In general I'm in favor of the proposal. Defining a generic way to convey
>>> some information from the compiler to the linker is useful, and it looks
>>> like it is just a historical reason that the ELF lacks the feature at the
>>> moment.
>>> 
>>> This is a scenario in which the feature is useful: when you include
>>> math.h, a compiler (which is driven by some pragma) could add `-lm` to the
>>> note section so that a linker automatically links libm.

Glad to have you chime in; I know that you have quite a bit of experience from binutils and gold.  I would really love to see this support implemented there too, and your input is certainly valuable.

> I agree that this would be a very useful addition to ELF. I've always
> wanted to reach the point where you could just type "ld main.o" and
> have all the dependencies automatically linked in. (Go kind of
> achieves this, I think.)

Excellent!  I think that everyone agrees that this is a useful extension to add.

> I'm not in favor of using yet another note section, however. SHT_NOTE
> sections are intended for the use of "off-axis" tools, not for
> something the linker would need to look for. I don't want to have the
> linker parsing individual note entries looking for notes of interest
> to itself, and then having to decide whether to edit those entries out
> of the larger section, or merge them together. And I also don't want
> to key off of individual section names -- the linker is not supposed
> to have to care about the section name. There should be a new section
> type for this feature. This is the kind of extension that ELF was
> designed for.

I’m really not tied to the note approach of implementing this.  I am (admittedly) abusing the notes because of a couple of their behavioral aspects.  The main thing to realize is that this information is embedded into the object files that are built.  The information should be processed by the linker and then *discarded*; none of it should be in the final binary (unless it is a relocatable link).  I’m concerned about linkers that do not support this feature preserving the contents.  Now, this could very well be a misconception on my part.  If that is the case, then I would say that this needs to be entirely reworked, because adding the section then sounds much nicer.

>>> I think I'm also in favor of the format, which is essentially runs of
>>> null-terminated strings (*1) that are basically opaque to compilers.
>> 
>> Yes.  However, I think I want to clarify that we want this to be completely
>> opaque to the backend.  The front end could possibly have some enhancements
>> to make this better.  But, that will be a separate change, and that
>> discussion should take place then.  We shouldn’t paint ourselves into a
>> corner.  Basically, I think that there are some legitimate concerns here, but
>> they would not be handled at this layer, but above.
>> 
>>> However, you should define as a spec what options are allowed and what
>>> their semantics are. We should not accept arbitrary linker options because
>>> semantics of some linker options cannot be clearly defined when they appear
>>> as embedded options. Just saying "this feature allows you to embed linker
>>> options to object files" is too weak as a specification. You need to clearly
>>> define a list of options that will be supported by linkers with their clear
>>> semantics.
>> 
>> Personally, I would like to see the ability to add support for additional
>> options without having to modify the compiler.  That said, I think that
>> there are options which can be scary (e.g. -nopie).  I think that the linker
>> should make the decision of what it supports and error out on others.  This
>> allows for us to enhance the support over time without a huge overhead.  As
>> a starting point, I think that -l and -L are two that would be interesting.
>> I can see -u being useful as well, but the point is that we can slowly grow
>> the support after consideration by delaying the validation of the options.
>> 
>>> (*1) One of the big annoyances that I noticed when I was implementing the
>>> same feature for COFF is that the COFF's .drctve section that contains
>>> linker options has to be tokenized in the same way as the Windows command
>>> line does. So it needs to interpret double quotes and backslashes correctly
>>> especially when handling space-containing pathnames. This is a design
>>> failure that a COFF file contains just a single string instead of runs of
>>> strings that have already been tokenized.
> 
> I too would like to keep the linker from having to tokenize the
> strings. I kind of agree with Rafael that there should be defined tags
> and values, much like a .dynamic section, but I wouldn't want to have
> values pointing to strings in yet another section, so I'd prefer
> something in between that and a free-form null-terminated string. I
> also wouldn't want to open up the complete list of linker options, so
> I'd prefer a defined list of tags in string form that could easily be
> augmented without additional backend support. We could start with,
> perhaps, "lib" to inject a library (a la "-l"), "file" to inject an
> object file by full name, and "path" to provide a search path (a la
> "-L"). I don't think an equivalent for "-u" would be needed, since the
> compiler can simply generate an undef symbol for that case. For the
> section format, I'd suggest a series of null-terminated strings,
> alternating between tags and values, so that no quote or escape
> parsing is necessary.

Sounds like we agree on the direction: we don’t want the backend to be involved in adding new options, and we don’t think that all options make sense, but we still want to be able to add options.  As to the `-u` option, I’m thinking about cases where an otherwise unreferenced symbol needs to be preserved when linking with `--gc-sections` and the object was built with `-ffunction-sections` and/or `-fdata-sections`.
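To make the alternating tag/value layout suggested above concrete, here is a minimal sketch in Python.  The tag names come from the quoted suggestion ("lib", "file", "path"); the function names and the tag-to-flag mapping are hypothetical and only for illustration, not an agreed format.

```python
# Assumption: a starting set of tags and the command-line flag each one
# stands for. A real linker would define and document its own list.
SUPPORTED_TAGS = {"lib": "-l", "path": "-L", "file": ""}


def encode_directives(pairs):
    """Serialize (tag, value) pairs as alternating null-terminated strings."""
    out = bytearray()
    for tag, value in pairs:
        out += tag.encode() + b"\0" + value.encode() + b"\0"
    return bytes(out)


def decode_directives(data):
    """Split section contents on NUL bytes; no quote or escape parsing."""
    if not data:
        return []
    if not data.endswith(b"\0"):
        raise ValueError("section is not null-terminated")
    fields = data[:-1].split(b"\0")
    if len(fields) % 2:
        raise ValueError("tag without a matching value")
    it = iter(fields)
    return [(tag.decode(), value.decode()) for tag, value in zip(it, it)]


def to_linker_args(pairs):
    """Map known tags to arguments; reject anything the linker does not know."""
    args = []
    for tag, value in pairs:
        if tag not in SUPPORTED_TAGS:
            raise ValueError(f"unsupported linker directive: {tag!r}")
        args.append(SUPPORTED_TAGS[tag] + value)
    return args
```

A path containing a space needs no quoting in this layout, because the NUL byte is the only delimiter; an unknown tag is rejected by the linker rather than by the compiler.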

So, after discussing some of the items, we ended up somewhere in between.  My current proposal is a semi-pre-tokenized linker response file.  Basically, each option/parameter “pair” would be a single string entry in an array of string values.  The only difference is that, instead of TLV entries, each entry is simply the raw option.  My resistance to the TLV is really driven more by the LLVM IR (which I suppose is possible to alter):

https://llvm.org/docs/LangRef.html#automatic-linker-flags-named-metadata
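As a rough sketch of the raw-entry layout (one null-terminated string per option/parameter pair, validated by the linker rather than the compiler), something like the following Python would do.  The accepted prefixes and all function names are assumptions for illustration only.

```python
# Assumption: the linker starts out accepting only -l and -L and can
# grow this list later without any compiler change.
ALLOWED_PREFIXES = ("-l", "-L")


def encode_options(options):
    """Store each option/parameter pair as one null-terminated string."""
    return b"".join(opt.encode() + b"\0" for opt in options)


def decode_options(data):
    """Recover the entries by splitting on NUL; no further tokenizing."""
    if not data:
        return []
    if not data.endswith(b"\0"):
        raise ValueError("section is not null-terminated")
    return [entry.decode() for entry in data[:-1].split(b"\0")]


def validate_options(options):
    """Linker-side check: error out on anything outside the accepted set."""
    for opt in options:
        if not opt.startswith(ALLOWED_PREFIXES):
            raise ValueError(f"embedded linker option not supported: {opt!r}")
    return options
```

Because validation happens only at link time, a compiler can emit a new option without modification, and an older linker reports an error instead of silently ignoring it.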
> For the header files, a simple syntax like
> 
>   #pragma linker_directive "lib" "m"
> 
> would provide the extensibility needed to add new tags with no
> additional support in the front end or back end.

I *really* wish to avoid this discussion right now.  I am happy to loop you into a subsequent thread discussing that.  I figure that this will be a much more contentious issue as syntax is something everyone has differing opinions on.  I’m trying to split this work into three distinct pieces: the frontend support to emit the information, the backend to emit this into the object, and the linker to use it.

As an aside, personally, I was thinking more along the lines of `#pragma comment(lib, "m")`.

> -cary
