[llvm-dev] Linker Option support for ELF

Sat Jan 6 16:33:02 PST 2018

On Jan 6, 2018 12:05 PM, "Saleem Abdulrasool via llvm-dev" <
llvm-dev at lists.llvm.org> wrote:

On Jan 5, 2018, at 4:35 PM, Cary Coutant <ccoutant at gmail.com> wrote:

In general I'm in favor of the proposal. Defining a generic way to convey
some information from the compiler to the linker is useful, and it looks
like ELF lacks the feature at the moment only for historical reasons.

This is a scenario in which the feature is useful: when you include
math.h, a compiler (driven by some pragma) could add `-lm` to the
note section so that a linker automatically links libm.

Glad to have you chime in; I know that you have quite a bit of experience
from binutils and gold.  I would really love to see this support
implemented there too, and having your input is certainly valuable.

I agree that this would be a very useful addition to ELF. I've always
wanted to reach the point where you could just type "ld main.o" and
have all the dependencies automatically linked in. (Go kind of
achieves this, I think.)

Excellent!  I think that everyone agrees that this is a useful extension to
add.

I'm not in favor of using yet another note section, however. SHT_NOTE
sections are intended for the use of "off-axis" tools, not for
something the linker would need to look for. I don't want to have the
linker parsing individual note entries looking for notes of interest
to itself, and then having to decide whether to edit those entries out
of the larger section, or merge them together. And I also don't want
to key off of individual section names -- the linker is not supposed
to have to care about the section name. There should be a new section
type for this feature. This is the kind of extension that ELF was
designed for.

I’m really not tied to the note approach of implementing this.  I am
(admittedly) abusing the notes due to a couple of behavioral aspects of
them.  So, the main thing to realize is that this information is embedded
into the object files that are built.  The information should be processed
by the linker and then *discarded*; none of it should be in the final
binary (unless it is a relocatable link).  I’m concerned about linkers
which do not support this feature preserving the contents.  Now, this could
very well be a misconception on my part.  If that is the case, then, I
would say that this needs to be entirely reworked, because then adding the
section sounds much nicer.

Wouldn't a special section type trigger an "unrecognized section type"
error for linkers that don't support it?

-- Sean Silva

I think I'm also in favor of the format, which is essentially runs of
null-terminated strings (*1) that are basically opaque to compilers.

Yes.  However, I think I want to clarify that we want this to be completely
opaque to the backend.  The front end could possibly have some enhancements
to make this better.  But, that will be a separate change, and that
discussion should take place then.  We shouldn’t paint ourselves into a
corner.  Basically, I think that there are some legitimate concerns here, but
they would be handled not at this layer, but above.

However, you should define as a spec what options are allowed and what
their semantics are. We should not accept arbitrary linker options because
semantics of some linker options cannot be clearly defined when they appear
as embedded options. Just saying "this feature allows you to embed linker
options to object files" is too weak as a specification. You need to clearly
define a list of options that will be supported by linkers with their clear
semantics.

Personally, I would like to see the ability to add support for additional
options without having to modify the compiler.  That said, I think that
there are options which can be scary (e.g. -nopie).  I think that the linker
should decide what it supports and error out on the rest.  This
lets us enhance the support over time without a huge overhead.  As
a starting point, I think that -l and -L are two that would be interesting.
I can see -u being useful as well, but the point is that we can slowly grow
the support after consideration by delaying the validation of the options.
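
The "linker decides what it supports and errors out on others" idea can be
sketched roughly as follows.  This is only an illustration: the allowlist
(just `-l` and `-L`, the starting point suggested above), the function name,
and the exception type are all hypothetical, not part of any existing linker.

```python
# Hypothetical allowlist: the thread only suggests -l and -L as a start.
SUPPORTED_PREFIXES = ("-l", "-L")


class UnsupportedEmbeddedOption(ValueError):
    """Raised when an object file embeds an option the linker rejects."""


def validate_embedded_options(options):
    """Return the options unchanged, or raise on the first unsupported one."""
    accepted = []
    for opt in options:
        # Require a known prefix plus a non-empty argument, e.g. "-lm".
        if opt.startswith(SUPPORTED_PREFIXES) and len(opt) > 2:
            accepted.append(opt)
        else:
            raise UnsupportedEmbeddedOption(opt)
    return accepted
```

Because the check lives in the linker, supporting another option later
(say `-u`) would only mean extending the allowlist; the compiler would not
need to change.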

(*1) One of the big annoyances that I noticed when I was implementing the
same feature for COFF is that COFF's .drctve section, which contains
linker options, has to be tokenized in the same way as a Windows command
line. So it needs to interpret double quotes and backslashes correctly,
especially when handling space-containing pathnames. This is a design
failure that a COFF file contains just a single string instead of runs of
strings that have already been tokenized.
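
To make the tokenization problem concrete, here is a small sketch.  The
directive string and the path are made up for illustration, and the naive
whitespace split is deliberately *not* the real Windows command-line
tokenizer; it only shows why a single string needs quote-aware parsing
while pre-tokenized runs of strings do not.

```python
# A single directive string, as a .drctve-style section would carry it.
# The path is a made-up example containing a space.
single_string = r'/DEFAULTLIB:"C:\Program Files\foo\bar.lib" /EXPORT:f'

# Naive whitespace splitting breaks the quoted path into two pieces,
# so this yields three tokens instead of the intended two.
naive_tokens = single_string.split()

# The alternative: runs of NUL-terminated strings that were tokenized
# when the object file was written.  No quotes or escapes are needed.
pre_tokenized = b"/DEFAULTLIB:C:\\Program Files\\foo\\bar.lib\x00/EXPORT:f\x00"
tokens = [t.decode() for t in pre_tokenized.split(b"\x00") if t]
```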

I too would like to keep the linker from having to tokenize the
strings. I kind of agree with Rafael that there should be defined tags
and values, much like a .dynamic section, but I wouldn't want to have
values pointing to strings in yet another section, so I'd prefer
something in between that and a free-form null-terminated string. I
also wouldn't want to open up the complete list of linker options, so
I'd prefer a defined list of tags in string form that could easily be
augmented without additional backend support. We could start with,
perhaps, "lib" to inject a library (a la "-l"), "file" to inject an
object file by full name, and "path" to provide a search path (a la
"-L"). I don't think an equivalent for "-u" would be needed, since the
compiler can simply generate an undef symbol for that case. For the
section format, I'd suggest a series of null-terminated strings,
alternating between tags and values, so that no quote or escape
parsing is necessary.
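
A reader for the alternating tag/value layout described above could look
roughly like this.  It is a minimal sketch under assumptions: the function
name is hypothetical, the section contents are assumed to be UTF-8, and the
"lib" and "path" tags are the ones suggested in this message, not a
finalized list.

```python
def parse_tag_value_section(data: bytes):
    """Split a section of NUL-terminated strings into (tag, value) pairs."""
    if not data:
        return []
    if not data.endswith(b"\x00"):
        raise ValueError("last string is not NUL-terminated")
    # Drop the empty piece that follows the final terminator.
    fields = data.split(b"\x00")[:-1]
    if len(fields) % 2 != 0:
        raise ValueError("tag without a matching value")
    return [
        (fields[i].decode("utf-8"), fields[i + 1].decode("utf-8"))
        for i in range(0, len(fields), 2)
    ]


# A value with a space in it needs no quoting or escaping.
example = b"lib\x00m\x00path\x00/opt/my libs\x00"
pairs = parse_tag_value_section(example)
```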

Sounds like we agree on the direction: we don’t want the backend to be
involved in adding new options, and we don’t think that all options make
sense, but we still want to be able to add options.  As to the `-u` option,
I’m thinking about cases where an unreferenced symbol needs to be preserved
when linking with `--gc-sections` and building with `-ffunction-sections`
and/or `-fdata-sections`.

So, after discussing some of the items, we ended up somewhere in-between.
My current proposal is a semi-pre-tokenized linker response file.
Basically, each option/parameter “pair” would be a single string entry in
an array of string values.  The only difference is that, instead of TLV
entries, each one is simply the raw entry.  My resistance to TLV is really
driven more by the LLVM IR (which I suppose is possible to alter):

https://llvm.org/docs/LangRef.html#automatic-linker-flags-named-metadata
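
As a rough illustration of the "array of raw string entries" idea, the
sketch below writes each entry as one NUL-terminated string and reads it
back.  The on-disk layout is an assumption borrowed from the
null-terminated-string format discussed earlier in the thread, and the
function names are hypothetical.

```python
def encode_entries(entries):
    """Serialize raw option entries as consecutive NUL-terminated strings."""
    blob = bytearray()
    for entry in entries:
        raw = entry.encode("utf-8")
        if not raw or b"\x00" in raw:
            raise ValueError("entry must be non-empty and contain no NUL")
        blob += raw + b"\x00"
    return bytes(blob)


def decode_entries(blob):
    """Recover the entries; no quote or escape handling is required."""
    if blob and not blob.endswith(b"\x00"):
        raise ValueError("last entry is not NUL-terminated")
    return [e.decode("utf-8") for e in blob.split(b"\x00")[:-1]]


# Each option/parameter pair travels as one entry, spaces included.
section = encode_entries(["-lm", "-L/opt/my libs"])
```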

For the header files, a simple syntax like

  #pragma linker_directive "lib" "m"

would provide the extensibility needed to add new tags with no
additional support in the front end or back end.

I *really* wish to avoid this discussion right now.  I am happy to loop you
into a subsequent thread discussing that.  I figure that this will be a
much more contentious issue as syntax is something everyone has differing
opinions on.  I’m trying to split this work into three distinct pieces: the
frontend support to emit the information, the backend to emit this into the
object, and the linker to use it.

As an aside, personally, I was thinking more along the lines of
`#pragma comment(lib, "m")`.

-cary

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev