[LLVMdev] LLD improvement plan

Thu May 28 20:22:17 PDT 2015

On Thu, May 28, 2015 at 6:25 PM, Nick Kledzik <kledzik at apple.com> wrote:

>
> On May 28, 2015, at 5:42 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
> I guess, looking back at Nick's comment:
>
> "The atom model is a good fit for the llvm compiler model for all
> architectures.  There is a one-to-one mapping between llvm::GlobalObject
> (e.g. function or global variable) and lld::DefinedAtom."
>
> it seems that the primary issue on the ELF/COFF side is that currently the
> LLVM backends are taking a finer-grained atomicity that is present inside
> LLVM, and losing information by converting that to a coarser-grained
> atomicity that is the typical "section" in ELF/COFF.
> But doesn't -ffunction-sections -fdata-sections already fix this,
> basically?
>
> On the Mach-O side, the issue seems to be that Mach-O's notion of section
> carries more hard-coded meaning than e.g. ELF, so at the very least another
> layer of subdivision below what Mach-O calls "section" would be needed to
> preserve this information; currently symbols are used as a bit of a hack as
> this "sub-section" layer.
>
> I’m not sure what you mean here.
>
>
> So the problem seems to be that the transport format between the compiler
> and linker varies by platform, and each one has a different way to
> represent things, some can't represent everything we want to do, apparently.
>
> Yes!
>
>
> BUT it sounds like at least relocatable ELF semantics can, in principle,
> represent everything that we can imagine an "atom-based file
> format"/"native format" to want to represent. Just to play devil's
> advocate here, let's start out with the "native format" being relocatable
> ELF - on *all platforms*. Relocatable object files are just a transport
> format between compiler and linker, after all; who cares what we use? If
> the alternative is a completely new format, then bootstrapping from
> relocatable ELF is strictly less churn/tooling cost.
>
> People on the "atom side of the fence", what do you think? Is there
> anything that we cannot achieve by saying "native"="relocatable ELF"?
>
> 1) Turns out .o files are written once but read many times by the linker.
> Therefore, the design goal of .o files should be that they are as fast to
> read/parse in the linker as possible.  Slowing down the compiler to make a
> .o file that is faster for the linker to read is a good trade off.  This is
> the motivation for the native format - not that it is a universal format.
>

I don't think that switching from ELF to something new can make linkers
significantly faster. We need to handle ELF files carefully so that we
don't waste time on the initial load, but if we do that, reading the data
required for symbol resolution from an ELF file should be satisfactorily
fast. (I did that for COFF -- the current "atom-based ELF" linker is doing
too many things in the initial load, like reading all relocation tables,
splitting indivisible chunks of data and connecting them with "indivisible"
edges, etc.) It looks like we read the symbol table pretty quickly in the
new implementation, and its bottleneck is now the time to insert symbols
into the symbol hash table -- which you cannot make faster by changing the
object file format.
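
To illustrate why the insertion cost does not depend on the file format,
here is a minimal sketch of a global symbol table. It is not LLD's
implementation; `SymbolTable`, `resolve`, and the (name, defined) pairs are
hypothetical names chosen for the example. Whatever parser produced the
pairs, each symbol still costs one hash lookup or insertion.

```python
class SymbolTable:
    """Toy global symbol table: name -> (is_defined, owning_file)."""

    def __init__(self):
        self.symbols = {}

    def add(self, name, defined, file):
        existing = self.symbols.get(name)  # one hash lookup per symbol
        if existing is None:
            self.symbols[name] = (defined, file)
        elif defined and not existing[0]:
            # A definition replaces an earlier undefined reference.
            self.symbols[name] = (True, file)
        elif defined and existing[0]:
            raise ValueError(f"duplicate symbol: {name}")
        # An undefined reference to a known name changes nothing.

    def undefined(self):
        return sorted(n for n, (d, _) in self.symbols.items() if not d)


def resolve(files):
    """files: list of (file_name, [(symbol_name, is_defined), ...])."""
    table = SymbolTable()
    for file, syms in files:
        for name, defined in syms:
            table.add(name, defined, file)
    return table
```

The input to `resolve` could come from ELF, COFF, or a new native format;
the loop over symbols is the same in every case.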

Speaking of performance, if I wanted to make a significant difference, I'd
focus on introducing new symbol resolution semantics. In particular, the
Unix linker semantics are pretty bad for performance because we have to
visit files one by one, serially and possibly repeatedly. That is bad not
only for parallelism but also for the single-threaded case, because it
increases the amount of data to be processed. This is, I believe, the true
bottleneck of Unix linkers. Tackling that problem seems most important to
me, and "ELF as a file format is slow" is still an unproven claim to me.
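
As a rough model of those semantics, the sketch below walks the command
line left to right, pulls an archive member only if it defines a symbol
that is undefined at that moment, and rescans an archive until nothing new
is pulled. It is a simplification under my own assumptions, not any real
linker's code; `unix_link` and the tuple layout are made up for the
example.

```python
def unix_link(inputs):
    """inputs: list of ("obj", (defs, refs)) or ("lib", [(defs, refs), ...]).

    Returns (sorted undefined symbols, number of file/archive scans).
    """
    defined, undefined = set(), set()
    visits = 0

    def load(obj):
        defs, refs = obj
        defined.update(defs)
        undefined.difference_update(defs)
        undefined.update(r for r in refs if r not in defined)

    for kind, payload in inputs:
        if kind == "obj":
            visits += 1
            load(payload)
            continue
        # Archive: rescan until a full pass pulls in no new member.
        remaining = list(payload)
        progress = True
        while progress:
            progress = False
            visits += 1
            for member in list(remaining):
                if undefined & set(member[0]):
                    remaining.remove(member)
                    load(member)
                    progress = True
    return sorted(undefined), visits


main_o = ("obj", ({"main"}, {"b_fn"}))
lib_a = ("lib", [({"a_fn"}, set())])
lib_b = ("lib", [({"b_fn"}, {"a_fn"})])
```

With these inputs, `unix_link([main_o, lib_a, lib_b])` leaves `a_fn`
undefined because libA was already passed when libB asked for it, while
`unix_link([main_o, lib_a, lib_b, lib_a])` resolves everything at the cost
of scanning libA again. Each step depends on the set of undefined symbols
left by the previous one, which is what makes the process serial.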

>
> 2) I think the ELF camp still thinks that linkers are “dumb”.  That they
> just collate .o files into executable files.  The darwin linker does a lot
> of processing/optimizing the content (e.g. Objective-C optimizing, dead
> stripping, function/data re-ordering).  This is why atom level granularity
> is needed.
>

I think that all these things are doable (and are being done) using
-ffunction-sections.
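
For example, dead stripping at section granularity is a reachability walk
over relocations. The sketch below assumes each function or variable got
its own section (as with -ffunction-sections -fdata-sections) and that the
relocations have already been reduced to a section-to-section map;
`live_sections` and the example map are hypothetical, not taken from any
linker.

```python
def live_sections(relocs, roots):
    """Mark every section reachable from the roots via relocations.

    relocs: dict mapping a section name to the sections it refers to.
    roots:  sections that must be kept (e.g. the one holding the entry point).
    """
    live = set()
    worklist = list(roots)
    while worklist:
        sec = worklist.pop()
        if sec in live:
            continue
        live.add(sec)
        worklist.extend(relocs.get(sec, ()))
    return live


relocs = {
    ".text.main": [".text.helper", ".data.table"],
    ".text.helper": [],
    ".text.unused": [".data.table"],
    ".data.table": [],
}
kept = live_sections(relocs, [".text.main"])
```

Here `.text.unused` is never reached from `.text.main`, so it is absent from
`kept` and can be discarded. Reordering functions or data is then a matter
of sorting the surviving sections.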

>
> For darwin, ELF-based .o files are not interesting.  It won’t be faster,
> and it will take a bunch of effort to figure out how to encode all the
> mach-o info into ELF.  We’d rather wait for a new native format.
>

> -Nick
>
>