[llvm-dev] [LLVM-DEV][LLD] RFC Range Thunks Implementation review for ARM and Mips

Thu Apr 6 09:44:42 PDT 2017

Just FYI: A quick experiment that got as far as creating an
OutputSectionCmd for each OutputSection when doing a link without a
linker script exposed an interesting performance problem with the
many-sections.s test.

To reproduce just add a linker script to the link in the test that
will force the creation of a large number of orphan sections, for
example:
// RUN: echo "SECTIONS { \
// RUN:       . = SIZEOF_HEADERS; \
// RUN:       .text : { *(.text) } } \
// RUN: " > %t.script
// RUN: ld.lld %t --script %t.script -o %t2
// RUN: llvm-readobj -t %t2 | FileCheck --check-prefix=LINKED %s

With this script the test takes over 60s to run on my machine. I think the
culprit is Script->writeDataBytes(Name, Buf); in
OutputSection::writeTo(), which searches for the OutputSection by name.
With a huge number of sections this is going to take a long time. I'm
not sure if many-sections.s with a linker script is a representative
test case for lld as it stands, but if we do go down the route of
fabricating a linker script command for each output section we'll need
a better mapping from OutputSection to OutputSection command
than a linear search by name.
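
As a rough illustration of the difference, here is a minimal sketch. The
types and function names are hypothetical stand-ins, not lld's real
classes, and it assumes one command is fabricated per output section in
the same order. The linear search costs O(N) per lookup, so O(N^2) when
done once per section; the pointer-keyed map is built once and then each
lookup is O(1) on average.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration only; NOT lld's real classes.
struct OutputSection {
  std::string Name;
};
struct OutputSectionCommand {
  std::string Name;
  int Id;
};

// Linear search by name: O(N) per lookup, O(N^2) if done for every section.
const OutputSectionCommand *
findByName(const std::vector<OutputSectionCommand> &Cmds,
           const std::string &Name) {
  for (const OutputSectionCommand &C : Cmds)
    if (C.Name == Name)
      return &C;
  return nullptr;
}

using CmdMap =
    std::unordered_map<const OutputSection *, const OutputSectionCommand *>;

// Build the section -> command mapping once.
// Assumption: Cmds[I] is the command fabricated for Secs[I].
CmdMap buildCommandMap(const std::vector<OutputSection> &Secs,
                       const std::vector<OutputSectionCommand> &Cmds) {
  assert(Secs.size() == Cmds.size() && "expected one command per section");
  CmdMap M;
  M.reserve(Secs.size());
  for (std::size_t I = 0; I != Secs.size(); ++I)
    M[&Secs[I]] = &Cmds[I];
  return M;
}
```

Keying on the section pointer rather than the name also sidesteps any
question of what happens when two output sections share a name. The map
holds pointers into both vectors, so they must not be resized after it is
built.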

Peter

On 6 April 2017 at 12:01, Peter Smith <peter.smith at linaro.org> wrote:
> My understanding is that this would be (initially) limited to
> fabricating enough linker script commands such that we could replace:
> fixSectionAlignments()
> assignAddresses()
> Script->processNonSectionCommands()
>
> With something like:
> Script->assignAddresses() // Could be done multiple times
> Script->processNonSectionCommands() // This should only be done once
>
> In theory all the other __start and __end symbols could still be kept
> separate if the linker script commands were created late, and in a
> compatible way. I also don't think that this means removing
> OutputSections::Sections just yet either?
>
> I don't think that we are proposing to follow the ld.bfd model of
> driving the default case via a built in linker script yet? I think
> that this would be considerably more work than just this limited
> change.
>
> I think the best way forward is to try and prototype something to see
> if it splashes out any special cases. I can give this a go to see what
> happens.
>
> In the meantime I would be grateful if there is any opportunity to
> move forward some of the range thunks changes in parallel, even if
> they do not initially work with some linker scripts.
>
> If the above change to always using Script->assignAddresses() did
> happen then createThunks() would become a little bit more complicated
> as it would need to step through one or more input section
> descriptions per OutputSection. Any Thunks created would still need to
> be added to both InputSectionDescriptions and OutputSections::Sections
> but we could just use push_back().
>
> Peter
>
> On 6 April 2017 at 00:24, Rui Ueyama <ruiu at google.com> wrote:
>> Are you suggesting moving other linker jobs, such as creating _end
>> symbols, to the linker script?
>>
>> The linker script support was implemented after we wrote the current Writer
>> class, so it is somewhat "plugged in" to the Writer. It might not be the
>> best design, and not many other options have been explored. So there might
>> be room to improve code by moving work loads from the Writer to the
>> LinkerScript. But we need to be careful not to hurt performance by doing that.
>>
>> On Wed, Apr 5, 2017 at 4:14 PM, Rafael Espíndola via llvm-dev
>> <llvm-dev at lists.llvm.org> wrote:
>>>
>>> > Proposed implementation for range extension thunks
>>> > At a high-level we need to solve the following problems:
>>> > - Assign addresses more than once
>>> > - Maintain state between successive calls of createThunks()
>>> > - Synchronization of the linker script and the OutputSection after
>>> > adding thunks
>>>
>>> This last part seems to be the messiest. The issue is not with the
>>> patch; it is with the existing infrastructure, which uses a completely
>>> different representation for linker scripts and non linker scripts.
>>>
>>> What I think is needed is for the writer to create a dummy "script"
>>> and use what is now LinkerScript::assignAddresses. That "script" would
>>>
>>> * Contain only OutputSectionCommands.
>>> * Have all string manipulation done before assignAddresses.
>>> * Have all orphan handling made explicit before
>>> assignAddresses.
>>> * Have just one InputSectionDescription in each OutputSectionCommand.
>>>
>>> With this, the thunk creation should be able to add thunks to a single
>>> location.
>>>
>>> Cheers,
>>> Rafael
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>