[llvm-dev] [LLD] Linker Relaxation
Rui Ueyama via llvm-dev
llvm-dev at lists.llvm.org
Tue Jul 11 12:44:24 PDT 2017
On Wed, Jul 12, 2017 at 4:37 AM, Bruce Hoult <bruce at hoult.org> wrote:
> Yes. The code is correct and runnable once the relocations have been done,
> without supporting shrinking.
>
> This RISC-V functionality is in mainline gnu binutils ld since 2.28 (March
> 2, 2017), so no one absolutely needs lld to support it immediately.
>
In that sense, no one absolutely needs lld for any architecture :) But we
provide better performance and better integration with LLVM than those.
On Tue, Jul 11, 2017 at 9:27 PM, Rui Ueyama via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> By the way, since this is an optional code relaxation, we can think about
>> it later. The first thing I would do is to add RISC-V support to lld
>> without code shrinking relaxations, which I believe is doable by at most a
>> few hundreds lines of code.
>>
>> On Wed, Jul 12, 2017 at 3:21 AM, Rui Ueyama <ruiu at google.com> wrote:
>>
>>> On Tue, Jul 11, 2017 at 9:14 PM, Bruce Hoult via llvm-dev <
>>> llvm-dev at lists.llvm.org> wrote:
>>>
>>>> Here's an example using the gcc toolchain for embedded 32 bit RISC-V
>>>> (my HiFive1 board):
>>>>
>>>> #include <stdio.h>
>>>>
>>>> int foo(int i){
>>>> if (i < 100){
>>>> printf("%d\n", i);
>>>> }
>>>> return i;
>>>> }
>>>>
>>>> int main(){
>>>> foo(10);
>>>> return 0;
>>>> }
>>>>
>>>> After compiling to a .o with -O2 -march=RV32IC we get (just looking at
>>>> foo)
>>>>
>>>> 00000000 <foo>:
>>>> 0: 1141 addi sp,sp,-16
>>>> 2: c422 sw s0,8(sp)
>>>> 4: c606 sw ra,12(sp)
>>>> 6: 06300793 li a5,99
>>>> a: 842a mv s0,a0
>>>> c: 00a7cb63 blt a5,a0,22 <.L2>
>>>> 10: 85aa mv a1,a0
>>>> 12: 00000537 lui a0,0x0
>>>> 16: 00050513 mv a0,a0
>>>> 1a: 00000317 auipc t1,0x0
>>>> 1e: 000300e7 jalr t1
>>>>
>>>> 00000022 <.L2>:
>>>> 22: 40b2 lw ra,12(sp)
>>>> 24: 8522 mv a0,s0
>>>> 26: 4422 lw s0,8(sp)
>>>> 28: 0141 addi sp,sp,16
>>>> 2a: 8082 ret
>>>>
>>>> And after linking:
>>>>
>>>> 00010164 <foo>:
>>>> 10164: 1141 addi sp,sp,-16
>>>> 10166: c422 sw s0,8(sp)
>>>> 10168: c606 sw ra,12(sp)
>>>> 1016a: 06300793 li a5,99
>>>> 1016e: 842a mv s0,a0
>>>> 10170: 00a7c863 blt a5,a0,10180 <foo+0x1c>
>>>> 10174: 85aa mv a1,a0
>>>> 10176: 0001a537 lui a0,0x1a
>>>> 1017a: 6a050513 addi a0,a0,1696 # 1a6a0
>>>> <__clz_tab+0x100>
>>>> 1017e: 2a69 jal 10318 <printf>
>>>> 10180: 40b2 lw ra,12(sp)
>>>> 10182: 8522 mv a0,s0
>>>> 10184: 4422 lw s0,8(sp)
>>>> 10186: 0141 addi sp,sp,16
>>>> 10188: 8082 ret
>>>>
>>>> The linker has done quite a lot!
>>>>
>>>> 1) the format string address generation has had the LUI (Load Upper
>>>> Immediate)
>>>> changed from 0x0 to 0x1a (the literal is in flash memory). If it had
>>>> stayed at
>>>> 0x0 it would have been removed by the linker. The mv a0,a0 (which is
>>>> really
>>>> addi a0,a0,#0) has had the real immediate filled in.
>>>>
>>>> 2) the call of printf had the general call-anywhere-in-the-address-space
>>>> auipc
>>>> (Add Upper Immediate to PC); jalr (Jump And Link to address in Register
>>>> (plus
>>>> offset)) sequence replaced by a simple jal (Jump And Link, with PC +/-
>>>> 1 MB range)
>>>>
>>>> 3) as the jal offset was in fact less than +/- 2 KB, the 32 bit jal was
>>>> replaced by a
>>>> 16 bit jal instruction.
>>>>
>>>> 4) the conditional branch has been shortened from 18 bytes to 12 bytes
>>>> due to
>>>> the other changes.
>>>>
>>>
>>> Thanks, Bruce. This is a very interesting optimization.
>>>
>>> lld doesn't currently have code to support that kind of code shrinking
>>> optimization, but we can definitely add it. It seems that essentially we
>>> need to iterate over all relocations while rewriting instructions until a
>>> convergence is obtained. But the point is how to do do it efficiently --
>>> link speed really matters. I can't come up with an algorithm to parallelize
>>> it. Do you have any idea?
>>>
>>> In order to shrink instructions, all address references must be
>>> explicitly represented as relocations even if they are in the same section.
>>> I think that means object files for RISC-V have many more relocations than
>>> the other architectures. Is this correct?
>>>
>>>
>>>>
>>>
>>>> On Tue, Jul 11, 2017 at 1:59 PM, Peter Smith via llvm-dev <
>>>> llvm-dev at lists.llvm.org> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> To the best of my knowledge I think the closest analogue is something
>>>>> like the Synthetic EHFrame and MergeInputSections, not strictly code
>>>>> relaxation, but these do involve changes in size of sections.
>>>>>
>>>>> Can I ask you a quick question? In many architectures not all
>>>>> pc-relative offsets are exposed to the linker as relocations so it
>>>>> isn't safe to change the sizes of sections in arbitrary places [*];
>>>>> does the RiscV ABI have restrictions on code-generation to allow a
>>>>> linker to reduce the code-size of a code-sequence within a Section? If
>>>>> there are constraints on the relaxations it might help us make a
>>>>> suggestion.
>>>>>
>>>>> I'm assuming that if you are doing some kind of range based relaxation
>>>>> you'll need something like range extension thunks (I'm working on
>>>>> these right now) this means you'll probably have to do your
>>>>> calculations on what you can relax in finalizeSections() at a similar
>>>>> point to createThunks(), or perhaps the mechanisms would need to be
>>>>> merged as I think they'll need to converge on a point when no more
>>>>> relaxations are possible and no more thunks can be added.
>>>>>
>>>>> Writing out the relaxed sections will be interesting as you won't want
>>>>> all of the InputSectionContents. I suggest looking at EHFrame and
>>>>> MergeInputSections for ideas.
>>>>>
>>>>> Hope that is of some use
>>>>>
>>>>> Peter
>>>>>
>>>>> [*] For example in pseudo ARM
>>>>>
>>>>> ldr r0, [pc, offset] ; where pc + offset == label
>>>>> ...
>>>>> relaxable sequence such as an indirect jump via a register
>>>>> ...
>>>>> label: .word foo
>>>>>
>>>>> If the compiler/assembler has pre-computed the offset to label then
>>>>> changing the size of the relaxable sequence without also updating the
>>>>> offset will break the program.
>>>>>
>>>>>
>>>>>
>>>>> On 11 July 2017 at 11:09, PkmX via llvm-dev <llvm-dev at lists.llvm.org>
>>>>> wrote:
>>>>> > Hi,
>>>>> >
>>>>> > Does lld support linker relaxation that may shrink code size? As far
>>>>> > as I see lld seems to assume that the content of input sections to be
>>>>> > fixed other than patching up relocations, but I believe some targets
>>>>> > may benefit the extra optimization opportunity with relaxation.
>>>>> > Specifically, I'm currently working on adding support for RISC-V in
>>>>> > lld, and RISC-V heavily relies on linker relaxation to remove
>>>>> > extraneous code and to handle alignment. Since linker relaxation may
>>>>> > be of interest to other targets as well, I'm wondering what would be
>>>>> a
>>>>> > good way to modify lld to support that. Thanks.
>>>>> >
>>>>> > --
>>>>> > Chih-Mao Chen (PkmX)
>>>>> > Software R&D, Andes Technology
>>>>> > _______________________________________________
>>>>> > LLVM Developers mailing list
>>>>> > llvm-dev at lists.llvm.org
>>>>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>
>>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170712/74545d6a/attachment.html>
More information about the llvm-dev
mailing list