<div dir="ltr">Here's an example using the gcc toolchain for embedded 32-bit RISC-V (my HiFive1 board):<div><br></div><div><div>#include &lt;stdio.h&gt;</div><div><br></div><div>int foo(int i){</div><div> if (i &lt; 100){</div><div> printf("%d\n", i);</div><div> }</div><div> return i;</div><div>}</div><div><br></div><div>int main(){</div><div> foo(10);</div><div> return 0;</div><div>}</div></div><div><br></div><div>After compiling to a .o with -O2 -march=RV32IC we get (just looking at foo):</div><div><br></div><div><div>00000000 &lt;foo&gt;:</div><div> 0:<span style="white-space:pre"> </span>1141 <span style="white-space:pre"> </span>addi<span style="white-space:pre"> </span>sp,sp,-16</div><div> 2:<span style="white-space:pre"> </span>c422 <span style="white-space:pre"> </span>sw<span style="white-space:pre"> </span>s0,8(sp)</div><div> 4:<span style="white-space:pre"> </span>c606 <span style="white-space:pre"> </span>sw<span style="white-space:pre"> </span>ra,12(sp)</div><div> 6:<span style="white-space:pre"> </span>06300793 <span style="white-space:pre"> </span>li<span style="white-space:pre"> </span>a5,99</div><div> a:<span style="white-space:pre"> </span>842a <span style="white-space:pre"> </span>mv<span style="white-space:pre"> </span>s0,a0</div><div> c:<span style="white-space:pre"> </span>00a7cb63 <span style="white-space:pre"> </span>blt<span style="white-space:pre"> </span>a5,a0,22 &lt;.L2&gt;</div><div> 10:<span style="white-space:pre"> </span>85aa <span style="white-space:pre"> </span>mv<span style="white-space:pre"> </span>a1,a0</div><div> 12:<span style="white-space:pre"> </span>00000537 <span style="white-space:pre"> </span>lui<span style="white-space:pre"> </span>a0,0x0</div><div> 16:<span style="white-space:pre"> </span>00050513 <span style="white-space:pre"> </span>mv<span style="white-space:pre"> </span>a0,a0</div><div> 1a:<span style="white-space:pre"> </span>00000317 <span style="white-space:pre"> </span>auipc<span style="white-space:pre"> </span>t1,0x0</div><div> 
1e:<span style="white-space:pre"> </span>000300e7 <span style="white-space:pre"> </span>jalr<span style="white-space:pre"> </span>t1</div><div><br></div><div>00000022 &lt;.L2&gt;:</div><div> 22:<span style="white-space:pre"> </span>40b2 <span style="white-space:pre"> </span>lw<span style="white-space:pre"> </span>ra,12(sp)</div><div> 24:<span style="white-space:pre"> </span>8522 <span style="white-space:pre"> </span>mv<span style="white-space:pre"> </span>a0,s0</div><div> 26:<span style="white-space:pre"> </span>4422 <span style="white-space:pre"> </span>lw<span style="white-space:pre"> </span>s0,8(sp)</div><div> 28:<span style="white-space:pre"> </span>0141 <span style="white-space:pre"> </span>addi<span style="white-space:pre"> </span>sp,sp,16</div><div> 2a:<span style="white-space:pre"> </span>8082 <span style="white-space:pre"> </span>ret</div></div><div><br></div><div>And after linking:</div><div><br></div><div><div>00010164 &lt;foo&gt;:</div><div> 10164: 1141 addi sp,sp,-16</div><div> 10166: c422 sw s0,8(sp)</div><div> 10168: c606 sw ra,12(sp)</div><div> 1016a: 06300793 li a5,99</div><div> 1016e: 842a mv s0,a0</div><div> 10170: 00a7c863 blt a5,a0,10180 &lt;foo+0x1c&gt;</div><div> 10174: 85aa mv a1,a0</div><div> 10176: 0001a537 lui a0,0x1a</div><div> 1017a: 6a050513 addi a0,a0,1696 # 1a6a0 &lt;__clz_tab+0x100&gt;</div><div> 1017e: 2a69 jal 10318 &lt;printf&gt;</div><div> 10180: 40b2 lw ra,12(sp)</div><div> 10182: 8522 mv a0,s0</div><div> 10184: 4422 lw s0,8(sp)</div><div> 10186: 0141 addi sp,sp,16</div><div> 10188: 8082 ret</div></div><div><br></div><div>The linker has done quite a lot!</div><div><br></div><div>1) The LUI (Load Upper Immediate) that generates the upper bits of the<br></div><div>format string address has had its immediate changed from 0x0 to 0x1a (the literal</div><div>is in flash memory). If it had stayed at 0x0 the linker would have removed the LUI. 
The mv a0,a0 (which is really</div><div>addi a0,a0,0) has had the real immediate (1696) filled in.</div><div><br></div><div>2) The call to printf had the general call-anywhere-in-the-address-space sequence of</div><div>auipc (Add Upper Immediate to PC) and jalr (Jump And Link to address in Register, plus</div><div>offset) replaced by a single jal (Jump And Link, with a range of PC +/- 1 MB).</div><div><br></div><div>3) As the jal offset was in fact within +/- 2 KB, the 32-bit jal was replaced by a</div><div>16-bit jal instruction (c.jal from the C extension).</div><div><br></div><div>4) The conditional branch over the call now skips 12 bytes instead of 18 (its encoded</div><div>offset drops from 22 to 16) because of the other changes.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jul 11, 2017 at 1:59 PM, Peter Smith via llvm-dev <span dir="ltr">&lt;<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hello,<br>
<br>
To the best of my knowledge, the closest analogues in lld are the<br>
Synthetic EHFrame and MergeInputSections. These are not strictly code<br>
relaxation, but they do involve changes in the size of sections.<br>
<br>
Can I ask you a quick question? In many architectures not all<br>
pc-relative offsets are exposed to the linker as relocations so it<br>
isn't safe to change the sizes of sections in arbitrary places [*];<br>
does the RISC-V ABI have restrictions on code-generation to allow a<br>
linker to reduce the code-size of a code-sequence within a Section? If<br>
there are constraints on the relaxations it might help us make a<br>
suggestion.<br>
<br>
I'm assuming that if you are doing some kind of range-based relaxation<br>
you'll need something like range extension thunks (I'm working on<br>
these right now). This means you'll probably have to do your<br>
calculations on what you can relax in finalizeSections(), at a similar<br>
point to createThunks(). Alternatively, the mechanisms may need to be<br>
merged, as I think they'll need to converge on a point where no more<br>
relaxations are possible and no more thunks can be added.<br>
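
The convergence requirement can be illustrated with a minimal sketch in C. Everything here is a hypothetical toy model, not lld code: `CallSite`, `relax_to_fixed_point` and the size constants are invented for illustration, and the two ranges are the c.jal and jal ranges quoted in the reply at the top of this message. The model only ever shrinks a call, which is what guarantees termination; it does not model thunks, which grow the output and would need extra care.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encodings of a call: auipc+jalr, jal, 16-bit c.jal. */
enum { CALL_LONG = 8, CALL_JAL = 4, CALL_CJAL = 2 };

/* Toy model: call sites are laid out back to back, each followed by
 * `pad` bytes of code that the linker cannot change. */
typedef struct {
    size_t target; /* index of the call site this call jumps to */
    int size;      /* current size of the call sequence in bytes */
    int pad;       /* fixed-size code following the call */
} CallSite;

/* Address of call site i under the current sizes. */
static int64_t addr_of(const CallSite *s, size_t i) {
    int64_t a = 0;
    for (size_t k = 0; k < i; k++)
        a += s[k].size + s[k].pad;
    return a;
}

/* Smallest encoding whose range covers a pc-relative offset. */
static int smallest_size(int64_t off) {
    if (off >= -2048 && off <= 2046)
        return CALL_CJAL;                 /* +/- 2 KB */
    if (off >= -1048576 && off <= 1048574)
        return CALL_JAL;                  /* +/- 1 MB */
    return CALL_LONG;
}

/* Repeat passes until one makes no change; returns the pass count.
 * Sizes only decrease, so the loop must terminate. */
int relax_to_fixed_point(CallSite *s, size_t n) {
    int passes = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        passes++;
        for (size_t i = 0; i < n; i++) {
            int64_t off = addr_of(s, s[i].target) - addr_of(s, i);
            int best = smallest_size(off);
            if (best < s[i].size) {
                s[i].size = best;
                changed = true;
            }
        }
    }
    return passes;
}
```

A single pass is not enough in general: shrinking one call can pull another call's target into a shorter range, so that call only becomes relaxable on the next pass.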
<br>
Writing out the relaxed sections will be interesting as you won't want<br>
all of the InputSectionContents. I suggest looking at EHFrame and<br>
MergeInputSections for ideas.<br>
<br>
Hope that is of some use<br>
<br>
Peter<br>
<br>
[*] For example, in pseudo ARM assembly:<br>
<br>
ldr r0, [pc, offset] ; where pc + offset == label<br>
...<br>
relaxable sequence such as an indirect jump via a register<br>
...<br>
label: .word foo<br>
<br>
If the compiler/assembler has pre-computed the offset to label then<br>
changing the size of the relaxable sequence without also updating the<br>
offset will break the program.<br>
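
The hazard in this footnote can be sketched as a small C model. The names (`PcRelLoad`, `resolved_address`, `label_after_shrink`, `fix_up`) are hypothetical, addresses are plain integers, and the sketch ignores the fact that on real ARM the pc reads ahead of the current instruction.

```c
#include <stdint.h>

/* A load whose pc-relative offset was fixed by the assembler,
 * with no relocation left for the linker to see. */
typedef struct {
    int64_t load_pc; /* address of the ldr instruction */
    int64_t offset;  /* precomputed: label - load_pc */
} PcRelLoad;

/* Address the load will actually read from. */
int64_t resolved_address(const PcRelLoad *l) {
    return l->load_pc + l->offset;
}

/* New address of the label after the linker removes `bytes_removed`
 * bytes from code located between the load and the label. */
int64_t label_after_shrink(int64_t label, int64_t bytes_removed) {
    return label - bytes_removed;
}

/* What the linker would have to do if the offset were exposed
 * as a relocation: recompute it against the moved label. */
void fix_up(PcRelLoad *l, int64_t new_label) {
    l->offset = new_label - l->load_pc;
}
```

With a load at 0x100 and the label at 0x120, removing 4 bytes in between moves the label to 0x11c, but the stale offset still resolves to 0x120 until something like `fix_up` is applied.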
<div class="HOEnZb"><div class="h5"><br>
<br>
<br>
On 11 July 2017 at 11:09, PkmX via llvm-dev &lt;<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>&gt; wrote:<br>
> Hi,<br>
><br>
> Does lld support linker relaxation that may shrink code size? As far<br>
> as I can see, lld assumes that the content of input sections is<br>
> fixed other than patching up relocations, but I believe some targets<br>
> may benefit from the extra optimization opportunities that relaxation offers.<br>
> Specifically, I'm currently working on adding support for RISC-V in<br>
> lld, and RISC-V heavily relies on linker relaxation to remove<br>
> extraneous code and to handle alignment. Since linker relaxation may<br>
> be of interest to other targets as well, I'm wondering what would be a<br>
> good way to modify lld to support that. Thanks.<br>
><br>
> --<br>
> Chih-Mao Chen (PkmX)<br>
> Software R&D, Andes Technology<br>
> ______________________________<wbr>_________________<br>
> LLVM Developers mailing list<br>
> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>
</div></div></blockquote></div><br></div>
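
The branch distances in the two foo listings at the top of this message can be checked directly from the instruction words. The sketch below decodes the B-type immediate of the two blt encodings (0x00a7cb63 before linking, 0x00a7c863 after); the function names are hypothetical, and the bit layout is the standard RISC-V B-type format.

```c
#include <stdint.h>

/* Decode the signed pc-relative offset of a 32-bit RISC-V B-type
 * (conditional branch) instruction. The immediate is scattered:
 * imm[12] = bit 31, imm[10:5] = bits 30:25,
 * imm[4:1] = bits 11:8, imm[11] = bit 7; imm[0] is always 0. */
int32_t b_type_offset(uint32_t insn) {
    uint32_t imm = (((insn >> 31) & 0x1u) << 12)
                 | (((insn >> 7) & 0x1u) << 11)
                 | (((insn >> 25) & 0x3fu) << 5)
                 | (((insn >> 8) & 0xfu) << 1);
    /* Sign-extend from 13 bits. */
    return (int32_t)(imm ^ 0x1000u) - 0x1000;
}

/* Bytes of code jumped over by a forward branch: the offset is
 * relative to the branch itself, which is 4 bytes long. */
int32_t bytes_skipped(uint32_t insn) {
    return b_type_offset(insn) - 4;
}
```

For the object file the decoded offset is 22 (0xc to 0x22) and for the linked binary it is 16 (0x10170 to 0x10180), so the code skipped after the branch shrinks from 18 to 12 bytes.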