<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Oct 26, 2016 at 9:41 AM, George Rimar <span dir="ltr"><<a href="mailto:grimar@accesssoftek.com" target="_blank">grimar@accesssoftek.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">grimar added a comment.<br>
<br>
I did a lot of testing over the last few days and unfortunately I still can't find a proper way to generate a "good" order list of sections<br>
to demonstrate a positive result.<br>
But I can show that the ordering of sections definitely matters.<br>
<br>
When I run an lld-linked clang under perf, with this patch applied and an empty order file:<br>
perf stat ./clang-4.0 -help<br>
<br>
Performance counter stats for './clang-4.0 -help':<br>
60.445699 task-clock (msec) # 0.860 CPUs utilized<br>
0 context-switches # 0.000 K/sec<br>
0 cpu-migrations # 0.000 K/sec<br>
889 page-faults # 0.015 M/sec<br>
...<br>
<br>
Now if I change the ordering function to return a random value:<br>
<br>
int elf::getSectionFileOrder(StringRef S) {<br>
return rand() % INT32_MAX;<br>
}<br>
<br>
<br>
<br>
I have:<br>
Performance counter stats for './clang-4.0 -help':<br>
26.831371 task-clock (msec) # 0.742 CPUs utilized<br>
2 context-switches # 0.075 K/sec<br>
0 cpu-migrations # 0.000 K/sec<br>
1,963 page-faults # 0.073 M/sec<br>
....<br></blockquote><div><br></div><div>I may be missing something, but you probably cannot draw any conclusion from a program that completes in 30 milliseconds with 2k page faults. I think that workload is way too small; any positive or negative signal may well be buried in noise.</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
So I observed a major slowdown, just because of reordering sections. I think that can probably serve as proof that building<br>
a proper ordering file can, in theory, help to speed up application startup.<br>
<br>
Unfortunately I was unable to generate an ordering file that works better than an empty one or none at all. I tried the following method:<br>
<br>
1. Generated a list of page faults: perf trace -F all -o test.txt ./clang-4.0 -help<br>
2. Applied a filter: grep -E '(clang-4.0/57467 minfault)' test.txt. That produces a list of faults that looks like this:<br>
<br>
0.092 ( 0.000 ms): clang-4.0/57467 minfault [dl_main+0x54d] => /home/umb/LLVM/build_self/bin/<wbr>clang-4.0@0x40 (d.)<br>
0.097 ( 0.000 ms): clang-4.0/57467 minfault [dl_main+0x6b1] => /home/umb/LLVM/build_self/bin/<wbr>clang-4.0@0x881df50 (d.)<br>
0.102 ( 0.000 ms): clang-4.0/57467 minfault [_dl_setup_hash+0x10] => /home/umb/LLVM/build_self/bin/<wbr>clang-4.0@0x23455a8 (d.)<br>
0.120 ( 0.000 ms): clang-4.0/57467 minfault [strlen+0x26] => /home/umb/LLVM/build_self/bin/<wbr>clang-4.0@0x2a6ebcf (d.)<br>
0.143 ( 0.000 ms): clang-4.0/57467 minfault [dl_main+0x1aab] => /home/umb/LLVM/build_self/bin/<wbr>clang-4.0@0x881e128 (d.)<br>
0.153 ( 0.000 ms): clang-4.0/57467 minfault [strchr+0x23] => /home/umb/LLVM/build_self/bin/<wbr>clang-4.0@0x23f5363 (d.)<br>
<br>
3. Generated a list of clang binary symbols: readelf -W -s clang-4.0 > symbols.txt<br>
4. Using a self-written tool, generated an order list of sections (it takes the offset of each page fault, finds the matching symbol name and just attaches<br>
prefixes like ".text." etc). I posted the result just for reference: <a href="https://justpaste.it/zrbs" rel="noreferrer" target="_blank">https://justpaste.it/zrbs</a><br>
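<br>
For illustration only, step 4 could be sketched roughly as below. This is not the actual tool: the Symbol struct, findSymbol and buildOrderList are hypothetical names, only the ".text." prefix is handled, and the sketch assumes the fault offsets and the symbol values from readelf are already in the same address space.<br>

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical record for one symbol as listed by `readelf -W -s`.
struct Symbol {
  uint64_t Addr;    // symbol value
  uint64_t Size;    // symbol size in bytes
  std::string Name; // symbol name
};

// Returns the symbol whose [Addr, Addr + Size) range contains Offset,
// or nullptr if no symbol covers it. Syms must be sorted by Addr.
const Symbol *findSymbol(const std::vector<Symbol> &Syms, uint64_t Offset) {
  auto It = std::upper_bound(
      Syms.begin(), Syms.end(), Offset,
      [](uint64_t Off, const Symbol &S) { return Off < S.Addr; });
  if (It == Syms.begin())
    return nullptr;
  --It;
  return Offset - It->Addr < It->Size ? &*It : nullptr;
}

// Builds the order list: one ".text.<symbol>" entry per faulting symbol,
// in order of first fault. Offsets that hit no symbol are skipped.
std::vector<std::string> buildOrderList(std::vector<Symbol> Syms,
                                        const std::vector<uint64_t> &Faults) {
  std::sort(Syms.begin(), Syms.end(),
            [](const Symbol &A, const Symbol &B) { return A.Addr < B.Addr; });
  std::vector<std::string> Order;
  std::unordered_set<std::string> Seen;
  for (uint64_t Off : Faults)
    if (const Symbol *S = findSymbol(Syms, Off))
      if (Seen.insert(S->Name).second)
        Order.push_back(".text." + S->Name);
  return Order;
}
```

Emitting each section only once, at its first fault, keeps the list in the order the pages were first touched during startup.<br>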
<br>
I checked that the sorting really works. But the result was worse than without an ordering file. That was partially because<br>
getSectionFileOrder() returns 0 by default, which I think is wrong, so I made it return INT32_MAX (to place unlisted sections after listed ones).<br>
After that change there is no difference with and without the ordering file. Even the page faults look to be almost the same.<br>
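<br>
A minimal sketch of why the default matters, assuming a simple name-to-position map in place of the real order file handling. OrderMap, makeOrderMap, sectionOrder and sortSections are hypothetical names, not lld's code. With INT32_MAX as the fallback, a stable sort puts listed sections first and leaves unlisted ones after them in their original relative order; with 0 as the fallback, unlisted sections would sort ahead of almost everything listed.<br>

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the order file: section name -> position.
using OrderMap = std::unordered_map<std::string, int>;

OrderMap makeOrderMap(const std::vector<std::string> &Listed) {
  OrderMap M;
  for (std::size_t I = 0; I < Listed.size(); ++I)
    M.emplace(Listed[I], static_cast<int>(I)); // first occurrence wins
  return M;
}

// Listed sections get their position; unlisted ones get INT32_MAX so
// that they sort after every listed section.
int sectionOrder(const OrderMap &M, const std::string &Name) {
  auto It = M.find(Name);
  return It == M.end() ? INT32_MAX : It->second;
}

// Stable sort keeps the original relative order of unlisted sections.
void sortSections(std::vector<std::string> &Sections, const OrderMap &M) {
  std::stable_sort(Sections.begin(), Sections.end(),
                   [&](const std::string &A, const std::string &B) {
                     return sectionOrder(M, A) < sectionOrder(M, B);
                   });
}
```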
<br>
<br>
<a href="https://reviews.llvm.org/D25766" rel="noreferrer" target="_blank">https://reviews.llvm.org/D2576<wbr>6</a><br>
<br>
<br>
<br>
</blockquote></div><br></div></div>