[PATCH] D25766: [ELF] - Implemented --section-ordering-file option.
George Rimar via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 26 09:41:03 PDT 2016
grimar added a comment.
I did a lot of testing over the last few days and unfortunately I still can't find a proper way to generate a "good" order list of sections
to demonstrate a positive result.
But I can prove that the ordering of sections definitely matters.
When I run an lld-linked clang under perf, with this patch applied and an empty order file:
perf stat ./clang-4.0 -help
Performance counter stats for './clang-4.0 -help':
60.445699 task-clock (msec) # 0.860 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
889 page-faults # 0.015 M/sec
...
Now if I change the ordering function to return a random value:
int elf::getSectionFileOrder(StringRef S) {
return rand() % INT32_MAX;
}
I have:
Performance counter stats for './clang-4.0 -help':
26.831371 task-clock (msec) # 0.742 CPUs utilized
2 context-switches # 0.075 K/sec
0 cpu-migrations # 0.000 K/sec
1,963 page-faults # 0.073 M/sec
....
So I observed a major degradation in page faults (889 vs. 1,963), just because of reordering sections. I think that can probably serve as proof that building
a proper ordering file can, in theory, help to speed up application startup.
Unfortunately I was unable to generate an ordering file that works better than an empty one or none at all. I tried the following method:
1. Generated a list of page faults: perf trace -F all -o test.txt ./clang-4.0 -help
2. Applied a filter: grep -E '(clang-4.0/57467 minfault)' test.txt. That produces a list of faults in the following form:
0.092 ( 0.000 ms): clang-4.0/57467 minfault [dl_main+0x54d] => /home/umb/LLVM/build_self/bin/clang-4.0 at 0x40 (d.)
0.097 ( 0.000 ms): clang-4.0/57467 minfault [dl_main+0x6b1] => /home/umb/LLVM/build_self/bin/clang-4.0 at 0x881df50 (d.)
0.102 ( 0.000 ms): clang-4.0/57467 minfault [_dl_setup_hash+0x10] => /home/umb/LLVM/build_self/bin/clang-4.0 at 0x23455a8 (d.)
0.120 ( 0.000 ms): clang-4.0/57467 minfault [strlen+0x26] => /home/umb/LLVM/build_self/bin/clang-4.0 at 0x2a6ebcf (d.)
0.143 ( 0.000 ms): clang-4.0/57467 minfault [dl_main+0x1aab] => /home/umb/LLVM/build_self/bin/clang-4.0 at 0x881e128 (d.)
0.153 ( 0.000 ms): clang-4.0/57467 minfault [strchr+0x23] => /home/umb/LLVM/build_self/bin/clang-4.0 at 0x23f5363 (d.)
3. Generated a list of clang binary symbols: readelf -W -s clang-4.0 > symbols.txt
4. Using a self-written tool, generated an order list of sections (it takes the offset of each page fault, finds the matching symbol name and just attaches
prefixes like ".text." etc). I posted the result just for reference: https://justpaste.it/zrbs
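For reference, steps 2-4 could be sketched roughly as below. This is not the actual tool used for the experiment, only an illustration of the idea. The names Symbol, parseFaultAddress and buildOrderList are hypothetical, the symbol table is assumed to be already parsed out of the readelf output, and the sketch assumes the fault address can be compared directly against the symbol value (in practice it may first need translating between file offset and virtual address).

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical record for one row of `readelf -W -s` output.
struct Symbol {
  uint64_t Addr;    // symbol value
  uint64_t Size;    // symbol size in bytes
  std::string Name; // symbol name
  bool IsFunc;      // true for FUNC symbols, false for data objects
};

// Extracts the hex address after the last " at 0x" in a filtered
// perf trace line. Returns false if the line has no such field.
bool parseFaultAddress(const std::string &Line, uint64_t &Addr) {
  size_t Pos = Line.rfind(" at 0x");
  if (Pos == std::string::npos)
    return false;
  // Skip " at " and let stoull consume the "0x..." token.
  Addr = std::stoull(Line.substr(Pos + 4), nullptr, 16);
  return true;
}

// Maps each fault address to the symbol covering it and emits section
// names in first-fault order, without duplicates. The ".text."/".data."
// prefixes are a simplification of "prefixes like .text. etc".
std::vector<std::string> buildOrderList(const std::vector<Symbol> &Syms,
                                        const std::vector<uint64_t> &Faults) {
  std::map<uint64_t, const Symbol *> ByAddr;
  for (const Symbol &S : Syms)
    if (S.Size != 0)
      ByAddr.emplace(S.Addr, &S); // keeps the first symbol at an address

  std::vector<std::string> Order;
  std::set<std::string> Seen;
  for (uint64_t A : Faults) {
    auto It = ByAddr.upper_bound(A); // first symbol starting after A
    if (It == ByAddr.begin())
      continue; // address is below every symbol
    --It;
    const Symbol &S = *It->second;
    if (A >= S.Addr + S.Size)
      continue; // address falls in a gap between symbols
    std::string Sec = std::string(S.IsFunc ? ".text." : ".data.") + S.Name;
    if (Seen.insert(Sec).second)
      Order.push_back(Sec);
  }
  return Order;
}
```

Writing the returned names one per line would give an order file of the kind linked above. Faults that do not land inside any sized symbol are simply skipped in this sketch.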
I checked that the sorting really works. But the result was worse than without an ordering file. Partially because
getSectionFileOrder() returns 0 by default, which I think is wrong, so I made it return INT32_MAX (to place unlisted sections after listed ones).
After that change there is no difference between runs with and without an ordering file. Even the page faults look almost the same.
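To illustrate why the default matters, here is a minimal standalone sketch. It is not the code from the patch: the SectionOrder class and the sortSections helper are hypothetical, and std::string stands in for StringRef. With a default of 0 every unlisted section would tie with the first listed one and end up in front of the rest of the list; with INT32_MAX unlisted sections go last and a stable sort keeps their original relative order.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical holder for the parsed ordering file: section name -> rank.
class SectionOrder {
public:
  explicit SectionOrder(const std::vector<std::string> &Names) {
    int Rank = 0;
    for (const std::string &N : Names)
      if (Ranks.emplace(N, Rank).second) // ignore duplicate entries
        ++Rank;
  }

  // Returns the position of the section in the ordering file, or
  // INT32_MAX for unlisted sections so they sort after listed ones.
  int getSectionFileOrder(const std::string &Name) const {
    auto It = Ranks.find(Name);
    return It == Ranks.end() ? INT32_MAX : It->second;
  }

private:
  std::unordered_map<std::string, int> Ranks;
};

// Stable sort: listed sections first in file order, then unlisted
// sections in their original relative order.
void sortSections(std::vector<std::string> &Sections,
                  const SectionOrder &Order) {
  std::stable_sort(Sections.begin(), Sections.end(),
                   [&](const std::string &A, const std::string &B) {
                     return Order.getSectionFileOrder(A) <
                            Order.getSectionFileOrder(B);
                   });
}
```

With an empty ordering file every section gets INT32_MAX, so the stable sort leaves the layout unchanged.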
https://reviews.llvm.org/D25766