[PATCH] D60958: [PPC64] toc-indirect to toc-relative relaxation
Sean Fertile via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Apr 23 09:29:41 PDT 2019
sfertile added a comment.
Hi MaskRay.
Thanks for doing this. When you originally described this to me I didn't realize you intended to partition and sort the relocations in the sections other than .rela.toc. This clears up my question regarding the implementation. I'm still a little hesitant about this approach though. Did you profile link times between the 2 approaches? I understand the speedup in the number of accesses (you have outlined that very well in another comment), however that fails to consider both how small the number of relocations in .rela.toc is compared to all the relocations in the text section, and how infrequently we have an object file that actually enters the loop.
The main reason I wasn't so concerned over the n^2 lookup in the original patch is that we would hit that loop so infrequently. Even though it is technically n^2, in practice the object files gcc produces when it does put a constant in the TOC would typically lead to 1 or 2 extra array accesses rather than n extra array accesses. I've compiled a couple of projects to show what I mean.
Protobuf: 628 objects, 27 where there is a constant in the TOC. (4.3%)
Missing reloc count    Frequency
1                      19
2                      8
Postgres: 704 objects, 12 with a constant in the TOC. (1.7%)
Missing reloc count    Frequency
1                      10
2                      1
9                      1
FFmpeg: 681 objects, 28 with a constant in the TOC. (4.1%)
Missing reloc count    Frequency
1                      22
2                      4
3                      1
96                     1
LLVM: 3294 objects, 355 with a constant in the TOC. (10.8%)
Missing reloc count    Frequency
1                      180
2                      61
3                      48
4                      22
5                      13
6                      4
7                      5
8                      3
9                      2
10                     2
11                     3
12                     3
13                     2
19                     1
22                     1
28                     2
32                     1
40                     1
80                     1
Clearly there are a few objects where the number of missing relocations does start to get worryingly large (30/40/80/96), but 90%-95% of the files never hit the loop, and 70% of those that do need the loop will have at most 1 or 2 extra array accesses per lookup. Note that this is all compiled with gcc; when compiling with clang as the build compiler we end up with *no* objects falling through to the loop.
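To make the arithmetic behind those percentages easy to re-check, here is a small sketch that aggregates the four tables. The numbers are copied from the tables in this comment; the names `DATA` and `summarize` are only illustrative.

```python
# Per project: (total objects, {missing reloc count: number of objects}).
# All figures are copied from the tables in this comment.
DATA = {
    "Protobuf": (628, {1: 19, 2: 8}),
    "Postgres": (704, {1: 10, 2: 1, 9: 1}),
    "FFmpeg": (681, {1: 22, 2: 4, 3: 1, 96: 1}),
    "LLVM": (3294, {1: 180, 2: 61, 3: 48, 4: 22, 5: 13, 6: 4, 7: 5, 8: 3,
                    9: 2, 10: 2, 11: 3, 12: 3, 13: 2, 19: 1, 22: 1, 28: 2,
                    32: 1, 40: 1, 80: 1}),
}


def summarize(data):
    """Return (total objects, objects that reach the loop, of those with <= 2 missing relocs)."""
    total = sum(objects for objects, _ in data.values())
    in_loop = sum(sum(hist.values()) for _, hist in data.values())
    at_most_two = sum(freq
                      for _, hist in data.values()
                      for missing, freq in hist.items()
                      if missing <= 2)
    return total, in_loop, at_most_two


total, in_loop, at_most_two = summarize(DATA)
print(f"objects reaching the loop: {in_loop}/{total} = {in_loop / total:.1%}")
print(f"of those, at most 2 missing: {at_most_two}/{in_loop} = {at_most_two / in_loop:.1%}")
```

Across all four projects this gives 422 of 5307 objects reaching the loop (about 8%), and 305 of those 422 (about 72%) with at most 2 missing relocations.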
FWIW, I think this implementation is clean and understandable enough that we can switch to it, but I would like to know how this affects the link time of, say, LLVM, both when clang is the build compiler and when gcc is the build compiler, before deciding this is the best approach.
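For reference, the kind of probing lookup I have in mind when I talk about "extra array accesses" can be sketched as below. This is a minimal illustration, not the code from the patch: it assumes the .rela.toc relocations are sorted by r_offset, that TOC entries are 8 bytes, and that a constant in the TOC simply has no relocation. The name `lookup_toc_reloc` and the backwards probing direction are assumptions made for the sketch.

```python
from typing import List, Optional, Tuple

TOC_ENTRY_SIZE = 8  # assumed size of one TOC entry in bytes


def lookup_toc_reloc(r_offsets: List[int], offset: int) -> Tuple[Optional[int], int]:
    """Find the relocation for the TOC entry at `offset`.

    `r_offsets` holds the r_offset of each relocation, sorted ascending.
    Returns (index or None, number of extra array accesses after the first).
    """
    if not r_offsets:
        return None, 0
    # First guess: if every TOC entry had a relocation, entry k would sit at
    # index k. Clamp because missing relocations make the array shorter.
    index = min(offset // TOC_ENTRY_SIZE, len(r_offsets) - 1)
    extra = 0
    while True:
        if r_offsets[index] == offset:
            return index, extra
        # Missing relocations only shift later ones to lower indices, so once
        # we see a smaller r_offset (or run out of array) there is no match.
        if r_offsets[index] < offset or index == 0:
            return None, extra
        index -= 1
        extra += 1


# The entry at offset 16 is a constant, so it has no relocation.
relocs = [0, 8, 24, 32]
print(lookup_toc_reloc(relocs, 8))   # direct hit, no extra accesses
print(lookup_toc_reloc(relocs, 24))  # one missing reloc before it, one extra access
print(lookup_toc_reloc(relocs, 16))  # the constant itself: no relocation found
```

In this sketch the number of extra accesses for a lookup is bounded by the number of missing relocations that precede the entry, which is why the "missing reloc count" column above is the interesting number.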
Repository:
rLLD LLVM Linker
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D60958/new/
https://reviews.llvm.org/D60958