[PATCH] D60958: [PPC64] toc-indirect to toc-relative relaxation

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 23 20:42:19 PDT 2019


MaskRay added a comment.

In D60958#1475755 <https://reviews.llvm.org/D60958#1475755>, @sfertile wrote:

> Hi MaskRay.
>
> Thanks for doing this. When you originally described this to me, I didn't realize you intended to partition and sort the relocations in the sections other than .rela.toc. This clears up my question regarding the implementation. I'm still a little hesitant about this approach, though. Did you profile link times between the 2 approaches? I understand the speedup in the number of accesses (you have outlined that very well in another comment), but that fails to consider both how small the number of relocations in .rela.toc is compared to all the relocations in the text section, and how infrequently we have an object file that actually enters the loop.
>
> The main reason I wasn't so concerned about the n^2 lookup in the original patch is that we would hit that loop so infrequently. Even though it is technically n^2, in practice the typical object files gcc produces when it does put a constant in the TOC would lead to 1 or 2 extra array accesses rather than n extra array accesses. I've compiled a couple of projects to show what I mean.
>
>   Protobuf: 628 objects, 27 with a constant in the TOC (4.3%)
>   Missing reloc count    Frequency
>            1                 19
>            2                  8
>
>   Postgres: 704 objects, 12 with a constant in the TOC (1.7%)
>   Missing reloc count    Frequency
>            1                 10
>            2                  1
>            9                  1
>
>   FFmpeg: 681 objects, 28 with a constant in the TOC (4.1%)
>   Missing reloc count    Frequency
>            1                 22
>            2                  4
>            3                  1
>           96                  1
>
>   LLVM: 3294 objects, 355 with a constant in the TOC (10.8%)
>   Missing reloc count    Frequency
>            1                180
>            2                 61
>            3                 48
>            4                 22
>            5                 13
>            6                  4
>            7                  5
>            8                  3
>            9                  2
>           10                  2
>           11                  3
>           12                  3
>           13                  2
>           19                  1
>           22                  1
>           28                  2
>           32                  1
>           40                  1
>           80                  1
>
>
> Clearly there are a few objects where the number of missing relocations does start to get worryingly large (32/40/80/96), but 90%-95% of the files never hit the loop, and 70% of those that do need the loop will have at most 1 or 2 extra array accesses per lookup. Note that this is all compiled with gcc; when compiling with clang as the build compiler we end up with *no* objects falling through to the loop.
>
> FWIW, I think this implementation is clean and understandable enough that we can switch to it, but I would like to know how this affects the link time of, say, LLVM when clang is the build compiler and when gcc is the build compiler before deciding this is the best approach.


Hi Sean,

Thank you for the statistics and explanation!

After updating some PPC64 tests yesterday, I think I now have a better understanding of what these relaxations are. I didn't really understand the R_PPC64_ADDR64 relocations in `.toc` until now. I think your approach in D54720 <https://reviews.llvm.org/D54720> is superior. (That said, I tested an earlier version of this revision on numerous internal targets and didn't see any issues caused by it, though there are many unrelated long-standing issues.)

I now understand that `.rela.toc` consists exclusively of R_PPC64_ADDR64 relocations. What you mentioned before is that clang/llvm never emits constants into `.toc`, so the linear loop can't be a problem. For gcc, as your statistics show, it isn't a problem either, given the very small number of such cases.
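
For anyone following the thread, here is a rough sketch of the kind of lookup being discussed. It is not the code from D54720 or from this revision: the `Rela` struct and the `findTocReloc` name are made up for illustration, and it assumes 8-byte TOC entries and relocations sorted by offset. When every slot has a relocation, the entry for a given offset sits at index `offset / 8`. Each constant in the TOC before that offset is a "missing" relocation that shifts the entry one index lower, so the loop steps back once per missing relocation.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for an ELF64 RELA entry; not LLD's actual type.
struct Rela {
  uint64_t offset;  // offset of the TOC entry within .toc
  uint32_t symbol;  // symbol index
  int64_t addend;
};

// Assumes `relas` is sorted by offset and each TOC entry is 8 bytes wide.
// Returns the relocation that applies to the TOC entry at `offset`, or
// nullopt if that entry has no relocation (i.e. it holds a constant).
std::optional<Rela> findTocReloc(const std::vector<Rela> &relas,
                                 uint64_t offset) {
  if (relas.empty())
    return std::nullopt;

  // Fast path guess: with no constants in the TOC, index == offset / 8.
  // Clamp because constants make relas shorter than the TOC itself.
  size_t index = static_cast<size_t>(
      std::min<uint64_t>(offset / 8, relas.size() - 1));

  // Linear fallback: step back once per missing relocation.
  for (;;) {
    if (relas[index].offset == offset)
      return relas[index];
    if (relas[index].offset < offset || index == 0)
      return std::nullopt;  // walked past the slot: no relocation there
    --index;
  }
}
```

With the statistics quoted above, the fallback loop would run zero times for clang-built objects and typically once or twice for the gcc-built objects that have constants in the TOC.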

I switched to your approach, but tried inlining some functions (fixing a 32-bit compile error and an assertion failure when `Relas` is empty, etc.) and adjusted some code to make it (hopefully) easier to understand.


Repository:
  rLLD LLVM Linker

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D60958/new/

https://reviews.llvm.org/D60958




