[PATCH] D77694: [WIP][RISCV][ELF] Linker relaxation support

Tue Apr 7 18:32:40 PDT 2020

MaskRay added a comment.

Interestingly, I was thinking about the same thing on Saturday!  I wanted to
add `optimizeBasicBlockJumps()` in a proper place (D68065 <https://reviews.llvm.org/D68065>; basic block
sections; ok, they got impatient and committed it anyway).  The current place
is wrong for a thunk target (e.g. AArch64). I did not write code because I am
still unsure how to properly do linker relaxations.

Title: It should mention R_RISCV_ALIGN.

I believe the current framework can only handle the BFD counterpart of
relax_pass==2 (_bfd_riscv_relax_align). Other instruction rewriting may require
intertwined scanRelocations and finalizeAddressDependentContent. The relocation
scanning pass may have to be split, but I don't know where the boundary is.

It has occurred to me that splitting the relocation scanning pass may be good for

- copy relocations and canonical PLT entries
- non-preemptible IFUNC
- .plt.got (https://bugs.llvm.org/show_bug.cgi?id=32938)
- RISC-V linker relaxations

The above are basically limitations of a linker with one-pass relocation scanning. To have complete support we may have to bite the bullet. More bookkeeping may
be needed: InputSectionBase::relocations will not be sufficient, and
finalizeSynthetic() may be called more than once.

An incomplete list of passes we do after `scanRelocations`:

  forEachRelSec(scanRelocations<ELFT>);
  add symbols to in.symTab and partitions[*].dynSymTab
  removeUnusedSyntheticSections()
  sortSections()
  finalizeSynthetic(in.*)
  fixSectionAlignments()
  finalizeAddressDependentContent()
  finalizeSynthetic(in.symTab)
  finalizeSynthetic(in.ppc64LongBranchTarget) // conceptually it should be done after thunks are finalized

We need to move scanRelocations() as late as possible and move some passes (including finalizeSynthetic()) into finalizeAddressDependentContent(). These passes need to be refactored so that they work when called more than once.

finalizeAddressDependentContent() should be changed to run several rounds of iterations (one per relax_pass). The last round handles `R_RISCV_ALIGN`.
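
To make the "several rounds" idea concrete, here is a toy sketch, not lld code. `ToySection`, `assignAddresses`, `relaxSection` and `runRelaxPasses` are all made-up names, and treating pass 0 as "instruction rewriting" and pass 1 as the alignment round is an assumption for illustration only. Each pass is iterated until nothing changes, with addresses reassigned before every round.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for an input section; not lld's InputSection.
struct ToySection {
  uint64_t addr = 0;
  uint64_t size = 0;
  uint64_t align = 1;      // must be a power of two
  uint64_t shrinkable = 0; // bytes a rewriting pass may still delete
};

// Lay sections out back to back, honoring alignment. Returns the end address.
uint64_t assignAddresses(std::vector<ToySection> &secs) {
  uint64_t dot = 0;
  for (ToySection &s : secs) {
    dot = (dot + s.align - 1) & ~(s.align - 1);
    s.addr = dot;
    dot += s.size;
  }
  return dot;
}

// Hypothetical per-section hook. Pass 0 deletes up to 4 bytes per call to
// imitate instruction rewriting; any other pass is a no-op in this toy model.
bool relaxSection(ToySection &s, int pass) {
  if (pass != 0 || s.shrinkable == 0)
    return false;
  uint64_t n = s.shrinkable < 4 ? s.shrinkable : 4;
  s.size -= n;
  s.shrinkable -= n;
  return true;
}

// Run each pass to a fixed point before starting the next one.
// Returns the total number of rounds executed.
int runRelaxPasses(std::vector<ToySection> &secs,
                   const std::vector<int> &passes) {
  int rounds = 0;
  for (int pass : passes) {
    bool changed;
    do {
      assignAddresses(secs); // addresses may be stale after a shrink
      changed = false;
      for (ToySection &s : secs)
        changed |= relaxSection(s, pass);
      ++rounds;
    } while (changed);
  }
  assignAddresses(secs); // final layout after the last pass
  return rounds;
}
```

The point of putting the alignment round last is that it only sees sizes that no earlier pass will change again.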

A bit off-topic: I still have mixed feelings about RISC-V's (ab)use of linker relaxations. It is indeed a very convenient approach toward a good balance of code size/speed/convenience, but if we want to achieve more, a post-link optimization framework may be more suitable. I don't really have enough experience with link time optimization, but my understanding is that we currently use the term (especially in the LLVM context) for optimizations performed at the LLVM IR level. Optimizations on low-level machine representations are not categorized as LTO.

================
Comment at: lld/ELF/Arch/RISCV.cpp:551
+  }
+  isec->relocations.resize(dest - brel);
+
----------------
Use llvm::erase_if.
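
A rough illustration of the suggestion, as a sketch and not the patch's code: `ToyReloc` and its `deleted` flag are hypothetical. `llvm::erase_if` (from llvm/ADT/STLExtras.h) wraps the erase-remove idiom, which is spelled out here with the standard library so the snippet builds without LLVM headers. It replaces a hand-written compaction loop followed by `resize()`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy relocation record; not lld's Relocation type.
struct ToyReloc {
  uint64_t offset;
  bool deleted; // hypothetical marker set by an earlier relaxation step
};

// Same effect as llvm::erase_if(relocs, pred): remove every element for
// which the predicate holds and keep the rest in their original order.
void dropDeleted(std::vector<ToyReloc> &relocs) {
  relocs.erase(std::remove_if(relocs.begin(), relocs.end(),
                              [](const ToyReloc &r) { return r.deleted; }),
               relocs.end());
}
```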

================
Comment at: lld/ELF/Writer.cpp:1674
+  // code to increase in size and potentially invalidate some relaxations.
+  for (int pass : target->relaxPasses) {
+    assignPasses = 0;
----------------
This loop should be merged with the previous for loop.

================
Comment at: lld/ELF/Writer.cpp:1679
+      for (OutputSection *osec : outputSections)
+        for (InputSection *isec : getInputSections(osec))
+          changed |= target->relaxSection(isec, pass);
----------------
Check `SHF_EXECINSTR`
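
Something along these lines, as a toy sketch only: `ToyInputSection`, `relaxSection` and `relaxExecutableSections` are made-up names. The constant is the ELF gABI value of `SHF_EXECINSTR`, which lld would take from llvm/BinaryFormat/ELF.h. The idea is to skip sections that cannot contain instructions before calling the relaxation hook.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Value of SHF_EXECINSTR in the ELF gABI.
constexpr uint64_t kShfExecInstr = 0x4;

// Toy stand-in for an input section; not lld's InputSection.
struct ToyInputSection {
  uint64_t flags = 0;
  int relaxCalls = 0; // counts how often the hypothetical hook ran
};

// Hypothetical hook: pretend nothing changes, only record the visit.
bool relaxSection(ToyInputSection &sec, int /*pass*/) {
  ++sec.relaxCalls;
  return false;
}

// Visit only executable sections; returns true if any section changed.
bool relaxExecutableSections(std::vector<ToyInputSection> &secs, int pass) {
  bool changed = false;
  for (ToyInputSection &sec : secs) {
    if (!(sec.flags & kShfExecInstr))
      continue; // no instructions here, so nothing to relax
    changed |= relaxSection(sec, pass);
  }
  return changed;
}
```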

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77694/new/

https://reviews.llvm.org/D77694