[PATCH] D39152: [wip] ELF: Add support for emitting dynamic relocations in the Android relocation packing format.

Peter Smith via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Oct 22 01:39:05 PDT 2017


peter.smith added a comment.

I think that this is worthwhile putting into the linker as relocations can be sorted to optimise packing and make it easier to account for address changes to the rest of the ELF file.

Some overall thoughts and suggestions:

- Is it likely that other packing formats may come along? In this case we may be better to have an interface like --relocation-packing=(none | android | ... )
- It would be good to have a tool to expand and display textually the packed relocations, for example llvm-objdump -r could naturally display the packed relocation sections as if they were normal relocations.
- Is it worth adding an option to sort the relocations for packing, but not do the packing. This would make it easier to write tests, for example we could check that the relocations were sorted in the order that we would expect, with a suitable tool to unpack the encoded form into human readable text we could also check the encoding process. A secondary benefit would be that people using an external relocation packer could still benefit from the sorting.



================
Comment at: lld/ELF/Options.td:15
 
+def android_pack_relocs: F<"android-pack-relocs">, HelpText<"Use the Android relocation packing format">;
+
----------------
Would it be better to have --pack-dyn-relocs=<packing format>. Where we have none or Android as the initial parameters. I'm just thinking of other OSs or even Android having more than one format.


================
Comment at: lld/ELF/SyntheticSections.cpp:1109
   }
+
   if (In<ELFT>::RelaPlt->getParent() && !In<ELFT>::RelaPlt->empty()) {
----------------
Just checking to see if the new line was intentional? I've no objections to adding it.


================
Comment at: lld/ELF/SyntheticSections.cpp:1246
+template <class ELFT>
+static void encodeReloc(typename ELFT::Rela *P, const DynamicReloc &Rel) {
+  if (Config->IsRela)
----------------
I think that this is converting DynamicReloc to an ELFT::Rela? Would it be better to use a more specific name?


================
Comment at: lld/ELF/SyntheticSections.cpp:1251
+  if (Config->EMachine == EM_MIPS && Rel.getInputSec() == InX::MipsGot)
+    // Dynamic relocation against MIPS GOT section make deal TLS entries
+    // allocated in the end of the GOT. We need to adjust the offset to take
----------------
I don't think that this comment reads well and I'm not sure I understand it? I know it is just a move from another section but is it worth fixing it up?


================
Comment at: lld/ELF/SyntheticSections.cpp:1319
+  // section. An overview of the format can be found in the Android source code:
+  // https://android.googlesource.com/platform/bionic/+/refs/heads/master/tools/relocation_packer/src/delta_encoder.h
+  enum {
----------------
In the final version I think it will be worth expanding the necessary details inline. I'm thinking of the stability of the link here. It also doesn't mention the LEB encoding either.


================
Comment at: lld/ELF/SyntheticSections.cpp:1330
+  RelocData.clear();
+  RelocData.push_back('A');
+  RelocData.push_back('P');
----------------
I guess this is some kind of header?


================
Comment at: lld/ELF/SyntheticSections.cpp:1340
+    std::vector<uint64_t> Relatives;
+    std::vector<Elf_Rel> OtherRelocs;
+
----------------
Is it worth using similarly structured names for the two partitions, for example RelativeRelocs and NonRelativeRelocs?


================
Comment at: lld/ELF/SyntheticSections.cpp:1353
+    std::sort(Relatives.begin(), Relatives.end());
+    std::sort(OtherRelocs.begin(), OtherRelocs.end(),
+              [](const Elf_Rel &A, const Elf_Rel &B) {
----------------
Is it worth moving the sort to just before the OtherRelocs are processed (near the end of the function)?


================
Comment at: lld/ELF/SyntheticSections.cpp:1355
+              [](const Elf_Rel &A, const Elf_Rel &B) {
+                return A.r_offset < B.r_offset;
+              });
----------------
The -zcombreloc sorting order tries to order relocations by the destination symbol first so that the dynamic loader (at least on Linux) can reuse the symbol lookup result. Is there a trade off between packing efficiency and load time here? This may be handled differently by the Android dynamic linker?


================
Comment at: lld/ELF/SyntheticSections.cpp:1364
+    // size 8 or larger.
+    std::vector<uint64_t> UngroupedRelatives;
+    std::vector<std::pair<uint64_t, uint64_t>> RelativeGroups;
----------------
Is it worth having a similarly structured name such as UngroupedRelatives and GroupedRelatives?


================
Comment at: lld/ELF/SyntheticSections.cpp:1369
+      size_t GroupSize = 1;
+      while (I != E && *(I - 1) + Config->Wordsize == *I) {
+        ++I;
----------------
I found rewriting this as *I - *(I - 1) == Config->Wordsize made it easier to understand, but this may not be universal.


================
Comment at: lld/ELF/SyntheticSections.cpp:1375
+      if (GroupSize < 8) {
+        for (unsigned J = 0; J != GroupSize; ++J)
+          UngroupedRelatives.push_back(GroupStart + J * Config->Wordsize);
----------------
Should J have a type of size_t?


================
Comment at: lld/ELF/SyntheticSections.h:405
+template <class ELFT>
+class AndroidRelocationSection final : public RelocationBaseSection {
+  typedef typename ELFT::Rel Elf_Rel;
----------------
Is it worth having Packed as part of the name, either PackedRelocationSection or the slightly overlong AndroidPackedRelocationSection. I think it will make it more obvious what the section does from the name.


================
Comment at: lld/ELF/Writer.cpp:1381
+        InX::MipsGot->updateAllocSize();
+      Changed |= In<ELFT>::RelaDyn->updateAllocSize();
+    } while (Changed);
----------------
Do we need a call to Script->assignAddresses() before updateAllocSize()? I'm thinking of the case where createThunks() affects the r_offsets in RelaDyn->updateAllocSize() that affects the size of RelaDyn, that affects the addresses in Thunks. I think that for correctness this all works out as we can't exit the loop unless there are no changes, but it may take longer to converge in some situations.


================
Comment at: llvm/include/llvm/BinaryFormat/ELF.h:733
   SHT_LOOS = 0x60000000,           // Lowest operating system-specific type.
+  SHT_ANDROID_REL = 0x60000001,
+  SHT_ANDROID_RELA = 0x60000002,
----------------
Would be good to have a reference to where these are defined (public specification?), similarly for the DT tags.


https://reviews.llvm.org/D39152





More information about the llvm-commits mailing list