[PATCH] D39152: [wip] ELF: Add support for emitting dynamic relocations in the Android relocation packing format.

Peter Collingbourne via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 23 14:16:03 PDT 2017


pcc added a comment.

In https://reviews.llvm.org/D39152#903029, @peter.smith wrote:

> I think that this is worthwhile putting into the linker as relocations can be sorted to optimise packing and make it easier to account for address changes to the rest of the ELF file.
>
> Some overall thoughts and suggestions:
>
> - Is it likely that other packing formats may come along? In this case we may be better to have an interface like --relocation-packing=(none | android | ... )


I think that makes sense. There's been some talk of a new packing format on the gnu-gabi lists, so it would be good to be able to accommodate whatever comes out of that.

> - It would be good to have a tool to expand and display textually the packed relocations, for example llvm-objdump -r could naturally display the packed relocation sections as if they were normal relocations.

Right, I was planning to teach llvm-objdump about this format as part of writing the test cases for this patch.

> - Is it worth adding an option to sort the relocations for packing, but not do the packing. This would make it easier to write tests, for example we could check that the relocations were sorted in the order that we would expect, with a suitable tool to unpack the encoded form into human readable text we could also check the encoding process. A secondary benefit would be that people using an external relocation packer could still benefit from the sorting.

It doesn't really seem worth it to me, as long as we have llvm-objdump support for testing purposes. I also don't see much value in making the output compatible with external relocation packers, as that cannot be done soundly in general.



================
Comment at: lld/ELF/Options.td:15
 
+def android_pack_relocs: F<"android-pack-relocs">, HelpText<"Use the Android relocation packing format">;
+
----------------
peter.smith wrote:
> Would it be better to have --pack-dyn-relocs=<packing format>. Where we have none or Android as the initial parameters. I'm just thinking of other OSs or even Android having more than one format.
Will do.


================
Comment at: lld/ELF/SyntheticSections.cpp:1109
   }
+
   if (In<ELFT>::RelaPlt->getParent() && !In<ELFT>::RelaPlt->empty()) {
----------------
peter.smith wrote:
> Just checking to see if the new line was intentional? I've no objections to adding it.
No, I'll remove it.


================
Comment at: lld/ELF/SyntheticSections.cpp:1246
+template <class ELFT>
+static void encodeReloc(typename ELFT::Rela *P, const DynamicReloc &Rel) {
+  if (Config->IsRela)
----------------
peter.smith wrote:
> I think that this is converting DynamicReloc to an ELFT::Rela? Would it be better to use a more specific name?
encodeDynamicReloc maybe?


================
Comment at: lld/ELF/SyntheticSections.cpp:1251
+  if (Config->EMachine == EM_MIPS && Rel.getInputSec() == InX::MipsGot)
+    // Dynamic relocation against MIPS GOT section make deal TLS entries
+    // allocated in the end of the GOT. We need to adjust the offset to take
----------------
peter.smith wrote:
> I don't think that this comment reads well and I'm not sure I understand it? I know it is just a move from another section but is it worth fixing it up?
I don't understand it either :)

I'll see if I can figure out what it is trying to say and fix it.


================
Comment at: lld/ELF/SyntheticSections.cpp:1319
+  // section. An overview of the format can be found in the Android source code:
+  // https://android.googlesource.com/platform/bionic/+/refs/heads/master/tools/relocation_packer/src/delta_encoder.h
+  enum {
----------------
peter.smith wrote:
> In the final version I think it will be worth expanding the necessary details inline. I'm thinking of the stability of the link here. It also doesn't mention the LEB encoding either.
That makes sense. I'll try to write something up.


================
Comment at: lld/ELF/SyntheticSections.cpp:1330
+  RelocData.clear();
+  RelocData.push_back('A');
+  RelocData.push_back('P');
----------------
peter.smith wrote:
> I guess this is some kind of header?
Right.


================
Comment at: lld/ELF/SyntheticSections.cpp:1340
+    std::vector<uint64_t> Relatives;
+    std::vector<Elf_Rel> OtherRelocs;
+
----------------
peter.smith wrote:
> Is it worth using similarly structured names for the two partitions, for example RelativeRelocs and NonRelativeRelocs?
Makes sense, I'll do that.


================
Comment at: lld/ELF/SyntheticSections.cpp:1353
+    std::sort(Relatives.begin(), Relatives.end());
+    std::sort(OtherRelocs.begin(), OtherRelocs.end(),
+              [](const Elf_Rel &A, const Elf_Rel &B) {
----------------
peter.smith wrote:
> Is it worth moving the sort to just before the OtherRelocs are processed (near the end of the function)?
Sounds fine to me.


================
Comment at: lld/ELF/SyntheticSections.cpp:1355
+              [](const Elf_Rel &A, const Elf_Rel &B) {
+                return A.r_offset < B.r_offset;
+              });
----------------
peter.smith wrote:
> The -zcombreloc sorting order tries to order relocations by the destination symbol first so that the dynamic loader (at least on Linux) can reuse the symbol lookup result. Is there a trade off between packing efficiency and load time here? This may be handled differently by the Android dynamic linker?
The Android dynamic loader doesn't appear to have any optimization for this case. It looks like it just performs a symbol lookup for each relocation.

https://android.googlesource.com/platform/bionic/+/refs/heads/master/linker/linker.cpp#2507


================
Comment at: lld/ELF/SyntheticSections.cpp:1364
+    // size 8 or larger.
+    std::vector<uint64_t> UngroupedRelatives;
+    std::vector<std::pair<uint64_t, uint64_t>> RelativeGroups;
----------------
peter.smith wrote:
> Is it worth having a similarly structured name such as UngroupedRelatives and GroupedRelatives?
I kind of prefer the name RelativeGroups over GroupedRelatives because the latter implies that the vector contains relocations as opposed to groups, when in fact it contains groups. But I don't feel too strongly about it.


================
Comment at: lld/ELF/SyntheticSections.cpp:1369
+      size_t GroupSize = 1;
+      while (I != E && *(I - 1) + Config->Wordsize == *I) {
+        ++I;
----------------
peter.smith wrote:
> I found rewriting this as *I - *(I - 1) == Config->Wordsize made it easier to understand, but this may not be universal.
I found the form I wrote it in easier to understand, but again don't feel too strongly about it.


================
Comment at: lld/ELF/SyntheticSections.cpp:1375
+      if (GroupSize < 8) {
+        for (unsigned J = 0; J != GroupSize; ++J)
+          UngroupedRelatives.push_back(GroupStart + J * Config->Wordsize);
----------------
peter.smith wrote:
> Should J have a type of size_t?
It probably doesn't matter because J can never be larger than 7.


================
Comment at: lld/ELF/SyntheticSections.h:405
+template <class ELFT>
+class AndroidRelocationSection final : public RelocationBaseSection {
+  typedef typename ELFT::Rel Elf_Rel;
----------------
peter.smith wrote:
> Is it worth having Packed as part of the name, either PackedRelocationSection or the slightly overlong AndroidPackedRelocationSection. I think it will make it more obvious what the section does from the name.
AndroidPacked seems fine to me.


================
Comment at: lld/ELF/Writer.cpp:1381
+        InX::MipsGot->updateAllocSize();
+      Changed |= In<ELFT>::RelaDyn->updateAllocSize();
+    } while (Changed);
----------------
peter.smith wrote:
> Do we need a call to Script->assignAddresses() before updateAllocSize()? I'm thinking of the case where createThunks() affects the r_offsets in RelaDyn->updateAllocSize() that affects the size of RelaDyn, that affects the addresses in Thunks. I think that for correctness this all works out as we can't exit the loop unless there are no changes, but it may take longer to converge in some situations.
That wouldn't seem to be a concern for REL (because of delta encoding, the section (in general) will only contain relative offsets within the r/w segment (with one exception for the initial delta into the r/w segment), and r/w won't contain anything whose size will change, so I'd expect there to be at most two convergent steps), but it may be a concern for RELA (because it will also end up containing deltas within text and other sections). I'll have to experiment with this once I have RELA implemented.


================
Comment at: llvm/include/llvm/BinaryFormat/ELF.h:733
   SHT_LOOS = 0x60000000,           // Lowest operating system-specific type.
+  SHT_ANDROID_REL = 0x60000001,
+  SHT_ANDROID_RELA = 0x60000002,
----------------
peter.smith wrote:
> Would be good to have a reference to where these are defined (public specification?), similarly for the DT tags.
I haven't seen them formally defined anywhere. Probably the best we can do is link to the Android source code.


https://reviews.llvm.org/D39152





More information about the llvm-commits mailing list