[PATCH] D13815: [ELF2] Add support for Gnu Hash section

Fri Oct 16 23:01:24 PDT 2015

ikudrin added inline comments.

================
Comment at: ELF/OutputSections.cpp:246-247
@@ +245,4 @@
+                              llvm::ELF::SHF_ALLOC) {
+  this->Header.sh_entsize =
+      sizeof(Elf_Word) == sizeof(Elf_Off) ? sizeof(Elf_Word) : 0;
+  this->Header.sh_addralign = sizeof(Elf_Off);
----------------
ruiu wrote:
> What is this condition for?
It may be rewritten as (Is64Bits ? 0 : 4), but my variant shows the reason for this, I hope.
See http://sourceware.org/ml/binutils/2006-10/msg00377.html:
>The .gnu.hash section uses sh_entsize 4 for 32-bit ELF and sh_entsize 0
> for 64-bit ELF (as it has then non-uniform entity size; 4 32-bit words,
> variable count of ELFCLASS sized words and then variable count of 32-bit
> words; it doesn't repeat the historic mistakes on Alpha/s390x with .hash).


================
Comment at: ELF/OutputSections.cpp:255-276
@@ +254,24 @@
+
+template <class ELFT>
+const unsigned GnuHashTableSection<ELFT>::NBucketsEstimation[] = {
+    1,     //      0..     1
+    1,     //              2
+    3,     //      3..     4
+    3,     //      5..     8
+    7,     //      9..    16
+    13,    //     17..    32
+    31,    //     33..    64
+    61,    //     65..   128
+    127,   //    129..   256
+    251,   //    257..   512
+    509,   //    513..  1024
+    1021,  //   1025..  2048
+    2039,  //   2049..  4096
+    4093,  //   4097..  8192
+    8191,  //   8193.. 16384
+    16381, //  16385.. 32768
+    32749, //  32769.. 65536
+    65521, //  65537..131072
+    131071 // 131073..
+};
+
----------------
ruiu wrote:
> Where does this implementation come from? 
I've just selected prime numbers which are not greater than lower bound of each range. Haven't done any deep investigations, it just looks reasonable for the first try.

================
Comment at: ELF/OutputSections.cpp:786-804
@@ -632,17 +785,21 @@
 template <class ELFT>
-void SymbolTableSection<ELFT>::writeGlobalSymbols(uint8_t *&Buf) {
+void SymbolTableSection<ELFT>::writeGlobalSymbols(uint8_t *Buf) {
   // Write the internal symbol table contents to the output symbol table
   // pointed by Buf.
-  uint8_t *Start = Buf;
+  auto *Start = reinterpret_cast<Elf_Sym *>(Buf);
+  auto *Current = Start;
   for (const std::pair<StringRef, Symbol *> &P : Table.getSymbols()) {
     StringRef Name = P.first;
     Symbol *Sym = P.second;
     SymbolBody *Body = Sym->Body;
     if (!includeInSymtab<ELFT>(*Body))
       continue;
-    if (StrTabSec.isDynamic() && !includeInDynamicSymtab(*Body))
-      continue;
-
-    auto *ESym = reinterpret_cast<Elf_Sym *>(Buf);
-    Buf += sizeof(*ESym);
+    Elf_Sym *ESym;
+    if (StrTabSec.isDynamic()) {
+      if (!includeInDynamicSymtab(*Body))
+        continue;
+      ESym = Start + Body->getDynamicSymbolTableIndex() - 1;
+    } else {
+      ESym = Current++;
+    }
 
----------------
ruiu wrote:
> What is this change for?
Gnu hash section requires reordering of dynamic symbol table entries. If Gnu hash section is in action, it sets correct indexes in its finalize() method; otherwise, indexes just follow the original order. So, in case of a dynamic symbol table, we write the entry right into its place without any additional sorting.

================
Comment at: ELF/OutputSections.h:45-46
@@ -44,3 +44,4 @@
 
 bool includeInDynamicSymtab(const SymbolBody &B);
+bool includeInGnuHashTable(const SymbolBody &B);
 
----------------
ruiu wrote:
> I'm wondering if we really need two distinct functions for the traditional hash table and GNU's. GNU hash table is smaller than the traditional one because it omits some symbols. Technically, such symbols can be removed from the traditional hash table without hurting correctness.
It sounds reasonable, but I tried to make this patch to solve just one problem. You may notice that all existent tests are still intact. We may optimize a traditional hash section in a separate patch.


http://reviews.llvm.org/D13815