[PATCH] D76977: [lld-macho] Implement basic export trie

Jez Ng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Apr 28 18:22:50 PDT 2020


int3 marked 5 inline comments as done.
int3 added a comment.

Post-order wouldn't work because the offsets are from the start of the export trie section, not from the parent node... not sure why it was designed that way :/



================
Comment at: lld/MachO/ExportTrie.cpp:15
+//
+//             +--+--+
+//             | _ba |
----------------
smeenai wrote:
> From what I understand of the trie structure, the strings for children are actually stored in the parent. This is interesting by itself, and it also implies that the root node is always the empty string. I think it's worth making a note of that and reflecting that in the diagram as well.
good catch, will fix


================
Comment at: lld/MachO/ExportTrie.cpp:99
+    uint32_t nodeSize = getULEB128Size(flags) + getULEB128Size(info->address);
+    assert(nodeSize < 256);
+    *buf++ = nodeSize;
----------------
smeenai wrote:
> Why this assert? The nodeSize (or TerminalSize as macho2yaml terms it) is a ULEB128, right?
nope, it's a byte. Only the offsets are ulebs


================
Comment at: lld/MachO/ExportTrie.cpp:107
+  }
+  // Add number of children. TODO: Handle case where we have more than 256.
+  assert(edges.size() < 256);
----------------
smeenai wrote:
> You'd just have to split up the node in that case, right?
yeah


================
Comment at: lld/MachO/ExportTrie.cpp:152
+  // the pivot.
+  const Symbol *pivotSymbol = vec[0];
+  int pivot = charAt(pivotSymbol, pos);
----------------
smeenai wrote:
> Would it be preferable to use either the median or a random pivot? (I'm assuming the standard quicksort considerations for pivots apply, namely that always using the first element would trigger the quadratic worst case for an already sorted input.)
yeah they do apply. I'll pick the median


================
Comment at: lld/MachO/ExportTrie.cpp:184
+    // This is the tail-call-optimized version of the following:
+    // multikeySort(vec.slice(i, j - i), lastPos, pos + 1, node);
+    vec = vec.slice(i, j - i);
----------------
smeenai wrote:
> Did you mean sortAndBuild?
> 
> Is there an easy way to either express this as a loop or trust that the compiler will do the tail call optimization? I don't mind the goto very much, but I'm wondering how it compares to the alternatives.
Oops, this was copypasta from llvm-mc's original implementation (they called it multikeySort there)

The TCO was also from that implementation, but I think it's a "standard" quicksort optimization too, at least for the normal two-way quicksort. Wikipedia says that it's "suggested by Sedgwick and widely used in practice": https://en.wikipedia.org/wiki/Quicksort#Optimizations


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76977/new/

https://reviews.llvm.org/D76977





More information about the llvm-commits mailing list