[PATCH] D76977: [lld-macho] Implement basic export trie

Jez Ng via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 9 16:21:30 PDT 2020


int3 planned changes to this revision.
int3 marked 2 inline comments as done.
int3 added inline comments.


================
Comment at: lld/MachO/ExportTrie.cpp:192-195
+  std::vector<TrieNode *> orderedNodes;
+  orderedNodes.reserve(allNodes.size());
+  for (const Symbol *sym : exported)
+    rootNode->orderNodes(*sym, orderedNodes);
----------------
smeenai wrote:
> int3 wrote:
> > This is basically a topological sort. It's unclear to me though why this approach (of finding every element starting from the root) was chosen vs doing a single preorder traversal of the trie...
> Not having looked at this in detail yet, would they give comparable results?
I think the only difference would be the order (and therefore the offsets) of the serialized nodes. Might make things more or less compact, it's unclear to me...


================
Comment at: lld/MachO/ExportTrie.cpp:75
+
+void TrieNode::addSymbol(const Symbol &entry,
+                         std::vector<TrieNode *> &allNodes) {
----------------
ruiu wrote:
> smeenai wrote:
> > ruiu wrote:
> > > ruiu wrote:
> > > > int3 wrote:
> > > > > ruiu wrote:
> > > > > > `addSymbol` seem to keep the internal trie consistent all the time, but I don't think we need to split the work between `addSymbol` and `build`. We can make `addSymbol` just to store an argument to an array, and move the code for trie construction to `build`.
> > > > > Not sure I understand what you're going for. Do you mean to have TrieBuilder::addSymbol collect the symbols to export in a vector, and then have `build()` call `TrieNode::addSymbol()` for all the symbols right before serializing the trie?
> > > > Yes, that's what I meant.
> > > What I wanted to say is that it looks like the trie implemented in this file is designed with the generic Map use cases in mind. You can add a new string one at a time to an existing trie, and you can also look it up as if it were a map (it is actually a map). In order to that, we create a trie in memory. But I think we can avoid that. We can just accumulate added strings to a vector, and in build() we sort the strings and directly emit a serialized trie from the sorted strings.
> > +1 – we should only need the trie for output purposes here.
> It is probably better to rewrite the code instead of borrowing it from the existing code. I believe it shouldn't take too much time, and it might even be a bit of fun to write from scratch.
> We can just accumulate added strings to a vector, and in build() we sort the strings and directly emit a serialized trie from the sorted strings.

Ah, I see what you mean. That would definitely be more memory/allocation-efficient. I guess we should do a radix sort instead of a simple std::sort so we don't compare prefixes needlessly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76977/new/

https://reviews.llvm.org/D76977





More information about the llvm-commits mailing list