[PATCH] D76977: [lld-macho] Implement basic export trie

Tue Apr 28 23:10:51 PDT 2020

int3 added inline comments.

================
Comment at: lld/MachO/ExportTrie.cpp:184
+    // This is the tail-call-optimized version of the following:
+    // multikeySort(vec.slice(i, j - i), lastPos, pos + 1, node);
+    vec = vec.slice(i, j - i);
----------------
smeenai wrote:
> int3 wrote:
> > int3 wrote:
> > > smeenai wrote:
> > > > Did you mean sortAndBuild?
> > > > 
> > > > Is there an easy way to either express this as a loop or trust that the compiler will do the tail call optimization? I don't mind the goto very much, but I'm wondering how it compares to the alternatives.
> > > Oops, this was copypasta from llvm-mc's original implementation (they called it multikeySort there)
> > > 
> > > The TCO was also from that implementation, but I think it's a "standard" quicksort optimization too, at least for the normal two-way quicksort. Wikipedia says that it's "suggested by Sedgwick and widely used in practice": https://en.wikipedia.org/wiki/Quicksort#Optimizations
> > It seems like there's no way to force the compiler to do TCO. There are only attributes to tell clang *not* to do TCO.
> Yeah. LLVM has a `musttail` attribute, but I don't think that's exposed to clang, and IIRC even that doesn't guarantee a tail call. Ideally the optimizer would be able to figure this out on its own, but there's no guarantee of that.
> 
> Any opinions on the current `goto` structure vs. wrapping the whole thing in a `while (!vec.empty())` and adding an explicit return in the `isTerminal` case? I think that'd be equivalent, though I guess the `goto` makes the tail recursive call more apparent.
Yeah I think `while()` here would make for more confusing code for the reason you mentioned (plus add more indentation :P)

================
Comment at: lld/test/MachO/symtab.s:6

 # CHECK:      Symbols [
 # CHECK-NEXT:   Symbol {
----------------
MaskRay wrote:
> I wish we can have a conciser `llvm-readobj` output... Due to llvm-readobj's verbosity, I prefer `llvm-readelf` for ELF objects...
Hm, `llvm-objdump --syms` is terser, though I'm not sure it prints all the symbol info (in particular the flags). For the other tests where I just want the addresses of the symbols, I use `objdump`; but since this test is specifically for the symtab, I think it makes sense to get the most detailed representation of the symbols.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76977/new/

https://reviews.llvm.org/D76977