<div dir="ltr"><div dir="ltr">On Thu, Apr 8, 2021 at 9:13 AM Jez via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Alexandre Ganea <<a href="mailto:alexandre.ganea@ubisoft.com" target="_blank">alexandre.ganea@ubisoft.com</a>> 9:05 AM (2 hours ago) to<br>
me, <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
> When you say you'd like to parallelize LLD, which driver do you mean?<br>
<br>
I meant the Mach-O driver; sorry for omitting context.<br>
<br>
> There's already a specialized lock-free hashtable for the Debug Types merging the COFF driver<br>
<br>
Oh, that's cool! Let me see if I can build upon it...<br></blockquote><div><br></div><div>If you are attempting to parallelize the symbol resolution phase, I can try to share some thoughts on how to generalize that table over to parallel symbol resolution.</div><div><br></div><div>Right now, symbol resolution in COFF, and presumably the other linkers, is interleaved with input loading, since referencing a symbol can pull in stuff from an archive. The first thing that needs to be done is to split that up into phases, where the linker merges all the symbols of all of the inputs it knows about, and then successively loads more inputs and symbols. Some of this work is already done, but at least for COFF, more restructuring needs to be done.</div><div><br></div><div>Once the restructuring is done, the merging algorithm should be pretty simple. As inputs, there are:</div><div>- Symbol hash table from previous round of symbol resolution</div><div>- New object files to load, each with a symbol table</div><div><br></div><div>The output should be:</div><div>- The hashtable, composed of 64-bit cells indicating which symbol prevailed, identified by a pair of 32-bit numbers, object file index, and symbol table index</div><div><div><br></div>I recommend splitting up symbol resolution into a fully data parallel phase and a "shuffle" phase that does the concurrent hash table insertion. This makes rehashing very simple: if the table gets too full, simply abort the concurrent insertions, sync up the worker threads, and retry with a bigger table. The map phase should read the object file and pre-calculate a good hash for every external symbol. After all symbols have been hashed, the shuffle phase should attempt to concurrently insert into the table, and CAS into the table if the current symbol has a higher precedence than the symbol in the table. The algorithm for deciding which symbol prevails is format-specific and implements the rules for things like weak symbols and link line order. It should be deterministic.</div><div><br></div><div>That sketch leaves a lot to be desired, but that's the project I've been planning for COFF for a while now.</div><div><div></div></div></div></div>