<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Sep 15, 2014 at 4:00 PM, Nick Kledzik <span dir="ltr"><<a href="mailto:kledzik@apple.com" target="_blank">kledzik@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><span class=""><div>On Sep 12, 2014, at 4:38 PM, Rui Ueyama <<a href="mailto:ruiu@google.com" target="_blank">ruiu@google.com</a>> wrote:</div><br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Fri, Sep 12, 2014 at 4:22 PM, <a href="mailto:kledzik@apple.com" target="_blank">kledzik@apple.com</a> <span dir="ltr"><<a href="mailto:kledzik@apple.com" target="_blank">kledzik@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I assume that preload() returns immediately, and that it is expected to spin off some thread to parse an archive member?  If so, we have no overall throttle on how many threads will be started (a hundred undefines could spin up 100 threads).  Also, how is the archive reader to coordinate if the Resolver gets to the point it really wants an object file to fulfill an undefine but some other thread is busy parsing that member?<br></blockquote><div><br></div><div>We should throttle how many threads are started by using a task scheduler. The global scheduler should keep the number of concurrent tasks below a limit that depends on the number of available cores, and it should also reuse (kernel) threads.</div><div><br></div><div>We should also add a rendezvous point where the main thread waits for the speculative load task if both are trying to parse the same file. The easiest way to do that is probably to represent a file being parsed as a future. 
A pre-parse task first creates a future and sets its value when it's done. The main thread blocks if the result is not yet available.</div></div></div></div></blockquote></span><div>This is generally an interesting direction.  The devil is in the details.   </div><span class=""><br><blockquote type="cite"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

Don't we just have a producer/consumer problem where the archive reader is the producer and the resolver is the consumer?  The consumer is single threaded and currently queries (pulls from) the producer on the consumer thread.  Can the driver start up some producer task for archive reading to pre-parse archives?  Rather than blindly parsing all members, your idea of passing undefined symbol names to the producer is a good one.<br></blockquote><div><br></div><div>The driver could trigger pre-parsing archives, but I'm afraid that would be inefficient, as you might have implied. The driver does not know anything about symbols, so it has no idea which files in an archive file are going to be used. If most of the files in an archive file were going to be used, we could blindly parse all of them, but I think that's not the case. Only the resolver can make a good guess.</div></div></div></div>

</blockquote></span></div>Yes, and I’d imagine in most cases there will not be a false positive because it is unlikely for an archive to implement something that is also in a dylib or other .o file.<div><br></div><div>Do you think this could handle --whole-archive processing too? I think right now a single thread is used to parse everything in an archive in --whole-archive mode.  It would be nice to parallelize that too.</div></div></blockquote><div><br></div><div>Yeah, I had never thought about that. It's independent of this proposal, but it's definitely doable. That's a good idea.</div></div></div></div>
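<div>To make the throttling and rendezvous idea from earlier in the thread concrete, here is a rough sketch. It is not lld code: Preloader, ParsedFile and parseMember are hypothetical names made up for illustration, and the parser is a stub. A fixed pool of worker threads bounds how many parses run at once, and each archive member is represented by a std::shared_future that the main thread can block on.</div>

```cpp
#include <condition_variable>
#include <future>
#include <map>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a parsed archive member.
struct ParsedFile {
  std::string name;
};

// Hypothetical parser stub; a real one would read the member's object file.
inline ParsedFile parseMember(const std::string &name) {
  return ParsedFile{name};
}

class Preloader {
public:
  // numWorkers is the throttle: at most this many parses run concurrently.
  explicit Preloader(unsigned numWorkers) {
    if (numWorkers == 0)
      numWorkers = 1;
    for (unsigned i = 0; i < numWorkers; ++i)
      workers.emplace_back([this] { workerLoop(); });
  }

  ~Preloader() {
    {
      std::lock_guard<std::mutex> lock(mu);
      stopping = true;
    }
    cv.notify_all();
    for (std::thread &t : workers)
      t.join();
  }

  // Returns immediately; queues a parse unless one was already requested.
  void preload(const std::string &name) { (void)futureFor(name); }

  // Rendezvous point: blocks until the member has been parsed.
  ParsedFile get(const std::string &name) { return futureFor(name).get(); }

private:
  // Looks up the future for a member, creating and queueing it if needed.
  std::shared_future<ParsedFile> futureFor(const std::string &name) {
    std::lock_guard<std::mutex> lock(mu);
    auto it = futures.find(name);
    if (it != futures.end())
      return it->second;
    std::packaged_task<ParsedFile()> task(
        [name] { return parseMember(name); });
    std::shared_future<ParsedFile> fut = task.get_future().share();
    futures.emplace(name, fut);
    tasks.push(std::move(task));
    cv.notify_one();
    return fut;
  }

  // Each worker pulls queued parse tasks until shutdown and the queue is empty.
  void workerLoop() {
    for (;;) {
      std::packaged_task<ParsedFile()> task;
      {
        std::unique_lock<std::mutex> lock(mu);
        cv.wait(lock, [this] { return stopping || !tasks.empty(); });
        if (tasks.empty())
          return; // stopping and nothing left to run
        task = std::move(tasks.front());
        tasks.pop();
      }
      task(); // runs outside the lock; fulfils the shared future
    }
  }

  std::mutex mu;
  std::condition_variable cv;
  std::queue<std::packaged_task<ParsedFile()>> tasks;
  std::map<std::string, std::shared_future<ParsedFile>> futures;
  std::vector<std::thread> workers;
  bool stopping = false;
};
```

<div>In this sketch the resolver would call preload() for each undefined symbol's likely member and later call get() when it really needs the file. A real scheduler would probably size the pool from std::thread::hardware_concurrency() and would need some policy for members that are preloaded but never used.</div>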