<div dir="ltr">Hi Duncan,<div><br></div><div>Ping - Please let me know if the explanation from my last response satisfies your concern about why this information is separate for ThinLTO***. </div><div><br></div><div>Two more questions:</div><div>1) I tried to split up the first round into smallish patches for easier review and to incrementally add support - for example this one is dependent on a small patch to add the LLVM -fthinlto option support (D11907). Let me know if that is reasonable or if the patches need to be combined.</div><div>2) As mentioned here and in the RFC email I have not specified the records within the new bitcode blocks yet, in order to keep this initial patch small. Would it be better to propose and include support for the records in the bitcode blocks here so that I can show the full integration between this patch and the data structure patch (D11721)?</div><div><br></div><div>*** Specifically, the biggest reason was that the function's bitcode offset within the Module is only one small part of the ThinLTO index (the other parts being the path to the module needed in the combined index for use in importing, and summary information about the function that will evolve/grow over time to aid in importing decisions). See also the ThinLTOFunctionSummary struct in the latest D11721 patch. I also want to clarify that the eventual function importing patch will leverage some parts of the existing lazy-loader support when importing the functions. It would be technically possible to get rid of the bitcode index in the summary, but we would still need the rest of the summary info so most of the ThinLTO patches wouldn't change. Doing so would require scanning the whole module and building up the DeferredFunctionInfo array every time we open a module for importing a function. Adding the bitcode offset to the ThinLTO function summary allows us to simply set up the DeferredFunctionInfo for the single function being imported and then materialize() it without having to build up the DeferredFunctionInfo for all functions up to the one we are interested in. I.e. there is going to be no change to materialize() or findFunctionInStream() - it will immediately find the DeferredFunctionInfo entry for the function of interest and parse it. That's again coming in a later patch that is dependent on the ones I've already sent for review, but hopefully provides some more background on why I'm including the bitcode index into the ThinLTO function summary.</div><div><br></div><div><br></div><div>Thanks,</div><div>Teresa</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Aug 13, 2015 at 10:22 AM, Teresa Johnson <span dir="ltr"><<a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span class="">On Thu, Aug 13, 2015 at 9:42 AM, Duncan P. N. Exon Smith <span dir="ltr"><<a href="mailto:dexonsmith@apple.com" target="_blank">dexonsmith@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div><br>

> On 2015-Aug-12, at 21:23, Teresa Johnson <<a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a>> wrote:<br>

><br>

> tejohnson updated the summary for this revision.<br>

> tejohnson updated this revision to Diff 32027.<br>

> tejohnson added a comment.<br>

><br>

> Removed native object wrapper support from this patch.<br>

><br>

><br>

> <a href="http://reviews.llvm.org/D11722" rel="noreferrer" target="_blank">http://reviews.llvm.org/D11722</a><br>

><br>

> Files:<br>

>  include/llvm/Bitcode/LLVMBitCodes.h<br>

>  include/llvm/Bitcode/ReaderWriter.h<br>

>  lib/Bitcode/Reader/BitcodeReader.cpp<br>

>  lib/Bitcode/Writer/BitcodeWriter.cpp<br>

>  tools/llvm-bcanalyzer/llvm-bcanalyzer.cpp<br>

><br>

</div></div>> <D11722.32027.patch><br>

<br>

(Sorry I'm so late getting to this patch.)<br></blockquote><div><br></div></span><div>Hi Duncan,</div><div>Thanks for the comments.</div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

I'm concerned about the direction.  IIUC, the purpose of this is faster<br>

lazy-loading.  Why do this in a side-channel?  Why not optimize the<br>

existing lazy-loader?  I imagine the LTO model implemented by the gold<br>

plugin would benefit from this, for example.  If we need an extra index<br>

to make lazy-loading "fast enough", it seems like the index should be<br>

available (as an opt-in for time/space tradeoff) for other consumers as<br>

well.  Alternatively, maybe there's something we can change about how<br>

the existing lazy-loader works (or the layout of bitcode) to be more<br>

amenable to the partial loading ThinLTO needs.<br></blockquote><div><br></div></span><div>Only one part of the index (the function's bitcode offset) is used to do the fast loading of the function from a given module. Much of the other information in the ThinLTO sections is to identify/locate the module to import from (via the module strtab index which is populated when creating the combined (global) index), and information used in making importing decisions from another module (like the function's size, hotness when there is profile, etc, which we don't have since we haven't yet parsed that other module).</div><div><br></div><div>The other issue with using the existing lazy loading support is that we may import from a number of other modules in some interleaved fashion, so we may open/import a function/close a module multiple times. My  understanding is that the lazy loading is more for the case of loading a few different functions in sequence.</div><div><br></div><div>So for example, if we have:</div><div><br></div><div>A.cc:</div><div><br></div><div>a() {</div><div>   b1();</div><div>   ... </div><div>   c1();</div><div>   ...</div><div>   b2();</div><div><div>   ...</div><div>   c2();</div></div><div>}</div><div><br></div><div>B.cc:</div><div><br></div><div>b1() { d1(); }</div><div>b2() { ... }</div><div><br></div><div>C.cc:</div><div><br></div><div>c1() { d2(); }</div><div>c2() { ... }</div><div><div><br>D.cc:</div><div><br></div><div>d1() { ... }</div><div>d2() { ... }</div></div><div><br></div><div><br></div><div>When compiling A.cc through the parallel backend's importing pass, we may have the following sequence of events:</div><div><br></div><div>Consider importing b1  -> Decide to import b1  (exposes call to d1)</div><div>Consider importing d1  -> Decide to import d1</div><div>Consider importing c1  -> Decide to import c1  (exposes call to d2)</div><div>Consider importing d2  -> Decide to import d2</div><div>Consider importing b2  -> Decide *not* to import b2</div><div>Consider importing c2  -> Decide to import c2</div><div><br></div><div>For each of the inter-module calls considered in some priority based order, we would look up the callee in the combined index, which provides some information such as the callee's size, hotness, etc. A decision on whether to import is made based on that information (without doing any parsing of the callee's module). If we decide to import, then the combined index entry for the callee provides the callee module path and the bitcode index, which is used to open the module and import just that function, then the callee's module is closed. Some of the lazy loading support could potentially be used in place of the bitcode index, although I am not convinced it is suited for that purpose. In any case, that is just a small part of what is contained in the function summary/index as described earlier.</div><div><br></div><div>On the flip side, the bitcode index could be used by the lazy loader, in place of building the index on the fly. I'm not sure how much benefit there is to lazy loading, which eventually needs to read the entire module once a function is materialized.</div><span class=""><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

Not included in this patch is support for lazy-loading of metadata,<br>

something you suggested you had support for in your prototype (and I<br>

assume ThinLTO relies on it).  I think it's particularly important that<br>

whatever you do there isn't ThinLTO-specific.<br>

<br>

</blockquote></span></div><div class="gmail_extra"><br></div>Right, that is needed when we do the importing. This initial patch is just to write/read the function summary/index. I had briefly outlined the way I was hoping to stage patches earlier today (<a href="http://lists.llvm.org/pipermail/llvm-dev/2015-August/089222.html" target="_blank">http://lists.llvm.org/pipermail/llvm-dev/2015-August/089222.html</a>). I was planning to do the importing as a separate set of patches once I have the infrastructure for generating/reading the function indexes as that is needed first to drive the importing. Let me know if the staging of work I describe there makes sense.</div><div class="gmail_extra"><br></div><div class="gmail_extra">To facilitate the process of reading/materializing a single function each time a module is imported from, as described above, I we be parsing the module level metadata once as a post-pass. Otherwise supporting the interleaved importing from multiple modules is probably very difficult. This is again different than the existing lazy loading, which is parsing/materializing all necessary functions and metadata out of each module in a single pass.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Let me know if this seems reasonable given the above example and description.</div><div class="gmail_extra"><br></div><div class="gmail_extra">Thanks,</div><div class="gmail_extra">Teresa</div><span class="HOEnZb"><font color="#888888"><div class="gmail_extra"><br clear="all"><div><br></div>-- <br><div><span style="font-family:Times;font-size:medium"><table cellspacing="0" cellpadding="0"><tbody><tr style="color:rgb(85,85,85);font-family:sans-serif;font-size:small"><td nowrap style="border-top-style:solid;border-top-color:rgb(213,15,37);border-top-width:2px">Teresa Johnson |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(51,105,232);border-top-width:2px"> Software Engineer |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(0,153,57);border-top-width:2px"> <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a> |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(238,178,17);border-top-width:2px"> <a href="tel:408-460-2413" value="+14084602413" target="_blank">408-460-2413</a></td></tr></tbody></table></span></div>

</div></font></span></div>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><span style="font-family:Times;font-size:medium"><table cellspacing="0" cellpadding="0"><tbody><tr style="color:rgb(85,85,85);font-family:sans-serif;font-size:small"><td nowrap style="border-top-style:solid;border-top-color:rgb(213,15,37);border-top-width:2px">Teresa Johnson |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(51,105,232);border-top-width:2px"> Software Engineer |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(0,153,57);border-top-width:2px"> <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a> |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(238,178,17);border-top-width:2px"> 408-460-2413</td></tr></tbody></table></span></div>

</div>