<div dir="ltr">Ugh, my earlier response is awaiting moderator approval. Apparently with the response and all the other earlier emails quoted it was too long. Resending with a bunch of old quoted emails trimmed.<div>Teresa<br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, May 3, 2018 at 2:58 PM, Teresa Johnson <span dir="ltr"><<a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote"><span class="">On Thu, May 3, 2018 at 2:16 PM, Steven Wu <span dir="ltr"><<a href="mailto:stevenwu@apple.com" target="_blank">stevenwu@apple.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><br><div><span><br><blockquote type="cite"><div>On May 1, 2018, at 4:50 PM, Teresa Johnson via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="m_-3249515721473403394m_-5756488385649837037Apple-interchange-newline"><div><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none">Hi Mehdi, thanks for the comments, responses and a tweaked proposal below. Teresa<div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 1, 2018 at 11:37 AM, Mehdi AMINI<span class="m_-3249515721473403394m_-5756488385649837037Apple-converted-space"> </span><span dir="ltr"><<a href="mailto:joker.eph@gmail.com" target="_blank">joker.eph@gmail.com</a>></span><span class="m_-3249515721473403394m_-5756488385649837037Apple-converted-space"> </span>wr<wbr>ote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi,<div><br></div><div>My main concern is this one:</div><span><div><br></div><div>> Currently, I am emitting the summary entries at the end, after the metadata nodes. Note that the ModuleSummaryIndex is not currently referenced from the Module, and isn’t currently created when parsing the Module IR bitcode (there is a separate derived class for reading the ModuleSummaryIndex from bitcode). This is because they are not currently used at the same time. However, in the future there is no reason why we couldn’t tag the global values in the Module’s LLVM assembly with the corresponding summary entry if the ModuleSummaryIndex is available when printing the Module in the assembly writer. I.e. we could do the following for “main” from the above example when printing the IR definition (note the “^3” at the end):</div><div><br></div><div><br></div></span><div>I believe the reason that the ModuleSummaryIndex is not attached to the Module is that it is fundamentally not a piece of IR, but it is conceptually really an Analysis result.</div><div>Usually other analyses don't serialize their result, we happen to serialize this one for an optimization purpose (reloading it and making the thin-link faster).</div></div></blockquote><div><br></div><div>True. My understanding is that the push for having it serialized via assembly is due to the fact that it is emitted into the bitcode. I know there is disagreement on this reasoning, I am hoping to have a proposal that is acceptable to everyone. =)</div></div></div></div></div></div></blockquote><div><br></div></span><div>I totally agree with Mehdi’s concern. It is probably much easier to treat module summary as analysis result, rather than part of the module. Making it part of IR might be overkill just to fix the debugging and readability of the summary. </div><span><br><blockquote type="cite"><div><div dir="ltr" style="font-family:Helvetica;font-size:12px;font-style:normal;font-variant-caps:normal;font-weight:normal;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;text-decoration:none"><div><div class="gmail_extra"><div class="gmail_quote"><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>The fundamental problem is that an analysis result has to be able to be invalidated with IR changes, attaching this directly to the module wouldn't achieve this. The risk is that when the IR and the summary get out-of-sync (`clang -O2 my_module_with_summaries.ll -emit-llvm -o my_optimized module_with_summaries.ll`) the summaries would be badly wrong.</div><div><br></div><div>Have you looked into what it'd take to make it a "real" analysis in the pass manager?</div></div></blockquote><div><br></div><div>Thanks for raising this issue specifically, I hadn't addressed it in my proposal and it is a big one. I am not proposing that we attempt to maintain the summary through optimization passes, and definitely don't think we should do that. IMO deserializing it should be for testing the thin link and the combined summaries in the backends only. To that end, I have an idea (below some background first).</div><div><br></div><div>Note that in some cases the module summary analysis is an analysis pass. I.e. when invoked by "opt -module-summary=". However, some time ago when Peter added the support for splitting the bitcode (for CFI purposes) and therefore needed to generate a summary in each partition (Thin and Regular), he added the ThinLTOBitcodeWriterPass, which invokes the module summary builder directly (twice). This writer is what gets invoked now when building via "clang -flto=thin", and with "opt -thinlto-bc". So there it is not invoked/maintained as an analysis pass/result. It would be tricky to figure out how to even split rather than recompute the module summary index in that case. Even in the case where we are still invoking as an analysis pass (opt -module-summary), we would need to figure out how to read in the module summary to use as the analysis result when available (so that it could be invalidated and recomputed when stale).</div><div><br></div><div>Rather than add this plumbing, and just have it discarded if opt does any optimization, I think we should focus at least for the time being on supporting reading the summary from assembly exactly where we currently read in the summary from bitcode:</div><div>1) For the thin link (e.g. tools such as llvm-lto2 or llvm-lto, which currently have to be preceded by "opt -module-summary/-thinlto-bc" to generate an index, but could just build it from assembly instead).</div><div>2) For the LTO backends (e.g. tools such as llvm-lto which can consume a combined index and invoke the backends, or "clang -fthinlto-index=" for distributed ThinLTO backend testing), where we could build the combined summary index from assembly instead.</div><div><br></div><div>This greatly simplifies the reading side, as there are no optimizations performed on the IR after the index is read in these cases that would require invalidation. It also simplifies adding the parsing support, since it gets invoked exactly where we expect to build an index currently (i.e. since we don't currently build or store the ModuleSummaryIndex when parsing the Module from bitcode). It doesn't preclude someone from figuring out how to compute the module summary analysis result from the assembly, and invalidating it after optimization, when reading the Module IR via 'opt' in the future.</div><div><br></div><div>Does this seem like a reasonable proposal to everyone?</div></div></div></div></div></div></blockquote><div><br></div></span><div>I would +1 for making this an analysis pass. How about a ModuleSummaryAnalysis that knows how to serialize its result into bitcode and YAML format? We can also add a module flag to indicate if the module should contain module summary or not.</div><div><br></div><div>* For thinLTO compilation, bitcode writer runs the analysis pass if the module flag is set, and emit module summary into bitcode.</div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><div><div>* When reading bitcode with module summary, the summary is parsed into ModuleSummaryAnalysis result. </div></div></div></blockquote><div><br></div></span><div>Note the reason we don't do this now is that we don't ever need to use a the summary at the same time as when we use the IR. We only read the summary for the thin link (no IR involved), or for distributed ThinLTO build backends (but only the combined index is read, the summary attached to the module is not read as it is not needed at all). Which is why I suggested the simplification in my proposal above of only reading the summary from assembly where we currently read the bitcode summary (which is not when we normally read the IR).</div><div><br></div><div>Also, note as I mentioned above that the default path may split the IR into regular LTO and ThinLTO partitions, which is why we stopped invoking it as analysis pass on the default path, and instead invoke the index builder directly for each partition. Figuring out how to split a single module summary analysis result into two parts is not something I think we should block this on.</div><span class=""><div><br></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><div><div>* If there are any transform invalidates the analysis, module summary will be recomputed automatically when writing bitcode, otherwise, it will serialize out the same result as input (probably with some auto upgrade).</div><div>* For debugging, ModuleSummaryAnalysis can be serialized out to YAML</div><div>* For testing, ModuleSummaryAnalysis result can be created from YAML</div><div><br></div><div>I suggested YAML but it can be any other format, including LLVM assembly. For readability, I would prefer YAML.</div></div></div></blockquote><div><br></div></span><div>That is similar to the original proposal from last year, but there was not agreement from other upstream maintainers to dump as something other than assembly that could be serialized in.</div><div><br></div><div>That's why I'm proposing the version above - serialize in as assembly exactly where we consume the bitcode version of the module summary index today. The advantage is that it doesn't require analysis pass work and figuring out how to make the analysis pass work with split LTO partitions. At the same time, it doesn't preclude doing that in the future either. I think my proposal enables testing of the ThinLTO pipeline via textual assembly, which is one of the main goals of being able to serialize it back in.</div><span class="HOEnZb"><font color="#888888"><div><br></div><div>Teresa</div></font></span><div><div class="h5"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space"><div><div></div><div>Thanks</div><span class="m_-3249515721473403394HOEnZb"><font color="#888888"><div><br></div><div>Steven</div><br></font></span></div></div></blockquote></div></div></div></div></div></blockquote></div><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><span style="font-family:Times;font-size:medium"><table cellspacing="0" cellpadding="0"><tbody><tr style="color:rgb(85,85,85);font-family:sans-serif;font-size:small"><td nowrap style="border-top-style:solid;border-top-color:rgb(213,15,37);border-top-width:2px">Teresa Johnson |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(51,105,232);border-top-width:2px"> Software Engineer |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(0,153,57);border-top-width:2px"> <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a> |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(238,178,17);border-top-width:2px"> 408-460-2413</td></tr></tbody></table></span></div>

</div></div></div>