<div dir="ltr">Hi David,<div><br></div><div>Thanks for the review! Responses below.</div><div><br></div><div>Teresa</div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 11, 2016 at 10:23 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 11, 2016 at 10:52 AM, Teresa Johnson via llvm-commits <span dir="ltr"><<a href="mailto:llvm-commits@lists.llvm.org" target="_blank">llvm-commits@lists.llvm.org</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Author: tejohnson<br>
Date: Fri Mar 11 12:52:24 2016<br>
New Revision: 263275<br>
<br>
URL: <a href="http://llvm.org/viewvc/llvm-project?rev=263275&view=rev" rel="noreferrer" target="_blank">http://llvm.org/viewvc/llvm-project?rev=263275&view=rev</a><br>
Log:<br>
[ThinLTO] Support for reference graph in per-module and combined summary.<br>
<br>
Summary:<br>
This patch adds support for including a full reference graph including<br>
call graph edges and other GV references in the summary.<br>
<br>
The reference graph edges can be used to make importing decisions<br>
without materializing any source modules, can be used in the plugin<br>
to make file staging decisions for distributed build systems, and is<br>
expected to have other uses.<br>
<br>
The call graph edges are recorded in each function summary in the<br>
bitcode via a list of <CalleeValueIds, StaticCount> tuples when no PGO<br>
data exists, or <CalleeValueId, StaticCount, ProfileCount> pairs when<br>
there is PGO, where the ValueId can be mapped to the function GUID via<br>
the ValueSymbolTable. In the function index in memory, the call graph<br>
edges reference the target via the CalleeGUID instead of the<br>
CalleeValueId.<br>
<br>
The reference graph edges are recorded in each summary record with a<br>
list of referenced value IDs, which can be mapped to value GUID via the<br>
ValueSymbolTable.<br>
<br>
Addtionally, a new summary record type is added to record references<br>
from global variable initializers. A number of bitcode records and data<br>
structures have been renamed to reflect the newly expanded scope of the<br>
summary beyond functions. More cleanup will follow.<br>
<br>
Reviewers: joker.eph, davidxl<br>
<br>
Subscribers: joker.eph, llvm-commits<br>
<br>
Differential Revision: <a href="http://reviews.llvm.org/D17212" rel="noreferrer" target="_blank">http://reviews.llvm.org/D17212</a><br>
<br></blockquote></div></div></div></blockquote><div> </div><div><snipped></div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br></blockquote></div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+<br>
+/// \brief Function summary information to aid decisions and implementation of<br>
+/// importing.<br>
+class FunctionSummary : public GlobalValueSummary {<br></blockquote><div><br></div><div>This is a slightly odd inheritance, having no virtual functions (had it had virtual functions the warnings related to missing virtual dtors would've fired).<br></div></div></div></div></blockquote><div><br></div><div>Ah, good to know that distinction in getting the warning.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>Should some functionality be rolled into these classes to generalize their use, that would be virtual? (ie: the only way to use this class is with a dyn_cast to access the members - perhaps the code that dyn_casts could instead be calling a virtual function to dispatch between the two cases?)<br></div></div></div></div></blockquote><div><br></div><div>I suspect as we add uses of the combined index in the function importing and other places that we will need to add more methods that would be appropriate as virtual functions. Currently there wasn't a need though (the one place doing the dyn_casts below seems clearer to me with casting).</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><br>Speaking of dyn casts... </div><div> </div></div></div></div></blockquote><div><br></div><div><snipped></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
-/// Emit the per-module function summary section alongside the rest of<br>
+// Collect the global value references in the given variable's initializer,<br>
+// and emit them in a summary record.<br>
+static void WriteModuleLevelReferences(const GlobalVariable &V,<br>
+                                       const ValueEnumerator &VE,<br>
+                                       SmallVector<uint64_t, 64> &NameVals,<br>
+                                       unsigned FSModRefsAbbrev,<br>
+                                       BitstreamWriter &Stream) {<br>
+  DenseSet<unsigned> RefEdges;<br>
+  SmallPtrSet<const User *, 8> Visited;<br>
+  findRefEdges(&V, VE, RefEdges, Visited);<br>
+  unsigned RefCount = RefEdges.size();<br>
+  if (RefCount) {<br>
+    NameVals.push_back(VE.getValueID(&V));<br>
+    NameVals.push_back(getEncodedLinkage(V.getLinkage()));<br>
+    for (auto RefId : RefEdges) {<br>
+      NameVals.push_back(RefId);<br>
+    }<br>
+    Stream.EmitRecord(bitc::FS_PERMODULE_GLOBALVAR_INIT_REFS, NameVals,<br>
+                      FSModRefsAbbrev);<br>
+    NameVals.clear();<br>
+  }<br>
+}<br>
+<br>
+/// Emit the per-module summary section alongside the rest of<br>
 /// the module's bitcode.<br>
-static void WritePerModuleFunctionSummary(<br>
-    DenseMap<const Function *, std::unique_ptr<FunctionInfo>> &FunctionIndex,<br>
+static void WritePerModuleGlobalValueSummary(<br>
+    DenseMap<const Function *, std::unique_ptr<GlobalValueInfo>> &FunctionIndex,<br>
     const Module *M, const ValueEnumerator &VE, BitstreamWriter &Stream) {<br>
-  Stream.EnterSubblock(bitc::FUNCTION_SUMMARY_BLOCK_ID, 3);<br>
+  if (M->empty())<br>
+    return;<br>
<br>
-  // Abbrev for FS_CODE_PERMODULE_ENTRY.<br>
+  Stream.EnterSubblock(bitc::GLOBALVAL_SUMMARY_BLOCK_ID, 3);<br>
+<br>
+  // Abbrev for FS_PERMODULE.<br>
   BitCodeAbbrev *Abbv = new BitCodeAbbrev();<br>
-  Abbv->Add(BitCodeAbbrevOp(bitc::FS_CODE_PERMODULE_ENTRY));<br>
+  Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE));<br>
   Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // valueid<br>
   Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage<br>
   Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // instcount<br>
-  unsigned FSAbbrev = Stream.EmitAbbrev(Abbv);<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 4));   // numrefs<br>
+  // numrefs x valueid, n x (valueid, callsitecount)<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));<br>
+  unsigned FSCallsAbbrev = Stream.EmitAbbrev(Abbv);<br>
<br>
-  SmallVector<unsigned, 64> NameVals;<br>
+  // Abbrev for FS_PERMODULE_PROFILE.<br>
+  Abbv = new BitCodeAbbrev();<br>
+  Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE_PROFILE));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // valueid<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // instcount<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 4));   // numrefs<br>
+  // numrefs x valueid, n x (valueid, callsitecount, profilecount)<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));<br>
+  unsigned FSCallsProfileAbbrev = Stream.EmitAbbrev(Abbv);<br>
+<br>
+  // Abbrev for FS_PERMODULE_GLOBALVAR_INIT_REFS.<br>
+  Abbv = new BitCodeAbbrev();<br>
+  Abbv->Add(BitCodeAbbrevOp(bitc::FS_PERMODULE_GLOBALVAR_INIT_REFS));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // valueid<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));  // valueids<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));<br>
+  unsigned FSModRefsAbbrev = Stream.EmitAbbrev(Abbv);<br>
+<br>
+  SmallVector<uint64_t, 64> NameVals;<br>
   // Iterate over the list of functions instead of the FunctionIndex map to<br>
   // ensure the ordering is stable.<br>
   for (const Function &F : *M) {<br>
@@ -2817,9 +2974,9 @@ static void WritePerModuleFunctionSummar<br>
     assert(FunctionIndex.count(&F) == 1);<br>
<br>
     WritePerModuleFunctionSummaryRecord(<br>
-        NameVals, FunctionIndex[&F]->functionSummary(),<br>
-        VE.getValueID(M->getValueSymbolTable().lookup(F.getName())), FSAbbrev,<br>
-        Stream);<br>
+        NameVals, dyn_cast<FunctionSummary>(FunctionIndex[&F]->summary()),<br></blockquote><div><br></div><div>Given this is a FunctionIndex, I assume all its summaries are FunctionSummaries? So presumably this could be a cast instead of a dyn_cast?</div></div></div></div></blockquote><div><br></div><div>Good point, this particular map (unlike the full index built by the reader) will always contain only function summaries. Will change to cast.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+        VE.getValueID(M->getValueSymbolTable().lookup(F.getName())),<br>
+        FSCallsAbbrev, FSCallsProfileAbbrev, Stream, F);<br>
   }<br>
<br>
   for (const GlobalAlias &A : M->aliases()) {<br>
@@ -2830,10 +2987,25 @@ static void WritePerModuleFunctionSummar<br>
       continue;<br>
<br>
     assert(FunctionIndex.count(F) == 1);<br>
+    FunctionSummary *FS =<br>
+        dyn_cast<FunctionSummary>(FunctionIndex[F]->summary());<br></blockquote><div><br>Looks like this ^ should be a cast, not a dyn_cast, since the result is immediately dereferenced in the following line \/<br></div></div></div></div></blockquote><div><br></div><div>Will change.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+    // Add the alias to the reference list of aliasee function.<br>
+    FS->addRefEdge(<br>
+        VE.getValueID(M->getValueSymbolTable().lookup(A.getName())));<br>
     WritePerModuleFunctionSummaryRecord(<br>
-        NameVals, FunctionIndex[F]->functionSummary(),<br>
-        VE.getValueID(M->getValueSymbolTable().lookup(A.getName())), FSAbbrev,<br>
-        Stream);<br>
+        NameVals, FS,<br>
+        VE.getValueID(M->getValueSymbolTable().lookup(A.getName())),<br>
+        FSCallsAbbrev, FSCallsProfileAbbrev, Stream, *F);<br>
+  }<br>
+<br>
+  // Capture references from GlobalVariable initializers, which are outside<br>
+  // of a function scope.<br>
+  for (const GlobalVariable &G : M->globals())<br>
+    WriteModuleLevelReferences(G, VE, NameVals, FSModRefsAbbrev, Stream);<br>
+  for (const GlobalAlias &A : M->aliases()) {<br>
+    const auto *GV = dyn_cast<GlobalVariable>(A.getBaseObject());<br>
+    if (GV)<br></blockquote><div><br></div><div>If you like, you could roll this ^ declaration into the if. (& then drop the {} around the loop too, if that suits)</div></div></div></div></blockquote><div><br></div><div>Yes, will fix.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
+      WriteModuleLevelReferences(*GV, VE, NameVals, FSModRefsAbbrev, Stream);<br>
   }<br>
<br>
   Stream.ExitBlock();<br>
@@ -2841,35 +3013,132 @@ static void WritePerModuleFunctionSummar<br>
<br>
 /// Emit the combined function summary section into the combined index<br>
 /// file.<br>
-static void WriteCombinedFunctionSummary(const FunctionInfoIndex &I,<br>
-                                         BitstreamWriter &Stream) {<br>
-  Stream.EnterSubblock(bitc::FUNCTION_SUMMARY_BLOCK_ID, 3);<br>
+static void WriteCombinedGlobalValueSummary(<br>
+    const FunctionInfoIndex &I, BitstreamWriter &Stream,<br>
+    std::map<uint64_t, unsigned> &GUIDToValueIdMap, unsigned GlobalValueId) {<br>
+  Stream.EnterSubblock(bitc::GLOBALVAL_SUMMARY_BLOCK_ID, 3);<br>
<br>
-  // Abbrev for FS_CODE_COMBINED_ENTRY.<br>
+  // Abbrev for FS_COMBINED.<br>
   BitCodeAbbrev *Abbv = new BitCodeAbbrev();<br>
-  Abbv->Add(BitCodeAbbrevOp(bitc::FS_CODE_COMBINED_ENTRY));<br>
-  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // modid<br>
+  Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // modid<br>
   Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage<br>
-  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8)); // instcount<br>
-  unsigned FSAbbrev = Stream.EmitAbbrev(Abbv);<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // instcount<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 4));   // numrefs<br>
+  // numrefs x valueid, n x (valueid, callsitecount)<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));<br>
+  unsigned FSCallsAbbrev = Stream.EmitAbbrev(Abbv);<br>
<br>
-  SmallVector<unsigned, 64> NameVals;<br>
+  // Abbrev for FS_COMBINED_PROFILE.<br>
+  Abbv = new BitCodeAbbrev();<br>
+  Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED_PROFILE));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // modid<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // instcount<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 4));   // numrefs<br>
+  // numrefs x valueid, n x (valueid, callsitecount, profilecount)<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));<br>
+  unsigned FSCallsProfileAbbrev = Stream.EmitAbbrev(Abbv);<br>
+<br>
+  // Abbrev for FS_COMBINED_GLOBALVAR_INIT_REFS.<br>
+  Abbv = new BitCodeAbbrev();<br>
+  Abbv->Add(BitCodeAbbrevOp(bitc::FS_COMBINED_GLOBALVAR_INIT_REFS));<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));   // modid<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5)); // linkage<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));    // valueids<br>
+  Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));<br>
+  unsigned FSModRefsAbbrev = Stream.EmitAbbrev(Abbv);<br>
+<br>
+  SmallVector<uint64_t, 64> NameVals;<br>
   for (const auto &FII : I) {<br>
     for (auto &FI : FII.second) {<br>
-      FunctionSummary *FS = FI->functionSummary();<br>
-      assert(FS);<br>
+      GlobalValueSummary *S = FI->summary();<br>
+      assert(S);<br>
+<br>
+      if (auto *VS = dyn_cast<GlobalVarSummary>(S)) {<br>
+        assert(!VS->refs().empty() && "Expected at least one ref edge");<br>
+        NameVals.push_back(I.getModuleId(VS->modulePath()));<br>
+        NameVals.push_back(getEncodedLinkage(VS->linkage()));<br>
+        for (auto &RI : VS->refs()) {<br>
+          const auto &VMI = GUIDToValueIdMap.find(RI);<br>
+          unsigned RefId;<br>
+          // If this GUID doesn't have an entry, assign one.<br>
+          if (VMI == GUIDToValueIdMap.end()) {<br>
+            GUIDToValueIdMap[RI] = ++GlobalValueId;<br>
+            RefId = GlobalValueId;<br>
+          } else {<br>
+            RefId = VMI->second;<br>
+          }<br>
+          NameVals.push_back(RefId);<br>
+        }<br>
+<br>
+        // Record the starting offset of this summary entry for use<br>
+        // in the VST entry. Add the current code size since the<br>
+        // reader will invoke readRecord after the abbrev id read.<br>
+        FI->setBitcodeIndex(Stream.GetCurrentBitNo() +<br>
+                            Stream.GetAbbrevIDWidth());<br>
+<br>
+        // Emit the finished record.<br>
+        Stream.EmitRecord(bitc::FS_COMBINED_GLOBALVAR_INIT_REFS, NameVals,<br>
+                          FSModRefsAbbrev);<br>
+        NameVals.clear();<br>
+        continue;<br>
+      }<br>
<br>
+      auto *FS = dyn_cast<FunctionSummary>(S);<br>
+      assert(FS);<br></blockquote><div><br></div><div>Replace dyn_cast + assert with cast.</div></div></div></div></blockquote><div><br></div><div>ok</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
       NameVals.push_back(I.getModuleId(FS->modulePath()));<br>
-      NameVals.push_back(getEncodedLinkage(FS->getFunctionLinkage()));<br>
+      NameVals.push_back(getEncodedLinkage(FS->linkage()));<br>
       NameVals.push_back(FS->instCount());<br>
+      NameVals.push_back(FS->refs().size());<br>
+<br></blockquote></div></div></div></blockquote><div> </div><div><snipped> </div></div><div><br></div>-- <br><div class="gmail_signature"><span style="font-family:Times;font-size:medium"><table cellspacing="0" cellpadding="0"><tbody><tr style="color:rgb(85,85,85);font-family:sans-serif;font-size:small"><td nowrap style="border-top-style:solid;border-top-color:rgb(213,15,37);border-top-width:2px">Teresa Johnson |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(51,105,232);border-top-width:2px"> Software Engineer |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(0,153,57);border-top-width:2px"> <a href="mailto:tejohnson@google.com" target="_blank">tejohnson@google.com</a> |</td><td nowrap style="border-top-style:solid;border-top-color:rgb(238,178,17);border-top-width:2px"> 408-460-2413</td></tr></tbody></table></span></div>
</div></div>