[PATCH] D17212: [ThinLTO] Support for call graph in per-module and combined summary.

Teresa Johnson via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 25 07:12:02 PST 2016

tejohnson added a comment.

Responses to Mehdi's comments.

Comment at: include/llvm/Bitcode/LLVMBitCodes.h:190-203
@@ -189,5 +189,16 @@
   enum FunctionSummarySymtabCodes {
-    FS_CODE_PERMODULE_ENTRY = 1,  // FS_ENTRY: [valueid, linkage, instcount]
-    FS_CODE_COMBINED_ENTRY  = 2,  // FS_ENTRY: [modid, linkage, instcount]
+    // PERMODULE_NOCALLS: [valueid, linkage, instcount]
+    // PERMODULE_CALLS: [valueid, linkage, instcount, n x valueid]
+    // PERMODULE_CALLS_PROFILE: [valueid, linkage, instcount,
+    //                           n x (valueid, count)]
+    // COMBINED_NOCALLS: [modid, linkage, instcount]
+    // COMBINED_CALLS: [modid, linkage, instcount, n x valueid]
+    // COMBINED_CALLS_PROFILE: [modid, linkage, instcount, n x (valueid, count)]
Agreed, although I think we are still in the heavy churn tuning phase. 

Comment at: include/llvm/Bitcode/LLVMBitCodes.h:195
@@ +194,3 @@
+    // PERMODULE_CALLS_PROFILE: [valueid, linkage, instcount,
+    //                           n x (valueid, count)]
Actually my first implementation of this used a single record id. What I did was add a bit to indicate whether there was any profile data, so that the reader knew how to interpret the array data at the end (as either a list of just callee ids, or <callee id, count> pairs). I think I used the size of the record to deduce whether there was any call information. I decided to switch to explicitly encoding this information in the record id because it seemed cleaner and more consistent, and also removed a bit from each record (the profile flag). This reduces the size for large .o bitcode files and for the combined index, although the size for small .o bitcode files is a bit larger due to the 2 extra abbrev ids.

I could probably go back to using the size to deduce the presence or absence of call information, and reduce this to 2 per summary type. WDYT?

Comment at: include/llvm/IR/FunctionInfo.h:97
@@ +96,3 @@
+  /// Record a call graph edge from this function to the function identified
+  /// by \p CalleeGUID, with cumulative profile count (across all calls from
joker.eph wrote:
> > Without profile data, it might be useful to record the number of static callsites from a function to the callee. This can help compiler backend to make better global decisions later (e.g. better inlining and enable more function GC).
> Can you elaborate how would you use this information and how it would help inlining and GC? 
I think what David was alluding to was similar to what I was describing on IRC yesterday - if you know you have a single static callsite via the summary, you can give the same inline cost benefit currently given for internal(ized) single-callsite functions. Then if it is inlined, hopefully the original out-of-line function can be eliminated via linker GC.

To do this, the combined index creation step would need to aggregate the # static callsites to each function and put that into the callee function's combined summary.

Comment at: include/llvm/IR/FunctionInfo.h:298
@@ +297,3 @@
+  /// Get ownership of the function summary unique_ptr.
+  std::unique_ptr<FunctionSummary> getFunctionSummary() {
+    return std::move(Summary);
joker.eph wrote:
> I think we usually prefix `takeXXX` and not `getXXX` when ownership is transferred.
Ok, will fix.


More information about the llvm-commits mailing list