[PATCH] D17212: [ThinLTO] Support for call graph in per-module and combined summary.
Teresa Johnson via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 25 07:12:02 PST 2016
tejohnson added a comment.
Responses to Mehdi's comments.
================
Comment at: include/llvm/Bitcode/LLVMBitCodes.h:190-203
@@ -189,5 +189,16 @@
enum FunctionSummarySymtabCodes {
- FS_CODE_PERMODULE_ENTRY = 1, // FS_ENTRY: [valueid, linkage, instcount]
- FS_CODE_COMBINED_ENTRY = 2, // FS_ENTRY: [modid, linkage, instcount]
+ // PERMODULE_NOCALLS: [valueid, linkage, instcount]
+ FS_PERMODULE_NOCALLS = 1,
+ // PERMODULE_CALLS: [valueid, linkage, instcount, n x valueid]
+ FS_PERMODULE_CALLS = 2,
+ // PERMODULE_CALLS_PROFILE: [valueid, linkage, instcount,
+ // n x (valueid, count)]
+ FS_PERMODULE_CALLS_PROFILE = 3,
+ // COMBINED_NOCALLS: [modid, linkage, instcount]
+ FS_COMBINED_NOCALLS = 4,
+ // COMBINED_CALLS: [modid, linkage, instcount, n x valueid]
+ FS_COMBINED_CALLS = 5,
+ // COMBINED_CALLS_PROFILE: [modid, linkage, instcount, n x (valueid, count)]
+ FS_COMBINED_CALLS_PROFILE = 6,
};
----------------
Agreed, although I think we are still in the heavy churn tuning phase.
================
Comment at: include/llvm/Bitcode/LLVMBitCodes.h:195
@@ +194,3 @@
+ // PERMODULE_CALLS_PROFILE: [valueid, linkage, instcount,
+ // n x (valueid, count)]
+ FS_PERMODULE_CALLS_PROFILE = 3,
----------------
Actually my first implementation of this used a single record id. What I did was add a bit to indicate whether there was any profile data, so that the reader knew how to interpret the array data at the end (as either a list of just callee ids, or <callee id, count> pairs). I think I used the size of the record to deduce whether there was any call information. I decided to switch to explicitly encoding this information in the record id because it seemed cleaner and more consistent, and also removed a bit from each record (the profile flag). This reduces the size for large .o bitcode files and for the combined index, although the size for small .o bitcode files is a bit larger due to the 2 extra abbrev ids.
I could probably go back to using the size to deduce the presence or absence of call information, and reduce this to 2 per summary type. WDYT?
================
Comment at: include/llvm/IR/FunctionInfo.h:97
@@ +96,3 @@
+
+ /// Record a call graph edge from this function to the function identified
+ /// by \p CalleeGUID, with cumulative profile count (across all calls from
----------------
joker.eph wrote:
> > Without profile data, it might be useful to record the number of static callsites from a function to the callee. This can help compiler backend to make better global decisions later (e.g. better inlining and enable more function GC).
>
> Can you elaborate how would you use this information and how it would help inlining and GC?
I think what David was alluding to was similar to what I was describing on IRC yesterday - if you know you have a single static callsite via the summary, you can give the same inline cost benefit currently given for internal(ized) single-callsite functions. Then if it is inlined, hopefully the original out-of-line function can be eliminated via linker GC.
To do this, the combined index creation step would need to aggregate the # static callsites to each function and put that into the callee function's combined summary.
================
Comment at: include/llvm/IR/FunctionInfo.h:298
@@ +297,3 @@
+ /// Get ownership of the function summary unique_ptr.
+ std::unique_ptr<FunctionSummary> getFunctionSummary() {
+ return std::move(Summary);
----------------
joker.eph wrote:
> I think we usually prefix `takeXXX` and not `getXXX` when ownership is transferred.
Ok, will fix.
http://reviews.llvm.org/D17212
More information about the llvm-commits
mailing list