[PATCH] D17212: [ThinLTO] Support for call graph in per-module and combined summary.

Mehdi Amini via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 25 08:35:44 PST 2016


> On Feb 25, 2016, at 7:12 AM, Teresa Johnson <tejohnson at google.com> wrote:
> 
> tejohnson added a comment.
> 
> Responses to Mehdi's comments.
> 
> 
> ================
> Comment at: include/llvm/Bitcode/LLVMBitCodes.h:190-203
> @@ -189,5 +189,16 @@
>   enum FunctionSummarySymtabCodes {
> -    FS_CODE_PERMODULE_ENTRY = 1,  // FS_ENTRY: [valueid, linkage, instcount]
> -    FS_CODE_COMBINED_ENTRY  = 2,  // FS_ENTRY: [modid, linkage, instcount]
> +    // PERMODULE_NOCALLS: [valueid, linkage, instcount]
> +    FS_PERMODULE_NOCALLS = 1,
> +    // PERMODULE_CALLS: [valueid, linkage, instcount, n x valueid]
> +    FS_PERMODULE_CALLS = 2,
> +    // PERMODULE_CALLS_PROFILE: [valueid, linkage, instcount,
> +    //                           n x (valueid, count)]
> +    FS_PERMODULE_CALLS_PROFILE = 3,
> +    // COMBINED_NOCALLS: [modid, linkage, instcount]
> +    FS_COMBINED_NOCALLS  = 4,
> +    // COMBINED_CALLS: [modid, linkage, instcount, n x valueid]
> +    FS_COMBINED_CALLS  = 5,
> +    // COMBINED_CALLS_PROFILE: [modid, linkage, instcount, n x (valueid, count)]
> +    FS_COMBINED_CALLS_PROFILE  = 6,
>   };
> 
> ----------------
> Agreed, although I think we are still in the heavy churn tuning phase. 
> 
> ================
> Comment at: include/llvm/Bitcode/LLVMBitCodes.h:195
> @@ +194,3 @@
> +    // PERMODULE_CALLS_PROFILE: [valueid, linkage, instcount,
> +    //                           n x (valueid, count)]
> +    FS_PERMODULE_CALLS_PROFILE = 3,
> ----------------
> Actually my first implementation of this used a single record id. What I did was add a bit to indicate whether there was any profile data, so that the reader knew how to interpret the array data at the end (as either a list of just callee ids, or <callee id, count> pairs). I think I used the size of the record to deduce whether there was any call information. I decided to switch to explicitly encoding this information in the record id because it seemed cleaner and more consistent, and also removed a bit from each record (the profile flag). This reduces the size for large .o bitcode files and for the combined index, although the size for small .o bitcode files is a bit larger due to the 2 extra abbrev ids.
> 
> I could probably go back to using the size to deduce the presence or absence of call information, and reduce this to 2 per summary type. WDYT?

I think this is good as is considering what we can do within the limit of the bitcode format. Either way is fine!

-- 
Mehdi


> 
> ================
> Comment at: include/llvm/IR/FunctionInfo.h:97
> @@ +96,3 @@
> +
> +  /// Record a call graph edge from this function to the function identified
> +  /// by \p CalleeGUID, with cumulative profile count (across all calls from
> ----------------
> joker.eph wrote:
>>> Without profile data, it might be useful to record the number of static callsites from a function to the callee. This can help compiler backend to make better global decisions later (e.g. better inlining and enable more function GC).
>> 
>> Can you elaborate how would you use this information and how it would help inlining and GC? 
> I think what David was alluding to was similar to what I was describing on IRC yesterday - if you know you have a single static callsite via the summary, you can give the same inline cost benefit currently given for internal(ized) single-callsite functions. Then if it is inlined, hopefully the original out-of-line function can be eliminated via linker GC.
> 
> To do this, the combined index creation step would need to aggregate the # static callsites to each function and put that into the callee function's combined summary.
> 
> ================
> Comment at: include/llvm/IR/FunctionInfo.h:298
> @@ +297,3 @@
> +  /// Get ownership of the function summary unique_ptr.
> +  std::unique_ptr<FunctionSummary> getFunctionSummary() {
> +    return std::move(Summary);
> ----------------
> joker.eph wrote:
>> I think we usually prefix `takeXXX` and not `getXXX` when ownership is transferred.
> Ok, will fix.
> 
> 
> http://reviews.llvm.org/D17212
> 
> 
> 



More information about the llvm-commits mailing list