[PATCH] Indirect call target profiling related profile reader/writer changes

Thu Apr 9 14:23:45 PDT 2015

On 04/09/2015 11:06 AM, Betul Buyukkurt wrote:
>
> In http://reviews.llvm.org/D8908#153838, @reames wrote:
>
>> Have the IR level construct patches made it up for review?  If so, can
>
> So far I've posted two patches. They should apply cleanly to the tip and work with the present profile infrastructure. The next set will be the enabling patches: three more, one each for clang, llvm and compiler-rt. The clang patch will be up for review later today.
>
>> you send me a link?  I managed to miss them.
>
> So far there is this patch and the intrinsic instruction definitions: http://reviews.llvm.org/D8877. All of the patches are necessary for collecting the IC targets and having llvm-profdata display them.
Ok, I'm really not convinced that the instrumentation code needs to be 
or should be an intrinsic.  This seems like something which should be 
emitted by the frontend and optimized like any other code.  To say this 
a different way, my instrumentation is going to be entirely different 
than your instrumentation.

Having said that, I really don't care about this part of the proposed 
changes, since it isn't going to impact me at all.  I am specifically 
not objecting to the changes, just commenting.  :)
>
>> I'm assuming this will be some type of per call site metadata?
>
> We do assign metadata at the indirect call sites. The format is as follows:
>
> !33 = metadata !{metadata !"indirect_call_targets", i64 <total_exec_count>, metadata !"target_fn1", i64 <target_fn1_count>, metadata !"target_fn2", i64 <target_fn2_count>, ….}
>
> Currently, we record only the five most frequently called functions at each indirect call site. The string literal “indirect_call_targets” is followed by <total_exec_count>, a 64-bit value for the total number of times the indirect call executed, and then by the function name and execution count of each target.
This was the part I was trying to ask about.  I really want to see where 
you're going with this optimization-wise.  My naive guess is that this 
is going to be slightly off from what you actually want.
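
To make sure we are reading the format the same way, here is a rough C++ 
sketch of how I understand one call site's record would be put together.  
formatIndirectCallTargets and its parameters are names I made up for 
illustration; nothing here is taken from the patch, and real code would 
build metadata nodes rather than a string.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not from the patch): renders the profile of one
// indirect call site in the textual layout quoted above.
using Entry = std::pair<std::string, std::uint64_t>;

std::string formatIndirectCallTargets(
    const std::map<std::string, std::uint64_t> &counts,
    std::size_t maxTargets = 5) {
  std::vector<Entry> sorted(counts.begin(), counts.end());

  // The total counts every execution of the call site, including
  // targets that do not make the top-N cut below.
  std::uint64_t total = 0;
  for (const Entry &e : sorted)
    total += e.second;

  // Hottest target first; ties are broken by name so the output is stable.
  std::sort(sorted.begin(), sorted.end(),
            [](const Entry &a, const Entry &b) {
              if (a.second != b.second)
                return a.second > b.second;
              return a.first < b.first;
            });
  if (sorted.size() > maxTargets)
    sorted.resize(maxTargets);

  std::ostringstream os;
  os << "metadata !{metadata !\"indirect_call_targets\", i64 " << total;
  for (const Entry &e : sorted)
    os << ", metadata !\"" << e.first << "\", i64 " << e.second;
  os << "}";
  return os.str();
}
```

As I read the description, the total covers every observed target, so 
the per-target counts need not sum to it once the list is truncated.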

Assuming you're going for profile guided devirtualization (and thus 
inlining), being able to check the type of the receiver (as opposed to 
the result of the virtual lookup) might be advantageous.  (Or, to say it 
differently, that's what I'm used to seeing.  Your approach might be 
completely reasonable, it's just not what I'm used to seeing.)  Have you 
thought about the tradeoffs here?
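
To illustrate the two kinds of guard I have in mind, here is a small C++ 
sketch.  All of the names (Shape, Triangle, hotHandler, and so on) are 
invented for the example, and this is source-level code standing in for 
what an optimizer would emit, not a description of your patches.

```cpp
#include <typeinfo>

// --- Guard on the call target (what the proposed metadata records) ---
using Handler = int (*)(int);

int hotHandler(int x) { return x + 1; }
int coldHandler(int x) { return x * 2; }

int callWithTargetGuard(Handler fn, int x) {
  // Compare the function pointer against the profiled hot target.
  if (fn == &hotHandler)
    return hotHandler(x); // direct call, now a candidate for inlining
  return fn(x);           // fallback: the original indirect call
}

// --- Guard on the receiver type (what I am used to seeing) ---
struct Shape {
  virtual ~Shape() = default;
  virtual int sides() const = 0;
};
struct Triangle final : Shape {
  int sides() const override { return 3; }
};
struct Square final : Shape {
  int sides() const override { return 4; }
};

int sidesWithTypeGuard(const Shape &s) {
  // Check the dynamic type once; any virtual call on s inside the
  // guarded region could then be resolved statically.
  if (typeid(s) == typeid(Triangle))
    return static_cast<const Triangle &>(s).Triangle::sides();
  return s.sides();       // fallback: ordinary virtual dispatch
}
```

Roughly, a target guard also works for plain function pointers and for 
several types that share one implementation, while a type guard can be 
reused for every virtual call on the same receiver.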

>
> Thanks.
> -Betul
>
>
> http://reviews.llvm.org/D8908