[PATCH] Indirect call target profiling related profile reader/writer changes

Wed Apr 15 09:33:54 PDT 2015

>
>> On Apr 10, 2015, at 9:25 AM, betulb at codeaurora.org wrote:
>>
>>>
>>> On 04/09/2015 11:06 AM, Betul Buyukkurt wrote:
>>>>
>>>> In http://reviews.llvm.org/D8908#153838, @reames wrote:
>>>>
>>>>> Have the IR level construct patches made it up for review?  If so,
>>>>> can
>>>>
>>>> So far I've posted two patches. These two patches should apply cleanly
>>>> to the tip, working with the present profile infrastructure. The next
>>>> set of patches will be the enabler ones: i.e. three more patches one
>>>> for
>>>> each of clang, llvm and compiler-rt. Clang patch will be up for review
>>>> later today.
>>>>
>>>>> you send me a link?  I managed to miss them.
>>>>
>>>> So far there is this patch and the instrinsic instruction definitions:
>>>> http://reviews.llvm.org/D8877. All patches are necessary for getting
>>>> the
>>>> IC targets and having them displayed by the llvm-profdata.
>>> Ok, I'm really not convinced that the instrumentation code needs to be
>>> or should be an intrinsic.  This seems like something which should be
>>> emitted by the frontend and optimized like any other code.  To say this
>>> a different way, my instrumentation is going to be entirely different
>>> than your instrumentation.
>>>
>>> Having said that, I really don't care about this part of the proposed
>>> changes since they aren't going to impact me at all.  I'm am
>>> specifically not objecting to the changes, just commenting.  :)
>>>>
>>>>> I'm assuming this will be some type of per call site metadata?
>>>>
>>>> We do assign metadata at the indirect call sites. Format looks like as
>>>> follows:
>>>>
>>>> !33 = metadata !{metadata !"indirect_call_targets", i64
>>>> <total_exec_count>, metadata !"target_fn1Ã¢ÂÂ, i64
>>>> <target_fn1_count>,
>>>> metadata !"target_fn2Ã¢ÂÂ, i64 <target_fn2_count>, Ã¢ÂÂ¦.}
>>>>
>>>> Currently, we're recording only the top most called five function
>>>> names
>>>> at each indirect call site. Following the string literal
>>>> Ã¢ÂÂindirect_call_targetsÃ¢ÂÂ are the fields  <total_exec_count>
>>>> i.e. a 64
>>>> bit value for the total number of times the indirect call is executed
>>>> followed by the function names and execution counts of each target.
>>> This was the part I was trying to ask about.  I really want to see
>>> where
>>> you're going with this optimization wise.  My naive guess is that this
>>> is going to be slightly off for what you actually want.
>>>
>>> Assuming you're going for profile guided devirtualization (and thus
>>> inlining), being able to check the type of the receiver (as opposed to
>>> the result of the virtual lookup) might be advantageous.  (Or, to say
>>> it
>>> differently, that's what I'm used to seeing.  Your approach might be
>>> completely reasonable, it's just not what I'm used to seeing.)  Have
>>> you
>>> thought about the tradeoffs here?
>>
>> Not sure if I understood the problem here, however, we're recording both
>> the target address and the addresses/names of the instrumented functions
>> during the execution of the instrumented binary. During profile reading
>> these addresses are used to match the target addresses to corresponding
>> functions. During optimization, target function name should be adequate
>> to
>> retrieve the function type information which then can be used to compare
>> the types of the arguments as well as the return type w/ the call site
>> to
>> ensure the match.
>>
>> To clarify, our interest at this time is to upstream all the
>> infrastructure related changes. The optimizations using the profile data
>> will not be part of our initial set of patches. On the other hand,
>> agreeing on the metadata format at the call sites would be advantageous
>> in
>> terms of supporting multiple profile methods.
>>
>> Thanks,
>> -Betul
>
> I have the same questions and concerns as Philip. If the goal of this is
> to support devirtualization, this is not how I would expect it to be done.
> If you really want something more general, can you provide data to show
> that it is worth the cost? Recording all that information when profiling
> will not be cheap, and it adds significant complexity as well.
>
> I also donât think we should consider taking all of your infrastructure
> changes without first seeing at least preliminary versions of the
> optimization patches that demonstrate the benefits.

We'll be posting the optimization related patches up for review as well.
We're in the process of adapting our changes to the community mainline and
collecting new numbers on gains. The proposed gains were collected off of
our internal mainline which is slightly different than the community's.
We'd appreciate comments to direct the patches in the due direction.

Thanks,
-Betul

>
>>
>>>
>>>>
>>>> Thanks.
>>>> -Betul
>>>>
>>>>
>>>> http://reviews.llvm.org/D8908 <http://reviews.llvm.org/D8908>
>>>>
>>>> EMAIL PREFERENCES
>>>>   http://reviews.llvm.org/settings/panel/emailpreferences/
>>>> <http://reviews.llvm.org/settings/panel/emailpreferences/>
>