[LLVMdev] IC profiling infrastructure

Mon Jun 15 21:16:26 PDT 2015

Xinliang David Li <davidxl at google.com> writes:
> On Mon, Jun 15, 2015 at 5:35 PM, Justin Bogner <mail at justinbogner.com> wrote:
>> "Betul Buyukkurt" <betulb at codeaurora.org> writes:
>>>> - We don't need to store the value profiling kind in the data at all.
>>>>   The frontend knows which invocations of the intrinsic are for each kind
>>>>   implicitly, much like it knows the difference between a counter for an
>>>>   "if" and a "for" apart implicitly. Just store one set of profiling data
>>>>   and let the frontend sort it out.
>>>
>>> I think so too. However, value_kind should stay around until the kinds of
>>> expressions whose values are computed are included in the hash
>>> computation. Otherwise, if a value-profiled expression is to be removed
>>> from source, then all the rest of the values would get attached to the MD
>>> fields of wrong types. Currently hash checking verifies if the counter
>>> assigned regions in the source has been changed across profile consumption
>>> runs.
>>
>> This isn't a good reason to keep value_kind - the function hash not
>> reflecting these kinds of changes is a problem regardless of whether or
>> not we have multiple types. We should just add hash computations for
>> these indirect call counters. As long as we only add this when indirect
>> profiling is turned on it won't invalidate old hashes so it's trivial to
>> stay backwards compatible.
>
> I think there are other better reasons why keeping value kind is
> desirable. Different kinds of value profile data needs different
> 'preprocessing' during reading. For instance, indirect call target
> needs to translate target address into function name. For length data,
> we may want to compute average length; for data address, we may want
> to compute the min-alignment for each site. It will be hard to do so
> with out value kind info.

But since the consumer is the frontend, and it knows which counts are
which, it knows the kind, no? I don't understand how storing the kind
info helps or hurts - it's just redundant.

>>
>> The other thing we should do is store which profiling options are
>> enabled, in both formats. When we specify profile-instr-use it'll be
>> less error prone if we can detect whether or not this includes indirect
>> call profiling without checking other options, and it's probably a good
>> idea for "llvm-profdata merge" to disallow merging profiles that are
>> gathering different sets of data. A bitfield seems suitable for this.
>
> For value profiling, allowing profile-gen and profile-use passes using
> different options is a useful feature. Consider the following
> scenarios:
> 1) collect the same profile data once with all kinds of value profile
>    data collected. The exact same profile data can be used in performance
>    experiments with different kinds of value profiling enabled/disabled

This use case is pretty easy to accomodate for with the model where the
profile stores its type. We could autodetect the type to decide how to
interpret the profile, and if more specific profile flags are set simply
ignore some of the data. The thing that the file format storing the type
gives us is that it's harder to misinterpret the data by supplying the
mismatching flags between reading and writing.

> 2) work around profile-use/value transformation bug selectively for
>    some file without the need to change anything in instrumentation pass

I'm not sure I understand what you mean here.

> Besides, with the latest patch, the value_kind is not recorded in each
> profile value thus the overhead is minimized.

The overhead is low, sure, but the code complexity of dealing with the
multiple kinds in this way is quite real. Since I'm not convinced we're
getting real benefit from storing the kind, I don't believe this
trade-off is worth it.

> thanks,
>
> David