[PATCH] D13308: Value profiling - remaining LLVM changes

Betul Buyukkurt via llvm-commits llvm-commits at lists.llvm.org
Wed Oct 14 12:51:03 PDT 2015


Hi David,

Thanks for your comments. Inline are the replies to your comments.

> On Mon, Oct 5, 2015 at 10:16 AM, Betul Buyukkurt <betulb at codeaurora.org>
wrote:
>> Hi David,
>> Thanks for your comments. I've posted earlier today the CL for the
compiler-rt changes i.e. http://reviews.llvm.org/D9009 .
>> These two (LLVM and compiler-rt) CL's have to be merged together to not
break the profiler. The raw profile file is not backwards compatible.
http://reviews.llvm.org/D9009 implements the format discussed in the
below
>> referenced threads back in early June:
>> http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150525/279181.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150601/279539.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150608/280575.html
>>> davidxl added inline comments.
>>> ================
>>> Comment at: include/llvm/IR/IntrinsicInst.h:377
>>> @@ +376,3 @@
>>> +  /// This represents the llvm.instrprof_value_profile_target intrinsic.
>>> +  class InstrProfValueProfileTargetInst : public IntrinsicInst { + 
public:
>>> ----------------
>>> Suggest dropping 'Target' in the name. Just
InstrProfValueProfileInstr.
>>> Target is strongly tied to indirect call value profiling.
>>> ================
>>> Comment at: include/llvm/IR/IntrinsicInst.h:403
>>> @@ +402,3 @@
>>> +
>>> +    ConstantInt *getIndex() const {
>>> +      return cast<ConstantInt>(const_cast<Value
*>(getArgOperand(4)));
>>> ----------------
>>> Add  a comment: returns value site index.
>>> ================
>>> Comment at: include/llvm/IR/Intrinsics.td:325
>>> @@ +324,3 @@
>>> +// through instrumentation based profiling.
>>> +def int_instrprof_value_profile_target : Intrinsic<[],
>>> +                                                   [llvm_ptr_ty,
llvm_i64_ty,
>>> ----------------
>>> Suggest drop 'target'.

I've dropped all instances of name 'target' from the value profiling
related intrinsics and related code.

>>> ================
>>> Comment at: include/llvm/ProfileData/InstrProfReader.h:172
>>> @@ -159,1 +171,3 @@
>>>    const char *ProfileEnd;
>>> +  const uint32_t *CurrentVDataCounts;
>>> +  const ValueData *CurrentVData;
>>> ----------------
>>> Realistically it is not useful to track more than 10 values per value
site.
>>> For this reason, please change the VDataCount type to be unsigned
char.

This is a pointer to the uint32_t ValueDataCounts array returned by the  
__llvm_profile_gather_value_data function in InstrProfiling.c of
compiler-rt. So changing it to any other type indicates changes to the
runtime.

A call to __llvm_profile_instrument_target at runtime, causes first the  
__llvm_profile_value_node **ValueCounters field of the __llvm_profile_data
checked for null. If it's null, we first create an array of
__llvm_profile_value_node* of size sum of all elements of NumValueSites.
This forms a row of head pointers for our linked lists i.e. one linked
list per value site.

Then the __llvm_profile_instrument_target call, finds the head for the
supplied site index. Walks the linked list for the value site, and if it
can not locate the target value, appends a new node to the linked list.

This means we maintain as many linked lists as there are value sites at
run-time. Once the execution hits at exit, we iterate through the
__llvm_profile_data elements to turn all the value profiling data into two
arrays.

First array is a uint32_t array that shows the number of
target_value/numtaken pairs present per value site. Its size is equivalent
to the sum of all elements of NumValueSites. The second array is the
__llvm_profile_value_data array that is formed by appending all the data
fields of the linked lists together. These two arrays are what gets dumped
into the raw profile data file. Therefore, the value profile data is in
the same order as the __llvm_profile_data in file.

Raw profile reader uses the NumValueSites array to decide how many
elements to read from the value data counts array (i.e. the first array
described in the paragraph before). Using the counts stored in the value
data counts array, the reader decides how many elements to read per value
site from the second array.

>>> ================
>>> Comment at: lib/ProfileData/InstrProfReader.cpp:235
>>> @@ -234,1 +234,3 @@
>>>    auto NamesSize = swap(Header.NamesSize);
>>> +  auto PaddingSize = sizeof(uint64_t) - NamesSize % sizeof(uint64_t);
+  auto ValueDataCountsSize = swap(Header.ValueDataCountsSize);
----------------
>>> Please draw a diagram describing raw data layout here:
>>> Header:
>>>   ....
>>> prf_data_section:
>>>    ....
>>> prf_cnts_section:
>>>    ...
>>> prf_name_section:
>>>     ...
>>> padding1
>>> vdata_counts_section: // # of counts per site
>>>   ...
>>> padding2
>>> vdata_value_section:
>>>    ...
>>> ================
>>> Comment at: lib/ProfileData/InstrProfReader.cpp:333
>>> @@ +332,3 @@
>>> +      InstrProfValueSiteRecord NewVRecord;
>>> +      for (uint32_t VIndex = 0; VIndex < CurrentVDataCount; ++VIndex) {
>>> +        uint64_t TargetValue = swap(CurrentVData->Value);
>>> ----------------
>>> Should CurrentVData be set to Data->Values?
>>> Current implementation seems to assume that VData and VDataCounts are
laid
>>> out in the same order as the prf_data wrt the owning functions. I
don't
>>> think this assumption is right.
>> The value profile data appears as linked lists hanging from the per
function profile data structs at runtime. Therefore, an extra step is
performed to turn the value profile data lists into a consecutive array
form. We do that by iterating over the per function profile data
structs.
>> This as an artifact caused the value profile data and the count arrays to
>> be laid out in the same order with the per function profile data.
> I think it is better to not depend on this implementation detail at
runtime.  The current implementation also depends on the fact that raw
profile data is accessed sequentially.  Hidden assumptions like this are
not great. It also make the access of value profile data for a given
function is different from edge profile counters.

The __llvm_profile_data array in the raw profile file can only be accessed
sequentially. However, I do agree that the alignment of value profile data
to _llvm_profile_data is an implementation artifact and is not uniform wrt
how the name/counter arrays are being access currently.

> The problem can be easily solved: there are two fields in ProfileData
structure that can be used for this purpose:
> 1) the functionPointer field which is currently not used can be used to
store the PerSiteCountsPtr for that function

Function pointer is currently used to find out the name of the target
function by comparing it against the target values returned from profiling
indirect call sites. We may avoid having the function pointer iff
llvm-profdata is taught to accept the instrumented binary i.e. used to
generate the profile data.

> 2) the Values field can be used to store ValuePtr for the function In
the header, add PerSiteCountsDelta and ValueDelta field.

Sure, currently the Values field is not processed by the reader. It may be
used to store the ValuePtr, however unless llvm-profdata is taught to read
in the instrumented binary, we need an extra field in the
__llvm_profile_data struct for the PerSiteCountsPtr.

Thanks,
-Betul

> With those,  Value data for a given function can be obtained similarly
to getCounter
> David
>> Raw profile file format is as follows:
>> Header: Added in two uint64_t elements for value data counts array size
and value data array size
>> Data array : Array may end at a uint32_t boundary when generated on
uint32_t targets. I'll have to fix the current CL to maintain data
sections ending at uint64_t boundaries. Current CL does not have a
padding
>> after the data array.
>> Counters array : same as before
>> Name array : same as before
>> Padding to end the name array at uint64_t boundary : same as before
Value data counts array: Array of uint32_t values, each value indicates
how many uint64_t data pairs {TargetValue, NumTaken} to read from the
value data array per value site.
>> Padding to end the value data counts array at uint64_t boundary Value
data array: Array of uint64_t
>> -Betul
>>> http://reviews.llvm.org/D13308






More information about the llvm-commits mailing list