[LLVMdev] RFC - Improvements to PGO profile support

Philip Reames listmail at philipreames.com
Tue Mar 24 13:54:43 PDT 2015


On 03/24/2015 10:34 AM, Xinliang David Li wrote:
>> I follow what you're trying to say.  :)
>>
>> Also, trusting exact entry counts is going to be somewhat suspect.  These
>> are *highly* susceptible to racy updates, overflow, etc...  Anything which
>> puts too much implicit trust in these numbers is going to be problematic.
> No, we won't be using profile counts in the absolute sense, a.k.a.  if
> (count == 100000).  The counts are used as global hotness relative to
> others. It is also used together with program summary data.
>
>> I have no objection to adding a mechanism for expressing an entry count.  I
>> am still very hesitant about the proposals with regard to redefining the
>> current MD_prof.
> For now, we won't touch MD_prof's definition.
>
>> I'd encourage you to post a patch for the entry count mechanism, but not tie
>> its semantics to exact execution count.  (Something like "the value provided
>> must correctly describe the relative hotness of this routine against others
>> in the program annotated with the same metadata.  It is the relative
>> scaling that is important, not the absolute value.  In particular, the value
>> need not be an exact execution count.")
>>
> Defining entry count in a more flexible way is fine -- as long as the
> implementation can choose to use 'execution count' :)
Absolutely agreed here.  :)
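
To make the "relative, not absolute" reading concrete, here is a minimal
sketch of a hotness check that depends only on ratios. The names
total_count and is_hot, and the idea of a percentage threshold, are
assumptions made for illustration; they are not an existing LLVM API, and
the sum of counts is only a stand-in for "program summary data".

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for program summary data: the sum of all
 * entry counts recorded for the program. */
static uint64_t total_count(const uint64_t *counts, size_t n) {
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += counts[i];
    return sum;
}

/* Hypothetical check: a function is "hot" if it accounts for at least
 * `percent` percent of the total.  Only the ratio count/total matters,
 * so multiplying every count by the same factor changes nothing. */
static int is_hot(uint64_t count, uint64_t total, unsigned percent) {
    if (total == 0)
        return 0;
    return (double)count * 100.0 >= (double)total * (double)percent;
}
```

Under this reading, an implementation that stores exact execution counts
and one that stores the same counts scaled by a constant classify every
function identically, which is the flexibility being asked for.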
>
> thanks,
>
> David
>
>>> Many of the other issues you raise seem like they could also be addressed
>>> without major changes to the existing infrastructure. Let’s try to fix those
>>> first.
>>
>> That's exactly the point of the proposal.  We definitely don't want to make
>> major changes to the infrastructure at first. My thinking is to start
>> working on making MD_prof a real count. One of the things that are happening
>> is that the combination of real profile data plus the frequency propagation
>> that we are currently doing is misleading the analysis.
>>
>> I consider this a major change.  You're trying to redefine a major part of
>> the current system.
>>
>> Multiple people have spoken up and objected to this change (as currently
>> described).  Please start somewhere else.
>>
>>
>> For example (thanks David for the code and data). In the following code:
>>
>> int g;
>> __attribute__((noinline)) void bar() {
>>   g++;
>> }
>>
>> extern int printf(const char*, ...);
>>
>> int main()
>> {
>>    int i, j, k;
>>
>>    g = 0;
>>
>>    // Loop 1.
>>    for (i = 0; i < 100; i++)
>>      for (j = 0; j < 100; j++)
>>         for (k = 0; k < 100; k++)
>>             bar();
>>
>>    printf ("g = %d\n", g);
>>    g = 0;
>>
>>    // Loop 2.
>>    for (i = 0; i < 100; i++)
>>      for (j = 0; j < 10000; j++)
>>          bar();
>>
>>    printf ("g = %d\n", g);
>>    g = 0;
>>
>>
>>    // Loop 3.
>>    for (i = 0; i < 1000000; i++)
>>      bar();
>>
>>    printf ("g = %d\n", g);
>>    g = 0;
>> }
>>
>> When compiled with profile instrumentation, frequency propagation is
>> distorting the real profile because it gives different frequency to the
>> calls to bar() in the 3 different loops. All 3 loops execute 1,000,000
>> times, but after frequency propagation, the first call to bar() gets a
>> weight of 520,202 in loop #1, 210,944 in  loop #2 and 4,096 in loop #3. In
>> reality, every call to bar() should have a weight of 1,000,000.
>>
>> Duncan responded to this.  My conclusion from his response: this is a bug,
>> not a fundamental issue.  Remove the max scaling factor, switch the counts
>> to 64 bits and everything should be fine.  If you disagree, let's discuss.
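
A deliberately simplified sketch of how a cap on the per-loop scale
produces the loop #3 number. The cap value of 4096 is an assumption
inferred from the 4,096 reported above, not read from the LLVM sources,
and capped_scale and body_weight are hypothetical names. The model treats
the weight of the innermost body as the product of the per-loop scales.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed cap, inferred from the 4,096 weight reported for loop #3. */
#define MAX_LOOP_SCALE 4096u

/* Hypothetical per-loop scale: the trip count, optionally clamped. */
static uint64_t capped_scale(uint64_t trips, int use_cap) {
    if (use_cap && trips > MAX_LOOP_SCALE)
        return MAX_LOOP_SCALE;
    return trips;
}

/* Weight of the innermost body: product of the per-loop scales from the
 * outermost loop to the innermost.  64-bit arithmetic keeps products of
 * large trip counts from wrapping as early as 32-bit values would. */
static uint64_t body_weight(const uint64_t *trips, size_t depth, int use_cap) {
    uint64_t weight = 1;
    for (size_t i = 0; i < depth; i++)
        weight *= capped_scale(trips[i], use_cap);
    return weight;
}
```

With trips = {1000000} this gives 4,096 with the cap and 1,000,000
without it. The model does not reproduce the 520,202 and 210,944 figures
for loops #1 and #2; those come from details of the real propagation that
this sketch leaves out.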
>>
>>
>>
>> Thanks.  Diego.
>>
>>




