[LLVMdev] RFC - Improvements to PGO profile support
Xinliang David Li
davidxl at google.com
Tue Mar 24 10:34:20 PDT 2015
> I follow what you're trying to say. :)
>
> Also, trusting exact entry counts is going to be somewhat suspect. These
> are *highly* susceptible to racy updates, overflow, etc... Anything which
> puts too much implicit trust in these numbers is going to be problematic.
No, we won't be using profile counts in the absolute sense, e.g. if
(count == 100000). The counts are used as a measure of global hotness
relative to other functions, together with program summary data.
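As a rough illustration of "relative, not absolute", here is a minimal sketch. Everything in it is hypothetical: `program_summary`, `is_hot` and the percentage threshold are names made up for this example and are not LLVM APIs. It only shows that a hotness query based on the ratio of an entry count to a program-wide total gives the same answer when all counts are scaled uniformly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical whole-program summary; not an LLVM data structure. */
struct program_summary {
    uint64_t total_entry_count; /* sum of entry counts over all functions */
};

/*
 * Returns 1 if the function accounts for at least `percent` percent of
 * all function entries in the profile, 0 otherwise. Only the ratio
 * entry_count / total_entry_count matters, never the absolute value.
 */
int is_hot(uint64_t entry_count, const struct program_summary *summary,
           unsigned percent)
{
    assert(percent <= 100);
    if (summary->total_entry_count == 0)
        return 0; /* empty profile: nothing is hot */

    /* Cross-multiply in floating point to avoid 64-bit overflow. */
    return (double)entry_count * 100.0 >=
           (double)summary->total_entry_count * (double)percent;
}
```

With a total of 1000 entries, a function entered 500 times is hot at a 10% threshold; scaling both numbers by 1000 does not change the answer.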
>
> I have no objection to adding a mechanism for expressing an entry count. I
> am still very hesitant about the proposals with regards to redefining the
> current MD_prof.
For now, we won't touch MD_prof's definition.
>
> I'd encourage you to post a patch for the entry count mechanism, but not tie
> its semantics to exact execution count. (Something like "the value provided
> must correctly describe the relative hotness of this routine against others
> in the program annotated with the same metadata. It is the relative
> scaling that is important, not the absolute value. In particular, the value
> need not be an exact execution count.")
>
Defining entry count in a more flexible way is fine -- as long as the
implementation can choose to use 'execution count' :)
thanks,
David
>
>>
>> Many of the other issues you raise seem like they could also be addressed
>> without major changes to the existing infrastructure. Let’s try to fix those
>> first.
>
>
> That's exactly the point of the proposal. We definitely don't want to make
> major changes to the infrastructure at first. My thinking is to start
> working on making MD_prof a real count. One of the things that are happening
> is that the combination of real profile data plus the frequency propagation
> that we are currently doing is misleading the analysis.
>
> I consider this a major change. You're trying to redefine a major part of
> the current system.
>
> Multiple people have spoken up and objected to this change (as currently
> described). Please start somewhere else.
>
>
> For example (thanks David for the code and data). In the following code:
>
> int g;
> __attribute__((noinline)) void bar() {
>   g++;
> }
>
> extern int printf(const char*, ...);
>
> int main()
> {
>   int i, j, k;
>
>   g = 0;
>
>   // Loop 1.
>   for (i = 0; i < 100; i++)
>     for (j = 0; j < 100; j++)
>       for (k = 0; k < 100; k++)
>         bar();
>
>   printf ("g = %d\n", g);
>   g = 0;
>
>   // Loop 2.
>   for (i = 0; i < 100; i++)
>     for (j = 0; j < 10000; j++)
>       bar();
>
>   printf ("g = %d\n", g);
>   g = 0;
>
>   // Loop 3.
>   for (i = 0; i < 1000000; i++)
>     bar();
>
>   printf ("g = %d\n", g);
>   g = 0;
> }
>
> When compiled with profile instrumentation, frequency propagation
> distorts the real profile because it assigns a different frequency to
> the call to bar() in each of the 3 loops. Each loop calls bar()
> 1,000,000 times, but after frequency propagation the call to bar() gets
> a weight of 520,202 in loop #1, 210,944 in loop #2 and 4,096 in loop #3.
> In reality, every call to bar() should have a weight of 1,000,000.
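To make the expected figure concrete, here is a minimal sketch that counts executions per call site by hand. It is a variant of the example above, not the compiler's instrumentation: the `site` parameter, the `site_count` array and `run_loops` are introduced only for illustration. The counters are 64-bit so they cannot wrap at these magnitudes.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_SITES = 3 };

/* One 64-bit counter per call site of bar(); hypothetical, hand-written. */
uint64_t site_count[NUM_SITES];
int g;

/* Same body as the original bar(), plus a per-call-site counter update. */
void bar(int site)
{
    assert(site >= 0 && site < NUM_SITES);
    site_count[site]++;
    g++;
}

/* Runs the three loop nests from the example, one call site per nest. */
void run_loops(void)
{
    int i, j, k;

    g = 0;
    for (i = 0; i < NUM_SITES; i++)
        site_count[i] = 0;

    /* Loop 1: 100 * 100 * 100 iterations. */
    for (i = 0; i < 100; i++)
        for (j = 0; j < 100; j++)
            for (k = 0; k < 100; k++)
                bar(0);

    /* Loop 2: 100 * 10000 iterations. */
    for (i = 0; i < 100; i++)
        for (j = 0; j < 10000; j++)
            bar(1);

    /* Loop 3: 1000000 iterations. */
    for (i = 0; i < 1000000; i++)
        bar(2);
}
```

After `run_loops()` returns, each entry of `site_count` holds 1,000,000, which is the weight every call to bar() should carry regardless of how the iterations are split across nesting levels.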
>
> Duncan responded to this. My conclusion from his response: this is a bug,
> not a fundamental issue. Remove the max scaling factor, switch the counts
> to 64 bits and everything should be fine. If you disagree, let's discuss.
>
>
>
> Thanks. Diego.