[LLVMdev] RFC - Improvements to PGO profile support

Xinliang David Li xinliangli at gmail.com
Tue Mar 24 12:50:10 PDT 2015


On Tue, Mar 24, 2015 at 12:45 PM, Chandler Carruth <chandlerc at google.com>
wrote:

>
> On Tue, Mar 24, 2015 at 11:46 AM, Xinliang David Li <xinliangli at gmail.com>
> wrote:
>
>> On Tue, Mar 24, 2015 at 11:29 AM, Chandler Carruth <chandlerc at google.com>
>> wrote:
>>
>>> Sorry I haven't responded earlier, but one point here still doesn't make
>>> sense to me:
>>>
>>> On Tue, Mar 24, 2015 at 10:27 AM, Xinliang David Li <davidxl at google.com>
>>> wrote:
>>>
>>>> Diego and I have discussed this according to the feedback received. We
>>>> have revised plan for this (see Diego's last reply).  Here is a more
>>>> detailed re-cap:
>>>>
>>>> 1) keep MD_prof definition as it is today; also keep using the
>>>> frequency propagation as it is (assuming programs with irreducible
>>>> loops are not common and not important. If it turns out to be
>>>> otherwise, we will revisit this).
>>>> 2) fix all problems that lead to wrong 'frequency/count' computed from
>>>> the frequency propagation algorithm
>>>>    2.1) relax 32bit limit
>>>>
>>>
>>> I still don't understand why this is important or useful.... Maybe I'm
>>> just missing something.
>>>
>>> Given the current meaning of MD_prof, it seems like the result of
>>> limiting this to 32-bits is that the maximum relative ratio of
>>> probabilities between two successors of a basic block with N successors is
>>> (2 billion / N):1 -- what is the circumstance that makes this resolution
>>> insufficient?
>>>
>>> It also doesn't seem *bad* per-se, I just don't see what it improves,
>>> and it does cost memory...
>>>
>>
>> right -- there is some ambiguity here -- it is needed if we were to
>> change MD_prof's definition to represent branch count.  However, with the
>> new plan, the removal of the limit only applies to the function entry count
>> representation planned.
>>
>
> Ah, ok, that makes more sense.
>
> I'm still curious, is the ratio of 2 billion : 1 insufficient between the
> hottest basic block in the inner most loop and the entry block? My
> intuition is that this ratio encapsulates all the information we could
> meaningfully make decisions based upon, and I don't have any examples where
> it falls over, but perhaps you have some examples?
>

The ratio is not the problem. The problem is that we can no longer
effectively differentiate hot functions. 2 billion vs 4 billion will look
the same with the small capping.

David



>
> (Note, the 4096 scaling limit thing is completely separate, that makes
> perfect sense to me.)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150324/2f2cfdf8/attachment.html>


More information about the llvm-dev mailing list