[llvm] r200219 - [vectorize] Initial version of respecting PGO in the vectorizer: treat

Mon Jan 27 14:00:23 PST 2014

On Mon, Jan 27, 2014 at 1:48 PM, Chandler Carruth <chandlerc at gmail.com> wrote:

> On Mon, Jan 27, 2014 at 1:41 PM, Chandler Carruth <chandlerc at gmail.com> wrote:
>
>> ---- Block Freqs ----
>>>  entry = 1.0
>>>   entry -> if.else = 0.375
>>>   entry -> if.then = 0.625
>>>  if.then = 0.625
>>>   if.then -> if.end22 = 0.625
>>>  if.else = 0.375
>>>   if.else -> for.cond.preheader = 0.1406
>>>   if.else -> if.end22 = 0.23437
>>>  for.cond.preheader = 0.1406
>>>   for.cond.preheader -> for.body.lr.ph = 0.08789
>>>   for.cond.preheader -> for.end = 0.05273
>>>  for.body.lr.ph = 0.08789                   ### Preheader in question
>>>   for.body.lr.ph -> for.body = 0.08789
>>>  for.body = 2.8125                          ### Loop in question
>>>
>>
>> Oh goodness. These static frequencies don't really make any sense at all.
>> But they're also not wrong at all. ARRRRG!
>>
>
> Oh my, it's worse than that. I've not thought about the block frequencies
> this way before, but it appears that with the current model, inlining a
> function call guarded by a branch has the perplexing property of making any
> blocks within the function colder relative to their function's entry block.
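
The numbers in the quoted listing can be reproduced by multiplying branch probabilities along the path from the entry block. What follows is a minimal Python sketch, not LLVM code. The 0.625/0.375 split and the loop scale of 32 are inferred from the listing itself (2.8125 / 0.08789 = 32), and the function name is made up for illustration.

```python
# Values inferred from the quoted listing, not taken from LLVM.
LIKELY = 0.625      # probability of the more likely successor
UNLIKELY = 0.375    # probability of the less likely successor
LOOP_SCALE = 32.0   # for.body / for.body.lr.ph = 2.8125 / 0.087890625


def static_block_freqs():
    """Propagate frequency along entry -> if.else -> preheader -> loop body."""
    freqs = {"entry": 1.0}
    freqs["if.then"] = freqs["entry"] * LIKELY
    freqs["if.else"] = freqs["entry"] * UNLIKELY
    freqs["for.cond.preheader"] = freqs["if.else"] * UNLIKELY
    freqs["for.body.lr.ph"] = freqs["for.cond.preheader"] * LIKELY
    freqs["for.body"] = freqs["for.body.lr.ph"] * LOOP_SCALE
    return freqs


if __name__ == "__main__":
    for block, freq in static_block_freqs().items():
        print(f"{block} = {freq:.5f}")
```

Each conditional branch on the path multiplies in a factor below 1, so the preheader lands at under 9% of the entry frequency with no profile data at all. Wrapping the same code in one more guarding branch, as inlining behind a condition does, would scale it down again.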

OK, I have a better mental model for this. I'll send something to the dev
list for general discussion, and change this to be something much much more
conservative in two ways:

1) Only apply to nested loops
2) Require a more significant bias than 20%, maybe the same bias produced
by __builtin_expect.
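
A rough sketch of what the two restrictions could look like, in Python rather than the vectorizer's C++. It assumes the check compares the loop preheader's frequency against a fraction of the entry block's frequency, which is one reading of the 20% figure and not something this message spells out. The function name, its parameters, and the 0.05 value are hypothetical. In particular, 0.05 is only a placeholder for "stricter than 20%" and is not the bias __builtin_expect actually produces.

```python
def is_cold_loop(preheader_freq, entry_freq, cold_fraction, is_nested):
    """Hypothetical coldness check with both proposed restrictions.

    A loop counts as cold only if it is nested inside another loop and
    its preheader runs less often than cold_fraction of the entry block.
    """
    if not is_nested:
        # Restriction 1: never classify a top-level loop as cold.
        return False
    # Restriction 2: the caller chooses how strong the bias must be.
    return preheader_freq < entry_freq * cold_fraction


if __name__ == "__main__":
    preheader, entry = 0.08789, 1.0  # static frequencies from the listing

    # A 20% threshold fires on purely static estimates.
    print(is_cold_loop(preheader, entry, 0.20, is_nested=True))
    # A stricter, illustrative threshold does not.
    print(is_cold_loop(preheader, entry, 0.05, is_nested=True))
    # Top-level loops are exempt regardless of threshold.
    print(is_cold_loop(preheader, entry, 0.20, is_nested=False))
```

With the static frequencies from the listing, the 20% threshold alone marks the loop cold, while either restriction on its own is enough to leave it untouched.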

Does that make sense as a short-term solution? I can also hide this behind
a flag to experiment, as this heuristic is firing *way* more often than I
ever expected it to.