[llvm] r231238 - [MBP] Fix a really horrible bug in MachineBlockPlacement, but behind

Wed Mar 4 13:16:19 PST 2015

On Wed, Mar 4, 2015 at 11:58 AM, Chandler Carruth <chandlerc at gmail.com>
wrote:

> On Wed, Mar 4, 2015 at 10:46 AM, Xinliang David Li <xinliangli at gmail.com>
> wrote:
>
>> I am not sure the new layout strategy will work well consistently.
>>
>
> There is at least one bug in the new strategy that I know of, and
> generally its behind a flag because it isn't as well tested or understood
> yet... Anyways.
>
>
>> 1) if 'if (a1())' and 'if (a2())' branches are equally likely, the new
>> layout does not actually improve icache utilization -- it just trades one
>> for another
>>
>
> ? Before, we would *interleave* the basic blocks for the two unlikely
> branches. If we end up taking one of them, this would force another
> discontinuity.
>
>

The new heuristic is basically another way to enforce topological order
which may not work. 'd1'  or 'd2' in this case can be large and cold, and
it is better to put them far way in hat condition.

> 2) if 'if(a2()' is less likely, i.e. 'b2,c2,d2,f2' are much colder, the
>> new strategy will reduce reuse.
>>
>
> I don't understand this comment? Can you give an example?
>

This is the side effect of the new placement strategy -- basically sequence
'b2,c2, ..' is always laid out before 'b1,c1, ....'. 'b2,c2..' can be
colder, and placing them closer to a1,a2,done can reduce locality.

>
>
>>
>> The key problem is that why 'f1' is not picked as the successor for c1:
>>  presumably it (f1, which is a merge point) is not picked because there
>> is more likely predecessor, which is 'd1'. If 'd1' is more likely, 'd1'
>> should be picked as successor of 'b1' -- picking 'c1' after 'b1' does not
>> make sense.   If 'd1' is unlikely, then 'f1' should follow 'c1', and 'd1'
>> should be outlined and placed far away -- which will improve cache reuse
>> and reduce taken branches.
>>
>
> But what if c1 and d1 are the *same probability*. That is the case here.
> If there were strong profile information, we would lay this out exactly as
> you describe, but in this case there isn't.
>

If it is the same probability, b1,c1,f1,d1 may still be slightly better --
at least one arm has contiguous layout.

>
> When there isn't any specific probability, the code intends to lay out
> diamonds in a cohesive chain: b1, c1, d1, f1 or b1, d1, c1, f1 would both
> be reasonable. What we got before this change was b1, c1, b2, c2, d1, f1,
> d2, f2, which seems really bad.
>

Interleaving itself is not the real problem here, for instance it is ok to
do 'b1,c1,f1,b2,c2,f2,d1,d2' for many cases.

>
> Unlike triangle shaped pattern if (a()) { b(); } c();, for diamond shape
>> control flow, when real profile data is available (or branch annotation or
>> prediction can be trusted), it is almost always better to break topological
>> constraint using branch probability.
>>
>
> We have specific logic to do this when we have strong profile information,
> but that logic isn't what this is changing. That logic may need
> improvements as well, but those should be orthogonal to this change.
>

yes.

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150304/3d4a3fcd/attachment.html>