[llvm] r291609 - CodeGen: Allow small copyable blocks to "break" the CFG.

Kyle Butt via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 11 11:07:26 PST 2017


I'm not sure I understand. I looked at his test case and I get the same
layout on X86 as I do on amdgcn.

Matt: There are 4 cases:

bb2   bb6  Probability (As guessed by heuristics)   Taken branch count
original   Taken branch count new
 N     N        23.4%                                             2
                1 (unconditional)
 N     Y        39.1%                                             1
                1
 Y     N        14.1%                                             1
                1
 Y     Y        23.4%                                             0
                1

By my count, I see the same score for both, and with
https://reviews.llvm.org/D28522, the branch to the exit block will
disappear in your example, making the new layout strictly better.

On Wed, Jan 11, 2017 at 9:36 AM, Xinliang David Li <davidxl at google.com>
wrote:

> Kyle, I looked at Matt's case in more details. For this target, bb2 in is
> selected as fallthrough of bb in the final layout, so bb4  should not be a
> taildup'ed. Can you take a look what went wrong?
>
> By comparison, on x86, the cloned bb4 is the layout successor of bb which
> is expected.
>
> David
>
> On Wed, Jan 11, 2017 at 12:06 AM, Matt Arsenault <arsenm2 at gmail.com>
> wrote:
>
>>
>> On Jan 10, 2017, at 19:13, Kyle Butt <iteratee at google.com> wrote:
>>
>>>
>>>
>> I looked at the code in question. There are more compare instructions,
>> but no codepath should execute more of them. Which codepath are you
>> concerned about?
>>
>> For the compare, and 1 of the branches, it occurs due to tail
>> duplication, and so for those, this is not a regression, it is WAI.
>>
>> Are you worried about the code size, or did this actually cause a
>> performance regression?
>> If it did cause a regression, can you tell me which path is the hot path?
>>
>>> -Matt
>>>
>>>
>> Thanks,
>> Kyle.
>>
>>
>> This changes from having a path where no branch occurs, to ensuring that
>> a branch will occur, and branches are expensive. I noticed this from the
>> code size changes, but I’m mostly surprised by replacing a fall through
>> with a branch.
>>
>> Looking at the expected cycle counts on all paths in the artificial
>> testcase, the loads + waits are always skipped, which is good. I think if
>> the waitcnts were inserted smarter, the original code CFG would be slightly
>> better. I need to look more at the full testcase.
>>
>> -Matt
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170111/1b768a25/attachment.html>


More information about the llvm-commits mailing list