[llvm] r291609 - CodeGen: Allow small copyable blocks to "break" the CFG.

Kyle Butt via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 11 11:43:43 PST 2017


On Wed, Jan 11, 2017 at 11:07 AM, Kyle Butt <iteratee at google.com> wrote:

> I'm not sure I understand. I looked at his test case and I get the same
> layout on X86 as I do on amdgcn.
>
> Matt: There are 4 cases:
>
> bb2   bb6  Probability (As guessed by heuristics)   Taken branch count
> original   Taken branch count new
>  N     N        23.4%                                             2
>                   1 (unconditional)
>  N     Y        39.1%                                             1
>                   1
>  Y     N        14.1%                                             1
>                   1
>
^^^^^^ This line is wrong
     Y     N        14.1%                                             1
                    2

>  Y     Y        23.4%                                             0
>                   1
>
> By my count, I see the same score for both, and with
> https://reviews.llvm.org/D28522, the branch to the exit block will
> disappear in your example, making the new layout strictly better.
>

You're right. This is worse.

Sorry, I can't count.


>
> On Wed, Jan 11, 2017 at 9:36 AM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>> Kyle, I looked at Matt's case in more details. For this target, bb2 in is
>> selected as fallthrough of bb in the final layout, so bb4  should not be a
>> taildup'ed. Can you take a look what went wrong?
>>
>> By comparison, on x86, the cloned bb4 is the layout successor of bb which
>> is expected.
>>
>> David
>>
>> On Wed, Jan 11, 2017 at 12:06 AM, Matt Arsenault <arsenm2 at gmail.com>
>> wrote:
>>
>>>
>>> On Jan 10, 2017, at 19:13, Kyle Butt <iteratee at google.com> wrote:
>>>
>>>>
>>>>
>>> I looked at the code in question. There are more compare instructions,
>>> but no codepath should execute more of them. Which codepath are you
>>> concerned about?
>>>
>>> For the compare, and 1 of the branches, it occurs due to tail
>>> duplication, and so for those, this is not a regression, it is WAI.
>>>
>>> Are you worried about the code size, or did this actually cause a
>>> performance regression?
>>> If it did cause a regression, can you tell me which path is the hot
>>> path?
>>>
>>>> -Matt
>>>>
>>>>
>>> Thanks,
>>> Kyle.
>>>
>>>
>>> This changes from having a path where no branch occurs, to ensuring that
>>> a branch will occur, and branches are expensive. I noticed this from the
>>> code size changes, but I’m mostly surprised by replacing a fall through
>>> with a branch.
>>>
>>> Looking at the expected cycle counts on all paths in the artificial
>>> testcase, the loads + waits are always skipped, which is good. I think if
>>> the waitcnts were inserted smarter, the original code CFG would be slightly
>>> better. I need to look more at the full testcase.
>>>
>>> -Matt
>>>
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170111/21476af4/attachment.html>


More information about the llvm-commits mailing list