[llvm] r291609 - CodeGen: Allow small copyable blocks to "break" the CFG.

Xinliang David Li via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 11 11:49:24 PST 2017


As we discussed offline, there are cases that the taildup can hurt make
things worse (especially when there is no profile data). We should probably
revert this patch for now and continue discussion in the original patch.

David

On Wed, Jan 11, 2017 at 11:43 AM, Kyle Butt <iteratee at google.com> wrote:

>
>
> On Wed, Jan 11, 2017 at 11:07 AM, Kyle Butt <iteratee at google.com> wrote:
>
>> I'm not sure I understand. I looked at his test case and I get the same
>> layout on X86 as I do on amdgcn.
>>
>> Matt: There are 4 cases:
>>
>> bb2   bb6  Probability (As guessed by heuristics)   Taken branch count
>> original   Taken branch count new
>>  N     N        23.4%                                             2
>>                   1 (unconditional)
>>  N     Y        39.1%                                             1
>>                   1
>>  Y     N        14.1%                                             1
>>                   1
>>
> ^^^^^^ This line is wrong
>      Y     N        14.1%                                             1
>                       2
>
>>  Y     Y        23.4%                                             0
>>                   1
>>
>> By my count, I see the same score for both, and with
>> https://reviews.llvm.org/D28522, the branch to the exit block will
>> disappear in your example, making the new layout strictly better.
>>
>
> You're right. This is worse.
>
> Sorry, I can't count.
>
>
>>
>> On Wed, Jan 11, 2017 at 9:36 AM, Xinliang David Li <davidxl at google.com>
>> wrote:
>>
>>> Kyle, I looked at Matt's case in more details. For this target, bb2 in
>>> is selected as fallthrough of bb in the final layout, so bb4  should not be
>>> a taildup'ed. Can you take a look what went wrong?
>>>
>>> By comparison, on x86, the cloned bb4 is the layout successor of bb
>>> which is expected.
>>>
>>> David
>>>
>>> On Wed, Jan 11, 2017 at 12:06 AM, Matt Arsenault <arsenm2 at gmail.com>
>>> wrote:
>>>
>>>>
>>>> On Jan 10, 2017, at 19:13, Kyle Butt <iteratee at google.com> wrote:
>>>>
>>>>>
>>>>>
>>>> I looked at the code in question. There are more compare instructions,
>>>> but no codepath should execute more of them. Which codepath are you
>>>> concerned about?
>>>>
>>>> For the compare, and 1 of the branches, it occurs due to tail
>>>> duplication, and so for those, this is not a regression, it is WAI.
>>>>
>>>> Are you worried about the code size, or did this actually cause a
>>>> performance regression?
>>>> If it did cause a regression, can you tell me which path is the hot
>>>> path?
>>>>
>>>>> -Matt
>>>>>
>>>>>
>>>> Thanks,
>>>> Kyle.
>>>>
>>>>
>>>> This changes from having a path where no branch occurs, to ensuring
>>>> that a branch will occur, and branches are expensive. I noticed this from
>>>> the code size changes, but I’m mostly surprised by replacing a fall through
>>>> with a branch.
>>>>
>>>> Looking at the expected cycle counts on all paths in the artificial
>>>> testcase, the loads + waits are always skipped, which is good. I think if
>>>> the waitcnts were inserted smarter, the original code CFG would be slightly
>>>> better. I need to look more at the full testcase.
>>>>
>>>> -Matt
>>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170111/a5a8ef82/attachment.html>


More information about the llvm-commits mailing list