[llvm] r291609 - CodeGen: Allow small copyable blocks to "break" the CFG.
Kyle Butt via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 11 11:07:26 PST 2017
I'm not sure I understand. I looked at his test case and I get the same
layout on X86 as I do on amdgcn.
Matt: There are 4 cases:
bb2  bb6  Probability (as guessed  Taken branch count  Taken branch count
          by heuristics)           (original)          (new)
N    N    23.4%                    2                   1 (unconditional)
N    Y    39.1%                    1                   1
Y    N    14.1%                    1                   1
Y    Y    23.4%                    0                   1
By my count, I see the same score for both, and with
https://reviews.llvm.org/D28522, the branch to the exit block will
disappear in your example, making the new layout strictly better.
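
The arithmetic behind "same score" can be sketched as follows. This is only
a rough check, assuming the score is the probability-weighted number of
taken branches over the four cases in the table; it is not the cost model
the block placement pass actually uses, and the names CASES and
expected_taken are made up for illustration.

```python
# Each case: (bb2, bb6, probability in per-mille, taken branches in the
# original layout, taken branches in the new layout).
# Probabilities are the heuristic guesses from the table, kept as integers
# (23.4% -> 234) so the sums are exact.
CASES = [
    ("N", "N", 234, 2, 1),  # the single taken branch in the new layout is unconditional
    ("N", "Y", 391, 1, 1),
    ("Y", "N", 141, 1, 1),
    ("Y", "Y", 234, 0, 1),
]

ORIGINAL, NEW = 3, 4  # tuple indices of the two taken-branch columns


def expected_taken(cases, column):
    """Probability-weighted average of the taken-branch count in `column`."""
    total_weight = sum(case[2] for case in cases)
    weighted_sum = sum(case[2] * case[column] for case in cases)
    return weighted_sum / total_weight


original_score = expected_taken(CASES, ORIGINAL)
new_score = expected_taken(CASES, NEW)
print(f"original: {original_score:.3f}  new: {new_score:.3f}")
```

Under that assumption both layouts average one taken branch per execution:
the original layout trades a branch-free Y/Y path for a two-branch N/N path
of equal probability.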
On Wed, Jan 11, 2017 at 9:36 AM, Xinliang David Li <davidxl at google.com>
wrote:
> Kyle, I looked at Matt's case in more detail. For this target, bb2 is
> selected as the fallthrough of bb in the final layout, so bb4 should not be
> taildup'ed. Can you take a look at what went wrong?
>
> By comparison, on x86, the cloned bb4 is the layout successor of bb which
> is expected.
>
> David
>
> On Wed, Jan 11, 2017 at 12:06 AM, Matt Arsenault <arsenm2 at gmail.com>
> wrote:
>
>>
>> On Jan 10, 2017, at 19:13, Kyle Butt <iteratee at google.com> wrote:
>>
>>>
>>>
>> I looked at the code in question. There are more compare instructions,
>> but no codepath should execute more of them. Which codepath are you
>> concerned about?
>>
>> The extra compare and one of the extra branches come from tail
>> duplication, so for those this is not a regression; it is working as
>> intended (WAI).
>>
>> Are you worried about the code size, or did this actually cause a
>> performance regression?
>> If it did cause a regression, can you tell me which path is the hot path?
>>
>>> -Matt
>>>
>>>
>> Thanks,
>> Kyle.
>>
>>
>> This changes from having a path where no branch occurs, to ensuring that
>> a branch will occur, and branches are expensive. I noticed this from the
>> code size changes, but I’m mostly surprised by replacing a fall through
>> with a branch.
>>
>> Looking at the expected cycle counts on all paths in the artificial
>> testcase, the loads + waits are always skipped, which is good. I think if
>> the waitcnts were inserted smarter, the original code CFG would be slightly
>> better. I need to look more at the full testcase.
>>
>> -Matt
>>
>
>