[PATCH] D27695: Add Instruction number to LSR cost model (PR23384)

Wed Jan 4 16:18:09 PST 2017

On Wed, Jan 4, 2017 at 12:55 PM, Wei Mi <wmi at google.com> wrote:

>
>
> On Wed, Jan 4, 2017 at 11:51 AM, Evgeny Stupachenko via Phabricator <
> reviews at reviews.llvm.org> wrote:
>
>> evstupac added a comment.
>>
>> Quentin,
>>
>>   I've put first part in a separate review:
>> https://reviews.llvm.org/D28307
>>
>> Wei,
>>
>>   Did you have a chance to test the patch performance on your benchmarks?
>>
>>
> Yes, I ran the patch through internal benchmarks. It is flat overall
> except for two regressions. I looked into one and am trying to reduce a
> testcase from it. The other one is probably from the same cause, but I will
> verify.
>
> Thanks,
> Wei.
>
>

The two regressions mentioned above are from the same cause.

I have attached a runnable testcase, foo.cc, extracted from an internal
benchmark. Compiled with -O2, it shows a 1.5% degradation with the patch on my
Sandy Bridge desktop, while the original benchmark shows a 3% degradation
on Sandy Bridge and 5% on an Ivy Bridge machine.

The instruction count for the testcase is actually reduced with the patch
(15 instructions in the inner loop instead of 16), but stalled-cycles-backend
increases significantly because, with the patch, all eight memory accesses in
the loop use a complex (base + scaled index) addressing mode.

Base:
.LBB0_12:                               #   Parent Loop BB0_2 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
        movsd   -24(%rcx), %xmm1        # xmm1 = mem[0],zero
        movsd   -16(%rcx), %xmm2        # xmm2 = mem[0],zero
        mulsd   -24(%rdx), %xmm1
        addsd   %xmm0, %xmm1
        mulsd   -16(%rdx), %xmm2
        addsd   %xmm1, %xmm2
        movsd   -8(%rcx), %xmm1         # xmm1 = mem[0],zero
        mulsd   -8(%rdx), %xmm1
        addsd   %xmm2, %xmm1
        movsd   (%rcx), %xmm0           # xmm0 = mem[0],zero
        mulsd   (%rdx), %xmm0
        addsd   %xmm1, %xmm0
        addq    $32, %rdx
        addq    $32, %rcx
        addq    $-4, %rdi
        jne     .LBB0_12

With the patch:
.LBB0_12:                               #   Parent Loop BB0_2 Depth=1
                                        # =>  This Inner Loop Header: Depth=2
        movsd   -24(%rdi,%rbx,8), %xmm1 # xmm1 = mem[0],zero
        mulsd   -24(%rcx,%rbx,8), %xmm1
        addsd   %xmm0, %xmm1
        movsd   -16(%rdi,%rbx,8), %xmm0 # xmm0 = mem[0],zero
        mulsd   -16(%rcx,%rbx,8), %xmm0
        addsd   %xmm1, %xmm0
        movsd   -8(%rdi,%rbx,8), %xmm1  # xmm1 = mem[0],zero
        mulsd   -8(%rcx,%rbx,8), %xmm1
        addsd   %xmm0, %xmm1
        movsd   (%rdi,%rbx,8), %xmm0    # xmm0 = mem[0],zero
        mulsd   (%rcx,%rbx,8), %xmm0
        addsd   %xmm1, %xmm0
        addq    $4, %rbx
        cmpq    %rbx, %rdx
        jne     .LBB0_12
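
For readers without the attachment, the two listings above correspond roughly
to the two source-level shapes sketched below. This is only an illustration
and not the contents of foo.cc: the function names, the signatures, and the
assumption that the hot loop is a plain dot product of doubles are all
hypothetical, inferred from the movsd/mulsd/addsd pattern. The first form
keeps one pointer per array plus a counter (like "Base"); the second shares a
single index between both arrays (like "With the patch").

```cpp
#include <cstddef>

// Hypothetical stand-in for the hot loop; NOT the attached foo.cc.

// Pointer-bumping form: two pointer induction variables plus a down-counter,
// so each load can use a simple base+displacement address.
double dot_pointer_bump(const double *a, const double *b, std::size_t n) {
  double sum = 0.0;
  for (std::size_t left = n; left != 0; --left) {
    sum += *a * *b;
    ++a;
    ++b;
  }
  return sum;
}

// Shared-index form: a single induction variable serves both arrays, so each
// load is naturally expressed as base + index*8.
double dot_shared_index(const double *a, const double *b, std::size_t n) {
  double sum = 0.0;
  for (std::size_t i = 0; i != n; ++i)
    sum += a[i] * b[i];
  return sum;
}
```

Both functions compute the same value. The optimizer is free to rewrite
either form into the other, so the generated code depends on the LSR cost
model rather than on which form is written in the source.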

Thanks,
Wei.

>> Repository:
>>   rL LLVM
>>
>> https://reviews.llvm.org/D27695
>>
>>
>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: foo.cc
Type: text/x-c++src
Size: 994 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170104/0196b6d6/attachment.cc>