[llvm-dev] [RFC] New pass: LoopExitValues
James Molloy via llvm-dev
llvm-dev at lists.llvm.org
Wed Sep 2 05:36:24 PDT 2015
Hi,
Coremark really isn't a good enough test - have you run the LLVM test suite
with this patch, and what were the performance differences?
I'm still a bit confused about exactly which pattern this pass is supposed
to trigger on. I understand the mechanics, but I can't yet see which code
patterns it would help. You've mentioned matrix multiply - how does
this pass alter the IR? What value is it avoiding being recomputed? How
does this pass affect register pressure?
Also, your example just removes a mov and an add - the push/pops are just
register allocation (unless your pass in fact *reduces* register pressure?)
A bit more clarification would be great.
Cheers,
James
On Tue, 1 Sep 2015 at 19:07 Steve King via llvm-dev <llvm-dev at lists.llvm.org>
wrote:
> On Mon, Aug 31, 2015 at 5:52 PM, Jake VanAdrighem
> <jvanadrighem at gmail.com> wrote:
> > Do you have some specific performance measurements?
>
> Averaging 4 runs of 10000 iterations each of Coremark on my X86_64
> desktop showed:
>
> -O2 performance: +2.9% faster with the L.E.V. pass
> -Os size: 1.5% smaller with the L.E.V. pass
>
> In the case of Coremark, the benefit comes mainly from the matrix
> portion of the benchmark, which uses nested loops. Similarly, I used a
> matrix multiplication for the regression test as shown below. The
> L.E.V. pass eliminated 4 instructions.
>
> void matrix_mul(unsigned int Size, unsigned int *Dst,
>                 unsigned int *Src, unsigned int Val) {
>   for (int Outer = 0; Outer < Size; ++Outer)
>     for (int Inner = 0; Inner < Size; ++Inner)
>       Dst[Outer * Size + Inner] = Src[Outer * Size + Inner] * Val;
> }
>
>
> With LoopExitValues
> -------------------------------
> matrix_mul:
> testl %edi, %edi
> je .LBB0_5
> xorl %r9d, %r9d
> xorl %r8d, %r8d
> .LBB0_2:
> xorl %r11d, %r11d
> .LBB0_3:
> movl %r9d, %r10d
> movl (%rdx,%r10,4), %eax
> imull %ecx, %eax
> movl %eax, (%rsi,%r10,4)
> incl %r11d
> incl %r9d
> cmpl %r11d, %edi
> jne .LBB0_3
> incl %r8d
> cmpl %edi, %r8d
> jne .LBB0_2
> .LBB0_5:
> retq
>
>
>
> Without LoopExitValues:
> -----------------------------------
> matrix_mul:
> pushq %rbx # Eliminated by L.E.V. pass
> .Ltmp0:
> .Ltmp1:
> testl %edi, %edi
> je .LBB0_5
> xorl %r8d, %r8d
> xorl %r9d, %r9d
> .LBB0_2:
> xorl %r10d, %r10d
> movl %r8d, %eax # Eliminated by L.E.V. pass
> .LBB0_3:
> movl %eax, %r11d
> movl (%rdx,%r11,4), %ebx
> imull %ecx, %ebx
> movl %ebx, (%rsi,%r11,4)
> incl %r10d
> incl %eax
> cmpl %r10d, %edi
> jne .LBB0_3
> incl %r9d
> addl %edi, %r8d # Eliminated by L.E.V. pass
> cmpl %edi, %r9d
> jne .LBB0_2
> .LBB0_5:
> popq %rbx # Eliminated by L.E.V. pass
> retq
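
For readers comparing the two listings, the following is a source-level
sketch of what each one appears to compute. It is one reading of the
assembly above, not output from the pass or a description of how the pass
is implemented. The function names, the RowBase and Index variables, and
the outputs_match helper are hypothetical and exist only for illustration.
In the "without" listing, the row base (Outer * Size) seems to live in its
own register and is advanced by "addl %edi, %r8d" after each inner loop. In
the "with" listing, a single running index is never reset, so its value at
inner-loop exit already equals the next row base.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of the "Without LoopExitValues" listing:
 * the row base is tracked separately and recomputed per outer iteration. */
void matrix_mul_recomputed(unsigned Size, unsigned *Dst,
                           const unsigned *Src, unsigned Val) {
  unsigned RowBase = 0;              /* plays the role of %r8d */
  for (unsigned Outer = 0; Outer < Size; ++Outer) {
    unsigned Index = RowBase;        /* movl %r8d, %eax */
    for (unsigned Inner = 0; Inner < Size; ++Inner) {
      Dst[Index] = Src[Index] * Val;
      ++Index;
    }
    RowBase += Size;                 /* addl %edi, %r8d */
  }
}

/* Hypothetical model of the "With LoopExitValues" listing:
 * the inner loop's exit value of Index is reused as the next row base. */
void matrix_mul_exit_value(unsigned Size, unsigned *Dst,
                           const unsigned *Src, unsigned Val) {
  unsigned Index = 0;                /* plays the role of %r9d */
  for (unsigned Outer = 0; Outer < Size; ++Outer) {
    for (unsigned Inner = 0; Inner < Size; ++Inner) {
      Dst[Index] = Src[Index] * Val;
      ++Index;
    }
    /* Here Index == (Outer + 1) * Size, so no separate add is needed. */
  }
}

enum { MAX_SIZE = 8 };

/* Illustrative helper: runs both variants on the same input and
 * returns 1 if they produce identical output, 0 otherwise. */
int outputs_match(unsigned Size, unsigned Val) {
  unsigned Src[MAX_SIZE * MAX_SIZE];
  unsigned A[MAX_SIZE * MAX_SIZE];
  unsigned B[MAX_SIZE * MAX_SIZE];
  assert(Size <= MAX_SIZE);
  for (unsigned I = 0; I < MAX_SIZE * MAX_SIZE; ++I)
    Src[I] = I + 1;
  memset(A, 0, sizeof A);
  memset(B, 0, sizeof B);
  matrix_mul_recomputed(Size, A, Src, Val);
  matrix_mul_exit_value(Size, B, Src, Val);
  return memcmp(A, B, sizeof A) == 0;
}
```

If this reading is right, the second form has one fewer value live across
the inner loop, which may be why the pushq/popq of %rbx also disappear.
Whether that holds in general is exactly the register-pressure question
raised in the reply.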