[LLVMbugs] [Bug 6993] New: Extra load inside sparse matrix multiplication kernel

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Fri Apr 30 10:43:59 PDT 2010


http://llvm.org/bugs/show_bug.cgi?id=6993

           Summary: Extra load inside sparse matrix multiplication kernel
           Product: new-bugs
           Version: 2.7
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Keywords: code-quality
          Severity: normal
          Priority: P
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: bearophile at mailas.com
                CC: llvmbugs at cs.uiuc.edu


The SciMark2 benchmark exposes two places where llvm-gcc 2.7 generates
suboptimal asm code (they can be found by inspecting the asm and by comparing
the timings with gcc 4.5).

The less significant of the two is in the "Sparse matmult" part of SciMark2
(the other, more important problem is left to another bug report). The
advantage of this case is that it is compact and easy to spot.

Attached is a reduced version of SciMark2, written in C, that performs only
the Sparse matmult.

Timings, NLOOPS=400_000, seconds:
  gcc 4.5:       5.49
  llvm-gcc 2.7:  5.69

CPU Celeron 2.13 GHz, Windows Vista 32 bit.

In both cases the code was compiled with:
-O3 -s -fomit-frame-pointer -msse3 -march=core2


The difference can also be seen by inspecting the asm generated by the two
compilers:

gcc, inner loop of SparseCompRow_matmult():
L45:
    movl    (%esi,%eax,4), %edx
    fldl    (%edi,%edx,8)
    fmull   (%ebx,%eax,8)
    incl    %eax
    faddp   %st, %st(1)
    cmpl    %eax, %ecx
    jg      L45


llvm-gcc, inner loop of SparseCompRow_matmult():
LBB3_6:
    movl    (%edi), %edx
    movl    52(%esp), %ebp
    addl    $4, %edi
    movsd   (%ebp,%edx,8), %xmm1
    mulsd   (%ebx), %xmm1
    addl    $8, %ebx
    decl    %esi
    addsd   %xmm1, %xmm0
    jne     LBB3_6


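llvm-gcc here has four loads from memory per iteration instead of the
necessary three. The extra one is "movl 52(%esp), %ebp", which reloads a
loop-invariant value from a fixed stack slot on every iteration; it is then
used as the base address of the movsd load. Normally one extra load is not an
important difference, but inside the inner loops of numerical FP kernels even
a few extra asm instructions can be significant.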
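
For readers without the attachment, the loop under discussion can be sketched
in C roughly as follows. This is a minimal sketch modelled on the usual
compressed-row sparse matrix-vector product, not a copy of the attached file:
the function name sparse_matvec and the parameter names (val, row, col, x, y)
are assumptions chosen for illustration. It shows why three loads per
iteration are enough: col[i], x[col[i]] and val[i].

```c
/* Sketch of a compressed-row sparse matrix-vector product, y = A*x.
 * All names here are illustrative assumptions, not taken from the attachment.
 *   val[]: nonzero values, stored row by row
 *   col[]: column index of each nonzero
 *   row[]: row[r] .. row[r+1]-1 are the nonzeros of row r (M+1 entries)
 */
void sparse_matvec(int M, double *y, const double *val,
                   const int *row, const int *col, const double *x)
{
    int r, i;

    for (r = 0; r < M; r++) {
        double sum = 0.0;
        int row_end = row[r + 1];

        /* Inner loop: three memory loads per iteration
         * (col[i], x[col[i]], val[i]); x, val and col do not change here. */
        for (i = row[r]; i < row_end; i++)
            sum += x[col[i]] * val[i];

        y[r] = sum;
    }
}
```

In the gcc listing above the three memory operands map naturally onto these
three accesses. In the llvm-gcc listing the additional load from 52(%esp)
appears to fetch the base address of the indirectly indexed array (x in this
sketch) again on each iteration.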



