[LLVMbugs] [Bug 6993] New: Extra load inside sparse matrix multiplication kernel
bugzilla-daemon at llvm.org
Fri Apr 30 10:43:59 PDT 2010
http://llvm.org/bugs/show_bug.cgi?id=6993
Summary: Extra load inside sparse matrix multiplication kernel
Product: new-bugs
Version: 2.7
Platform: PC
OS/Version: Windows XP
Status: NEW
Keywords: code-quality
Severity: normal
Priority: P
Component: new bugs
AssignedTo: unassignedbugs at nondot.org
ReportedBy: bearophile at mailas.com
CC: llvmbugs at cs.uiuc.edu
The SciMark2 benchmark shows two places where llvm-gcc 2.7 generates suboptimal
asm code (they can be found by visually inspecting the asm and by comparing the
timings with gcc 4.5).
The less significant of the two is in the "Sparse matmult" part of
SciMark2 (the other, more important problem is left to a separate bug report).
The advantage of this one is that it is compact and easy to spot.
Attached is a reduced version of SciMark2, in C, that performs
only the Sparse matmult.
Timings, NLOOPS=400_000, seconds:
gcc 4.5: 5.49
llvm-gcc 2.7: 5.69
CPU Celeron 2.13 GHz, Windows Vista 32 bit.
In both cases the code was compiled with:
-O3 -s -fomit-frame-pointer -msse3 -march=core2
The difference can also be seen by inspecting the asm generated by the two
compilers:
gcc, inner loop of SparseCompRow_matmult():
L45:
    movl    (%esi,%eax,4), %edx
    fldl    (%edi,%edx,8)
    fmull   (%ebx,%eax,8)
    incl    %eax
    faddp   %st, %st(1)
    cmpl    %eax, %ecx
    jg      L45
llvm-gcc, inner loop of SparseCompRow_matmult():
LBB3_6:
    movl    (%edi), %edx
    movl    52(%esp), %ebp
    addl    $4, %edi
    movsd   (%ebp,%edx,8), %xmm1
    mulsd   (%ebx), %xmm1
    addl    $8, %ebx
    decl    %esi
    addsd   %xmm1, %xmm0
    jne     LBB3_6
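llvm-gcc here has four loads from memory instead of the necessary three. The
extra one is "movl 52(%esp), %ebp", which reloads a loop-invariant base pointer
from the stack on every iteration.
Normally one extra load is not an important difference, but inside the inner
loops of numerical FP kernels even a few extra asm instructions can be
significant.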
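For readers without the attachment, here is a rough sketch of the kind of loop the listings above come from. It is not the attached file: the function name, parameter names, and the compressed-row layout (val, row, col) are assumptions modeled on SciMark2's sparse kernel. It is only meant to show where the three necessary loads come from: col[i], x[col[i]] and val[i].

```c
#include <assert.h>

/* Hypothetical sketch of a compressed-row sparse matrix-vector product.
   row[r] .. row[r+1]-1 index the nonzeros of row r inside val[] and col[]. */
static void sparse_matmult_sketch(int M, double *y, const double *val,
                                  const int *row, const int *col,
                                  const double *x)
{
    assert(M >= 0);
    for (int r = 0; r < M; r++) {
        double sum = 0.0;
        /* Inner loop: one load of col[i], one of x[col[i]], one of val[i].
           The pointers x, col and val do not change inside the loop. */
        for (int i = row[r]; i < row[r + 1]; i++)
            sum += x[col[i]] * val[i];
        y[r] = sum;
    }
}
```

In a loop of this shape the base pointers are loop-invariant, so a fourth load such as the one from 52(%esp) looks like a base pointer being re-read from its stack slot rather than work required by the source.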