[LLVMbugs] [Bug 23384] New: LSR needs to be improved for x86
bugzilla-daemon at llvm.org
Thu Apr 30 16:38:50 PDT 2015
https://llvm.org/bugs/show_bug.cgi?id=23384
Bug ID: 23384
Summary: LSR needs to be improved for x86
Product: libraries
Version: trunk
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: wmi at google.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
Created attachment 14272
--> https://llvm.org/bugs/attachment.cgi?id=14272&action=edit
testcase 1.cc
For the attached testcase, LLVM currently generates three AddRecs, although a
single AddRec would suffice if complex addressing modes were used. Each AddRec
is translated into an add instruction at the end of the loop, so the number of
AddRecs matters for loop performance. It matters more than NumRegs, especially
when register pressure is not very high.
~/workarea/llvm-r234391/build/bin/clang++ -std=c++11 -O2
-fno-omit-frame-pointer -S 1.cc -o 1.s
The kernel loop:
.LBB1_10:                               # %for.body6.i
        movss   -4(%rcx), %xmm2         # xmm2 = mem[0],zero,zero,zero
        movss   (%rcx), %xmm1           # xmm1 = mem[0],zero,zero,zero
        mulss   -4(%rdi), %xmm2
        addss   %xmm0, %xmm2
        mulss   (%rdi), %xmm1
        addss   %xmm2, %xmm1
        addq    $8, %rdi
        addq    $8, %rcx
        addl    $-2, %esi
        movaps  %xmm1, %xmm0
        jne     .LBB1_10
A better version is:
.LBB1_8:                                # %for.body6.i
        movss   (%rdi,%rcx,4), %xmm0    # xmm0 = mem[0],zero,zero,zero
        mulss   (%rsi,%rcx,4), %xmm0
        addss   %xmm1, %xmm0
        movss   4(%rdi,%rcx,4), %xmm1   # xmm1 = mem[0],zero,zero,zero
        mulss   4(%rsi,%rcx,4), %xmm1
        addss   %xmm0, %xmm1
        addq    $2, %rcx
        cmpl    %ecx, %eax
        movaps  %xmm1, %xmm0
        jne     .LBB1_8
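
For readers without the attachment, the difference between the two listings above can be sketched at the source level. This is only an illustration: the attached 1.cc is not reproduced here, the function names dot_three_ivs and dot_one_iv are hypothetical, and the sketch assumes the kernel is a float dot product unrolled by two with an even element count. The first function mirrors the shape of the current output (two bumped pointers plus a down-counter, i.e. three induction variables); the second mirrors the preferred shape (one index, with base+index*scale addressing left to the hardware).

```cpp
#include <cstddef>

// Hypothetical analogue of the current codegen: three values are updated
// every iteration (pa, pb, and the remaining-element counter).
// Assumes n is even.
float dot_three_ivs(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    const float* pa = a + 1;  // loads use offsets -1 and 0, like -4(%rcx) and (%rcx)
    const float* pb = b + 1;
    for (std::size_t remaining = n; remaining != 0; remaining -= 2) {
        sum += pa[-1] * pb[-1];
        sum += pa[0] * pb[0];
        pa += 2;              // corresponds to addq $8, %rcx
        pb += 2;              // corresponds to addq $8, %rdi
    }
    return sum;
}

// Hypothetical analogue of the preferred codegen: a single index is updated
// per iteration and both arrays are addressed as base + index * 4.
// Assumes n is even.
float dot_one_iv(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i != n; i += 2) {
        sum += a[i] * b[i];
        sum += a[i + 1] * b[i + 1];
    }
    return sum;
}
```

Both functions perform the same multiplications and additions in the same order, so they return identical results. Note that the optimizer is free to rewrite either form, so compiling this sketch will not necessarily reproduce the two listings.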