<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - LSR needs to be improved for x86"
href="https://llvm.org/bugs/show_bug.cgi?id=23384">23384</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>LSR needs to be improved for x86
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>wmi@google.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Created <span class=""><a href="attachment.cgi?id=14272" name="attach_14272" title="testcase 1.cc">attachment 14272</a></span>
testcase 1.cc
For the attached testcase, LSR currently generates three AddRecs, but one
AddRec is enough if complex addressing modes are used. Each AddRec is
translated to an add instruction at the end of the loop, so the number of
AddRecs matters for loop performance. It matters more than NumRegs,
especially when register pressure is not very high.
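
As a source-level illustration of the difference, here is a minimal sketch.
It is hypothetical and is not the attached 1.cc; the function names and the
even-n assumption are mine. The first function carries three induction
variables (two pointers and a counter), like the three adds in the first
kernel quoted in this report. The second carries a single index that every
access reuses as base + index*4, like the single add in the second kernel.
A compiler may of course lower either form to either kernel.

```cpp
// Hypothetical illustration only; this is not the attached 1.cc.
// Both functions assume n is even and non-negative, and that a and b
// each point to at least n floats.

// Three induction variables: two bumped pointers plus a down-counter.
// Each one needs its own add at the end of every iteration.
float dot_three_ivs(const float* a, const float* b, int n) {
  float sum = 0.0f;
  const float* pa = a;
  const float* pb = b;
  for (int left = n; left != 0; left -= 2) {
    sum += pa[0] * pb[0];
    sum += pa[1] * pb[1];
    pa += 2;  // add no. 1
    pb += 2;  // add no. 2
  }           // add no. 3 is the counter update, left -= 2
  return sum;
}

// One induction variable: a single index shared by every access.
// On x86 each a[i] or b[i + 1] can be a base + index*4 + offset operand.
float dot_one_iv(const float* a, const float* b, int n) {
  float sum = 0.0f;
  for (long i = 0; i != n; i += 2) {
    sum += a[i] * b[i];
    sum += a[i + 1] * b[i + 1];
  }
  return sum;
}
```

Both functions compute the same dot product; only the number of values
updated per iteration differs.
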
~/workarea/llvm-r234391/build/bin/clang++ -std=c++11 -O2 -fno-omit-frame-pointer -S 1.cc -o 1.s
The kernel loop:
.LBB1_10: # %for.body6.i
movss -4(%rcx), %xmm2 # xmm2 = mem[0],zero,zero,zero
movss (%rcx), %xmm1 # xmm1 = mem[0],zero,zero,zero
mulss -4(%rdi), %xmm2
addss %xmm0, %xmm2
mulss (%rdi), %xmm1
addss %xmm2, %xmm1
addq $8, %rdi
addq $8, %rcx
addl $-2, %esi
movaps %xmm1, %xmm0
jne .LBB1_10
A better version is:
.LBB1_8: # %for.body6.i
movss (%rdi,%rcx,4), %xmm0 # xmm0 = mem[0],zero,zero,zero
mulss (%rsi,%rcx,4), %xmm0
addss %xmm1, %xmm0
movss 4(%rdi,%rcx,4), %xmm1 # xmm1 = mem[0],zero,zero,zero
mulss 4(%rsi,%rcx,4), %xmm1
addss %xmm0, %xmm1
addq $2, %rcx
cmpl %ecx, %eax
movaps %xmm1, %xmm0
jne .LBB1_8</pre>
</div>
</p>
</body>
</html>