<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [x86] hoist complex address generation common subexpressions?"
href="https://llvm.org/bugs/show_bug.cgi?id=25261">25261</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[x86] hoist complex address generation common subexpressions?
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>spatel+llvm@rotateright.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>There's a potential optimization remaining from the test case in <a class="bz_bug_link
bz_status_RESOLVED bz_closed"
title="RESOLVED FIXED - sign extensions cause suboptimal pointer arithmetic?"
href="show_bug.cgi?id=20134">bug 20134</a>:
void foo(int *a, int i) {
a[i] = a[i+1] + a[i+2];
}
After r250560
(<a href="http://llvm.org/viewvc/llvm-project?view=revision&amp;revision=250560">http://llvm.org/viewvc/llvm-project?view=revision&amp;revision=250560</a>),
we generate:
00 movslq %esi, %rax
03 movl 0x4(%rdi,%rax,4), %ecx
07 addl 0x8(%rdi,%rax,4), %ecx
0b movl %ecx, (%rdi,%rax,4)
0e retq
We could hoist the common address subexpression (%rdi,%rax,4) out of the load
and store addresses with an LEA:
00 movslq %esi, %rax
03 leaq (%rdi,%rax,4), %rax
07 movl 0x4(%rax), %ecx
0a addl 0x8(%rax), %ecx
0d movl %ecx, (%rax)
0f retq
Deciding whether this is worthwhile probably requires simulation and
micro-benchmarking across micro-architectures. The LEA version appears to
increase code size by 1 byte (16 bytes vs. 15, going by the offsets above; this
is hand-written asm that I have not verified) in this case, where the common
subexpression is used 3 times. In the general case, the LEA may also require an
additional register, and whether the simpler address calculations actually
save any time may be implementation-dependent.
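
For illustration only, the hoisting can be written out by hand at the C level.
This is a minimal sketch: foo_hoisted and the pointer p are hypothetical names
that are not part of the original test case, and an optimizing compiler is free
to canonicalize both functions to the same machine code, so this shows the
intent of the transform rather than a way to force the LEA.

```c
/* Original test case from this report. */
void foo(int *a, int i) {
    a[i] = a[i + 1] + a[i + 2];
}

/* Hypothetical hand-hoisted variant (assumption, for illustration):
   the shared address a + i is computed once, mirroring the LEA,
   and the three accesses use only small constant offsets from it. */
void foo_hoisted(int *a, int i) {
    int *p = a + i;      /* corresponds to: leaq (%rdi,%rax,4), %rax */
    p[0] = p[1] + p[2];  /* offsets 0x0, 0x4 and 0x8 from the hoisted base */
}
```

Both functions store the same value to a[i]; only the shape of the address
arithmetic differs.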
The solution to this problem may be related to whatever is implemented for <a class="bz_bug_link
bz_status_NEW "
title="NEW --- - [x86] avoid big bad immediates in the instruction stream, part 1: use SIB addressing"
href="show_bug.cgi?id=24447">bug
24447</a>.</pre>
</div>
</p>
</body>
</html>