<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - sign extensions cause suboptimal pointer arithmetic?"
href="http://llvm.org/bugs/show_bug.cgi?id=20134">20134</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>sign extensions cause suboptimal pointer arithmetic?
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Scalar Optimizations
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>spatel+llvm@rotateright.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>Consider this program:
void foo(int *a, int i) {
a[i] = a[i+1] + a[i+2];
}
----------------------------------
Or as slightly optimized LLVM IR for a 64-bit system:
define void @foo(i32* nocapture %a, i32 %i) #0 {
entry:
%add = add nsw i32 %i, 1
%idxprom = sext i32 %add to i64
%arrayidx = getelementptr inbounds i32* %a, i64 %idxprom
%0 = load i32* %arrayidx, align 4
%add1 = add nsw i32 %i, 2
%idxprom2 = sext i32 %add1 to i64
%arrayidx3 = getelementptr inbounds i32* %a, i64 %idxprom2
%1 = load i32* %arrayidx3, align 4
%add4 = add nsw i32 %1, %0
%idxprom5 = sext i32 %i to i64
%arrayidx6 = getelementptr inbounds i32* %a, i64 %idxprom5
store i32 %add4, i32* %arrayidx6, align 4
ret void
}
-----------------------------------
When compiled for x86-64 with r211521, we get:
_foo:
00 leal 0x1(%rsi), %eax
03 cltq <--- sign extend
05 leal 0x2(%rsi), %ecx
08 movslq %ecx, %rcx <--- sign extend
0b movl (%rdi,%rcx,4), %ecx
0e addl (%rdi,%rax,4), %ecx
11 movslq %esi, %rax <--- sign extend
14 movl %ecx, (%rdi,%rax,4)
17 ret
Is it possible to recognize that 'i' is sign extended separately after each of
several math ops, hoist a single sign extend ahead of those ops, and do the math
in 64-bit? The adds are all marked 'nsw', so sext(i + C) should equal sext(i) + C.
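
At the source level, the requested transform would look roughly like the sketch
below. 'foo_widened' is a hypothetical name used only for illustration, and
'long long' stands in for the 64-bit index type. Signed overflow of 'i + 1' or
'i + 2' is undefined in C, so the two functions should agree wherever the
original is defined.

```c
/* Original form: each index is computed in 32-bit and then widened. */
void foo(int *a, int i) {
    a[i] = a[i + 1] + a[i + 2];
}

/* Hypothetical hand-widened form: sign extend 'i' once, then do all
   index arithmetic in 64-bit. */
void foo_widened(int *a, int i) {
    long long idx = i;                  /* the single sign extension */
    a[idx] = a[idx + 1] + a[idx + 2];   /* constant offsets fold into addressing */
}
```

Calling both on copies of the same array, including with a negative 'i' and a
pointer into the middle of the array, should leave the copies identical.
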
If we could do that, I think we would produce the optimal codegen:
_foo:
00 movslq %esi, %rsi
03 movl 0x4(%rdi,%rsi,4), %eax
07 addl 0x8(%rdi,%rsi,4), %eax
0b movl %eax, (%rdi,%rsi,4)
0e ret
This code is faster and 37.5% smaller (15 bytes vs. 24 bytes), and it is what gcc 4.9
produces at -O1.</pre>
</div>
</p>
</body>
</html>