<html>
<head>
<base href="http://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - Missed optimization opportunities in pointer arithmetic code"
href="http://llvm.org/bugs/show_bug.cgi?id=16408">16408</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Missed optimization opportunities in pointer arithmetic code
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>All
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Scalar Optimizations
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>st@quanttec.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvmbugs@cs.uiuc.edu
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>LLVM fails to eliminate superfluous round-trip conversions between pointer pair
and pointer + size representations of array references.
For example, clang generates for this C++ code
__attribute__((noinline))
const int* test1(const int* begin, const int* end) {
return begin + (end - begin);
}
__attribute__((noinline))
size_t test2(const int* array, size_t length) {
return array + length - array;
}
the following LLVM code (at -O2)
define i32* @_Z5test1PKiS0_(i32* %begin, i32* %end) nounwind uwtable readnone
noinline ssp {
%1 = ptrtoint i32* %end to i64
%2 = ptrtoint i32* %begin to i64
%3 = sub i64 %1, %2
%4 = ashr exact i64 %3, 2
%5 = getelementptr inbounds i32* %begin, i64 %4
ret i32* %5
}
define i64 @_Z5test2PKim(i32* nocapture %array, i64 %length) nounwind uwtable
readnone noinline ssp {
%.idx = shl nuw i64 %length, 2
%1 = ashr exact i64 %.idx, 2
ret i64 %1
}
This gets translated to the following x64 code:
.globl __Z5test1PKiS0_
.align 4, 0x90
__Z5test1PKiS0_: ## @_Z5test1PKiS0_
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp2:
.cfi_def_cfa_offset 16
Ltmp3:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp4:
.cfi_def_cfa_register %rbp
subq %rdi, %rsi
andq $-4, %rsi
addq %rdi, %rsi
movq %rsi, %rax
popq %rbp
ret
.cfi_endproc
.globl __Z5test2PKim
.align 4, 0x90
__Z5test2PKim: ## @_Z5test2PKim
.cfi_startproc
## BB#0:
pushq %rbp
Ltmp7:
.cfi_def_cfa_offset 16
Ltmp8:
.cfi_offset %rbp, -16
movq %rsp, %rbp
Ltmp9:
.cfi_def_cfa_register %rbp
leaq (,%rsi,4), %rax
sarq $2, %rax
popq %rbp
ret
.cfi_endproc
The optimizer fails to exploit in test1 that the difference of two pointers
must be a multiple of the pointee type size, since (at least in in C/C++) only
pointers to elements of the same array may be subtracted (see ISO C++ ยง5.7).
When the size of the pointee type equals its alignment, the optimizer could
also exploit an assumption that the pointer addresses must be aligned.
In test2 the optimizer does not exploit that the difference between two
pointers into an array can not be greater than (max uintptr_t value)/(size of
pointee type).
Benjamin Kramer pointed out in
<a href="http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-June/030159.html">http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-June/030159.html</a>
that for the test1 case:
"
The optimizer already knows this. Here's the IR for a pattern like yours:
%sub.ptr.lhs.cast = ptrtoint i32* %1 to i64
%sub.ptr.rhs.cast = ptrtoint i32* %0 to i64
%sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
%sub.ptr.div = ashr exact i64 %sub.ptr.sub, 2
%add.ptr = getelementptr inbounds i32* %0, i64 %sub.ptr.div
The "exact" bit on the ashr is a hint that the shift only shifts out zeros
(because the pointers are aligned). The getelementptr gets lowered into adds
later in the SelectionDAG, but the "exact" bit is lost by then. It has to
assume that the value may be unaligned and inserts the andq $-4.
I see two ways to fix this:
1. Teach SelectionDAG about "exact" bits. I don't think this is possible with
our current infrastructure.
2. Teach InstCombine how to fuse ashrs and getelementptr instructions. Not sure
how tricky this is.
"</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>