[LLVMbugs] [Bug 16408] New: Missed optimization opportunities in pointer arithmetic code
bugzilla-daemon at llvm.org
Fri Jun 21 11:44:43 PDT 2013
http://llvm.org/bugs/show_bug.cgi?id=16408
Bug ID: 16408
Summary: Missed optimization opportunities in pointer
arithmetic code
Product: libraries
Version: trunk
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: P
Component: Scalar Optimizations
Assignee: unassignedbugs at nondot.org
Reporter: st at quanttec.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
LLVM fails to eliminate superfluous round-trip conversions between pointer pair
and pointer + size representations of array references.
For example, for this C++ code
#include <cstddef>

__attribute__((noinline))
const int* test1(const int* begin, const int* end) {
    return begin + (end - begin);
}

__attribute__((noinline))
size_t test2(const int* array, size_t length) {
    return array + length - array;
}
clang generates the following LLVM IR (at -O2)
define i32* @_Z5test1PKiS0_(i32* %begin, i32* %end) nounwind uwtable readnone noinline ssp {
  %1 = ptrtoint i32* %end to i64
  %2 = ptrtoint i32* %begin to i64
  %3 = sub i64 %1, %2
  %4 = ashr exact i64 %3, 2
  %5 = getelementptr inbounds i32* %begin, i64 %4
  ret i32* %5
}

define i64 @_Z5test2PKim(i32* nocapture %array, i64 %length) nounwind uwtable readnone noinline ssp {
  %.idx = shl nuw i64 %length, 2
  %1 = ashr exact i64 %.idx, 2
  ret i64 %1
}
This gets translated to the following x64 code:
    .globl  __Z5test1PKiS0_
    .align  4, 0x90
__Z5test1PKiS0_:                        ## @_Z5test1PKiS0_
    .cfi_startproc
## BB#0:
    pushq   %rbp
Ltmp2:
    .cfi_def_cfa_offset 16
Ltmp3:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Ltmp4:
    .cfi_def_cfa_register %rbp
    subq    %rdi, %rsi
    andq    $-4, %rsi
    addq    %rdi, %rsi
    movq    %rsi, %rax
    popq    %rbp
    ret
    .cfi_endproc

    .globl  __Z5test2PKim
    .align  4, 0x90
__Z5test2PKim:                          ## @_Z5test2PKim
    .cfi_startproc
## BB#0:
    pushq   %rbp
Ltmp7:
    .cfi_def_cfa_offset 16
Ltmp8:
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
Ltmp9:
    .cfi_def_cfa_register %rbp
    leaq    (,%rsi,4), %rax
    sarq    $2, %rax
    popq    %rbp
    ret
    .cfi_endproc
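For comparison, this is the IR one would hope both functions eventually optimize down to (a hand-written sketch of the ideal result, not something clang/LLVM currently produce):

define i32* @_Z5test1PKiS0_(i32* %begin, i32* %end) nounwind uwtable readnone noinline ssp {
  ; begin + (end - begin) is just end: the division by the element size is
  ; exact, and the gep re-multiplies by the same factor
  ret i32* %end
}

define i64 @_Z5test2PKim(i32* nocapture %array, i64 %length) nounwind uwtable readnone noinline ssp {
  ; array + length - array is just length
  ret i64 %length
}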
In test1 the optimizer fails to exploit that the difference of two pointers
must be a multiple of the pointee type's size, since (at least in C/C++) only
pointers to elements of the same array may be subtracted (see ISO C++ §5.7).
When the size of the pointee type equals its alignment, the optimizer could
also exploit the assumption that both pointer addresses are suitably aligned.
In test2 the optimizer does not exploit that the difference between two
pointers into an array cannot be greater than (max uintptr_t value)/(size of
pointee type).
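For illustration, here is roughly what that missing fact would buy at the IR level (a sketch; the nsw flag below is hypothetical and is not what clang emits for this code). The nuw flag alone only guarantees that a logical shift right would recover %length; for the arithmetic shift to fold away, the optimizer would also need to know that the scaled value cannot reach the sign bit, e.g. because the array occupies less than half of the address space:

  ; what clang emits today: the pair cannot be folded from nuw alone
  %.idx = shl nuw i64 %length, 2
  %1 = ashr exact i64 %.idx, 2

  ; hypothetical: with an additional nsw guarantee (4 * %length still fits in
  ; a signed i64), ashr exact (shl nuw nsw %length, 2), 2 simplifies to %length
  %.idx2 = shl nuw nsw i64 %length, 2
  %2 = ashr exact i64 %.idx2, 2      ; ==> %length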
Benjamin Kramer pointed out in
http://lists.cs.uiuc.edu/pipermail/cfe-dev/2013-June/030159.html
that for the test1 case:
"
The optimizer already knows this. Here's the IR for a pattern like yours:
  %sub.ptr.lhs.cast = ptrtoint i32* %1 to i64
  %sub.ptr.rhs.cast = ptrtoint i32* %0 to i64
  %sub.ptr.sub = sub i64 %sub.ptr.lhs.cast, %sub.ptr.rhs.cast
  %sub.ptr.div = ashr exact i64 %sub.ptr.sub, 2
  %add.ptr = getelementptr inbounds i32* %0, i64 %sub.ptr.div
The "exact" bit on the ashr is a hint that the shift only shifts out zeros
(because the pointers are aligned). The getelementptr gets lowered into adds
later in the SelectionDAG, but the "exact" bit is lost by then. It has to
assume that the value may be unaligned and inserts the andq $-4.
I see two ways to fix this:
1. Teach SelectionDAG about "exact" bits. I don't think this is possible with
our current infrastructure.
2. Teach InstCombine how to fuse ashrs and getelementptr instructions. Not sure
how tricky this is.
"