[llvm-bugs] [Bug 28090] New: LLVM generates terrible x86 code for trivial, fully unrolled loops
via llvm-bugs
llvm-bugs at lists.llvm.org
Sat Jun 11 16:49:02 PDT 2016
https://llvm.org/bugs/show_bug.cgi?id=28090
Bug ID: 28090
Summary: LLVM generates terrible x86 code for trivial, fully unrolled loops
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: chandlerc at gmail.com
CC: llvm-bugs at lists.llvm.org
Classification: Unclassified
Consider this code:
----
struct V {
  static constexpr int length = 32;
  unsigned short data[32];
};
int reduce(V &v) {
  int sum = 0;
  for (int i = 0; i < v.length; ++i) {
    sum += static_cast<int>(v.data[i]);
  }
  return sum;
}
----
If the length weren't a constant, LLVM would do a delightful job of vectorizing
the reduction loop. But because the trip count happens to be a constant, we
fully unroll the loop and generate this mess:
----
% ./bin/clang++ -std=c++1z -c -S -o - -O2 -march=haswell x.cpp
.text
.file "x.cpp"
.globl _Z6reduceR1V
.p2align 4, 0x90
.type _Z6reduceR1V,@function
_Z6reduceR1V: # @_Z6reduceR1V
.cfi_startproc
# BB#0: # %entry
movzwl (%rdi), %eax
movzwl 2(%rdi), %ecx
addl %eax, %ecx
movzwl 4(%rdi), %eax
addl %ecx, %eax
....
; repeat OVER AND OVER AGAIN with minor variations in registers...
....
movzwl 60(%rdi), %edx
addl %ecx, %edx
movzwl 62(%rdi), %eax
addl %edx, %eax
retq
----
Ow. This hurts code size as well. =/ I figure we need reduction support in the
SLP vectorizer or some such?
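
For illustration only, here is a hand-written sketch of the shape such a
reduction could take at the source level: independent partial sums followed by
a short pairwise combine, instead of one 32-deep chain of dependent adds. This
is not what LLVM emits today and not a proposal for how the SLP vectorizer
should implement it. The names reduce_lanes and kLanes are made up for this
example, and the choice of 8 lanes is an assumption (eight 32-bit ints fit in
one 256-bit AVX2 register, which -march=haswell provides).

```cpp
#include <cassert>

struct V {
  static constexpr int length = 32;
  unsigned short data[32];
};

// Baseline: the loop from the report, one serial chain of adds.
int reduce(const V &v) {
  int sum = 0;
  for (int i = 0; i < V::length; ++i) {
    sum += static_cast<int>(v.data[i]);
  }
  return sum;
}

// Hypothetical lane count (assumption: 8 x i32 in one 256-bit register).
constexpr int kLanes = 8;

// Hypothetical hand-split version: kLanes independent partial sums,
// then a log2(kLanes)-step pairwise combine.
int reduce_lanes(const V &v) {
  static_assert(V::length % kLanes == 0,
                "length must be a multiple of kLanes");
  int partial[kLanes] = {};

  // Vertical step: each lane accumulates every kLanes-th element,
  // so the adds in different lanes do not depend on each other.
  for (int i = 0; i < V::length; i += kLanes) {
    for (int lane = 0; lane < kLanes; ++lane) {
      partial[lane] += static_cast<int>(v.data[i + lane]);
    }
  }

  // Horizontal step: fold the upper half onto the lower half until
  // a single value remains in partial[0].
  for (int width = kLanes / 2; width > 0; width /= 2) {
    for (int lane = 0; lane < width; ++lane) {
      partial[lane] += partial[lane + width];
    }
  }
  return partial[0];
}
```

Both functions return the same value for any input, since integer addition
here cannot overflow (32 values of at most 65535 each). Whether a given
compiler turns the second form into vector code is a separate question and is
not claimed here.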