<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - poor unrolling of range-based for loops"
href="https://llvm.org/bugs/show_bug.cgi?id=30628">30628</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>poor unrolling of range-based for loops
</td>
</tr>
<tr>
<th>Product</th>
<td>clang
</td>
</tr>
<tr>
<th>Version</th>
<td>3.9
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>C++
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedclangbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>matthias.thul@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>dgregor@apple.com, llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>This is not a bug report but a suggestion for an improvement. Consider the
following range-based for loop:
==========
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>

const size_t N = 100000;
const unsigned value = 31415926;

// Fills an array with pseudo-random values from a fixed-seed engine.
template<size_t N>
std::array<unsigned, N> generateData() {
    std::mt19937 randomEngine(0);
    std::array<unsigned, N> data;
    std::generate(data.begin(), data.end(), randomEngine);
    return data;
}

// Linear search written as a range-based for loop (the loop in question).
void testRange() {
    auto const data = generateData<N>();
    bool result = true;
    for (unsigned entry : data) {
        if (entry == value) {
            result = false;
            break;
        }
    }
    assert(result);
}
==========
I compiled it with "-std=c++14 -O3" using Clang 3.9. I was expecting the
range-based for loop to be unrolled, but this didn't happen. Adding "#pragma
unroll 8" before the loop had no effect either.
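
A minimal sketch of the pragma placement, assuming the directive sits directly
above the loop statement. containsValue and demoContainsValue are hypothetical
helpers introduced here for illustration; they are not part of the test code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns true if `needle` occurs in `data`.
template <std::size_t Size>
bool containsValue(const std::array<unsigned, Size>& data, unsigned needle) {
    bool found = false;
    // Clang loop hint: requests an unroll factor of 8. It is only a request,
    // and other compilers may ignore the pragma or warn about it.
#pragma unroll 8
    for (unsigned entry : data) {
        if (entry == needle) {
            found = true;
            break;
        }
    }
    return found;
}

// Small self-check of the helper's behaviour.
inline void demoContainsValue() {
    const std::array<unsigned, 4> sample{{1u, 2u, 3u, 4u}};
    assert(containsValue(sample, 3u));
    assert(!containsValue(sample, 7u));
}
```

Whether the hint takes effect can only be confirmed by inspecting the generated
assembly, for example with "-S".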
I expected the compiler to generate essentially the same code as it does for
the two loops below, both of which are unrolled automatically 8 times without
the pragma directive.
==========
void testManual() {
    auto data = generateData<N>();
    bool result = true;
    for (size_t i = 0; i < N; i++) {
        if (data[i] == value) {
            result = false;
            break;
        }
    }
    assert(result);
}

void testIterator() {
    auto data = generateData<N>();
    bool result = true;
    for (auto itData = data.begin(); itData != data.end(); ++itData) {
        if (*itData == value) {
            result = false;
            break;
        }
    }
    assert(result);
}
==========
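
For reference, C++14 specifies a range-based for loop over a class type with
begin() and end() members as roughly the iterator loop sketched here, which is
why the same code as for testIterator seems a reasonable expectation. The names
rangeRef, it, last, containsDesugared and demoContainsDesugared are
illustrative assumptions; the standard uses exposition-only names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: a hand-written expansion of
//   for (unsigned entry : data) { ... }
template <std::size_t Size>
bool containsDesugared(const std::array<unsigned, Size>& data, unsigned needle) {
    bool found = false;
    {
        auto&& rangeRef = data;      // the range expression is bound once
        auto it = rangeRef.begin();
        auto last = rangeRef.end();  // evaluated once, unlike in testIterator
        for (; it != last; ++it) {
            unsigned entry = *it;    // the loop variable declared by the user
            if (entry == needle) {
                found = true;
                break;
            }
        }
    }
    return found;
}

// Small self-check of the helper's behaviour.
inline void demoContainsDesugared() {
    const std::array<unsigned, 3> sample{{10u, 20u, 30u}};
    assert(containsDesugared(sample, 20u));
    assert(!containsDesugared(sample, 25u));
}
```

The only structural difference from testIterator is that the end iterator is
cached in a local variable instead of being re-evaluated in the loop condition.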
The missing unrolling has a severe effect on performance, as the benchmark
below shows:
Benchmark                   Time         CPU  Iterations
--------------------------------------------------------
benchmarkManual         33175 ns    33135 ns       21015
benchmarkRange          58488 ns    58370 ns       10771
benchmarkIterator       27077 ns    27045 ns       29426
The results were obtained with the Google Benchmark library. Only the loops
were timed; the array generation was excluded.</pre>
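<pre>
To illustrate timing only the loop and not the data generation, here is a rough
sketch that uses std::chrono instead of the benchmark library. It is not the
harness that produced the numbers in this report; averageNanoseconds and
rangeSearch are hypothetical helpers introduced for illustration.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: average wall-clock nanoseconds per call of `fn`.
template <typename Fn>
double averageNanoseconds(Fn fn, std::size_t repetitions) {
    assert(repetitions > 0);
    // Volatile sink so that each result is stored and the calls stay observable.
    volatile bool sink = false;
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < repetitions; ++r) {
        sink = fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(repetitions);
}

// Range-based search over data that was prepared outside the timed region.
template <std::size_t Size>
bool rangeSearch(const std::array<unsigned, Size>& data, unsigned needle) {
    for (unsigned entry : data) {
        if (entry == needle) {
            return true;
        }
    }
    return false;
}
```

A call could look like averageNanoseconds([&data] { return rangeSearch(data,
value); }, 10000) with data generated beforehand. An optimizer may still hoist
a pure search out of the repetition loop, so the result is only a rough
estimate.
</pre>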
</div>
</p>
</body>
</html>