<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Stack unnecessarily used when unrolling loops on statically-sized local arrays"
href="https://bugs.llvm.org/show_bug.cgi?id=41403">41403</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Stack unnecessarily used when unrolling loops on statically-sized local arrays
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Register Allocator
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>vgatherps@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org, quentin.colombet@gmail.com
</td>
</tr></table>
<p>
<div>
<pre>Hi,
The following code results in unnecessary stack spills when unrolling the outer
loop:
#define LEN 2
// determines to what extent iteration happens in the inner loop
// (do first operation horizontally, then second horizontally)
// this unrolling of the inner loop results in bad codegen
//
// as opposed to the outer loop
// (do operation 1 on first index, then 2, etc, then go to index 2)
#define INNER_LEN 2
static_assert(LEN % INNER_LEN == 0, "length not divisible by inner width");
void test_loop_width(int *in) {
int inner1[LEN];
for (int j = 0; j < (LEN / INNER_LEN); j++) {
for (int i = 0; i < INNER_LEN; i++) {
inner1[i + j*INNER_LEN] = in[i + j*INNER_LEN];
}
for (int i = 0; i < INNER_LEN; i++) {
in[i + j*INNER_LEN] *= inner1[i + j*INNER_LEN];
}
}
}
Results in the following code getting generated:
test_loop_width(int*):
movq (%rdi), %rax
movq %rax, -8(%rsp)
imull (%rdi), %eax
movl %eax, (%rdi)
movl 4(%rdi), %eax
imull -4(%rsp), %eax
movl %eax, 4(%rdi)
retq
Recompiling this with INNER_LEN set to 1 generates the expected code of
test_loop_width(int*):
movl (%rdi), %eax
movl 4(%rdi), %ecx
imull %eax, %eax
movl %eax, (%rdi)
imull %ecx, %ecx
movl %ecx, 4(%rdi)
retq
The second vertical pattern scales to arbitrary lengths, while the horizontal
pattern uses the stack aggressively.
Ideally, llvm would not be using the stack here and would be able to recognize
that the two are equivalent and code could be reordered to not use the stack.
I don't think this is an aliasing bug since the code vectorizes for larger
lengths in each case, just that the horizontal case still uses the stack.</pre>
</div>
</p>
<hr>
<span>You are receiving this mail because:</span>
<ul>
<li>You are on the CC list for the bug.</li>
</ul>
</body>
</html>