<html>
<head>
<base href="https://llvm.org/bugs/" />
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW --- - left shift is broken for negative <16 x i16> on avx2"
href="https://llvm.org/bugs/show_bug.cgi?id=27730">27730</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>left shift is broken for negative <16 x i16> on avx2
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>andrew.b.adams@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr>
<tr>
<th>Classification</th>
<td>Unclassified
</td>
</tr></table>
<p>
<div>
<pre>I think there's a bug in the lowering of left shifts on wider-than-native
vectors of 16-bit signed integers (<16 x i16>) on Haswell: the generated code
computes incorrect values.
With the following .ll (vector_shl.ll):
define void @fn(<16 x i16> * %a_ptr, <16 x i16> * %b_ptr, <16 x i16> * %c_ptr)
{
  %a = load <16 x i16>, <16 x i16> * %a_ptr
  %b = load <16 x i16>, <16 x i16> * %b_ptr
  %result = shl <16 x i16> %a, %b
  store <16 x i16> %result, <16 x i16> * %c_ptr
  ret void
}
driven by the following test code (driver.cpp):
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>
extern "C" void fn(int16_t *a, int16_t *b, int16_t *c);
int main(int argc, char **argv) {
    // Three 32-byte-aligned buffers, each holding 16 int16_t values.
    int16_t *a = (int16_t *)aligned_alloc(32, 32);
    int16_t *b = (int16_t *)aligned_alloc(32, 32);
    int16_t *c = (int16_t *)aligned_alloc(32, 32);
    for (int i = 0; i < 16; i++) {
        a[i] = -3;
        b[i] = i/2;  // shift amounts 0..7, each used twice
    }
    fn(a, b, c);
    // Print the scalar expectation next to the vector result.
    for (int i = 0; i < 16; i++) {
        printf("%d vs %d\n", a[i] * (1 << b[i]), c[i]);
    }
    return 0;
}
Note that there's no nuw or nsw tag on my shl, and I'm shifting a small value
by a small amount, so there's no (signed) overflow happening. The language
reference seems to indicate that this is well-defined, and should be equivalent
to the shift-and-multiply in the driver.
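To make the expected semantics concrete, here is a minimal scalar sketch of
how I read shl on i16 when neither nuw nor nsw is present: the result is
(a * 2^b) mod 2^16. The helper name shl_i16_reference is made up for
illustration, and it assumes the shift amount is in the range 0..15.

```cpp
#include <cstdint>

// Hypothetical scalar model of `shl i16 %a, %b` without nuw/nsw.
// Precondition (assumed): 0 <= b < 16.
// The shift is done on the unsigned bit pattern, so no signed overflow occurs;
// the final conversion back to int16_t wraps modulo 2^16 on two's complement targets.
constexpr int16_t shl_i16_reference(int16_t a, int b) {
    return static_cast<int16_t>(static_cast<uint16_t>(
        static_cast<uint32_t>(static_cast<uint16_t>(a)) << b));
}
```

For the inputs used in the driver, shl_i16_reference(-3, k) gives -3, -6, -12,
-24, -48, -96, -192, -384 for k = 0..7, which is the left-hand column below.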
We get a disagreement between the driver and the vector code:
$ llc vector_shl.ll -mcpu=haswell -o vector_shl.s
$ clang++ driver.cpp vector_shl.s
$ ./a.out
-3 vs -3
-3 vs -3
-6 vs -5
-6 vs -5
-12 vs -9
-12 vs -9
-24 vs -17
-24 vs -17
-48 vs -33
-48 vs -33
-96 vs -65
-96 vs -65
-192 vs -129
-192 vs -129
-384 vs -257
-384 vs -257
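One pattern in the wrong values above, offered as an observation rather than a
diagnosis: for a = -3, each result equals the correct shift with its low k bits
set to one, as if ones rather than zeros were shifted in. The sketch below
reproduces the right-hand column; observed_pattern is a hypothetical name, and
I have only checked it against the a = -3 data in this report.

```cpp
#include <cstdint>

// Hypothetical model of the values printed by the Haswell build for a = -3:
// the 16-bit result of (a << k) with the low k bits forced to one.
// Assumes 0 <= k < 16. This describes the output only; it does not claim to
// describe what the generated code actually does.
constexpr int16_t observed_pattern(int16_t a, int k) {
    return static_cast<int16_t>(static_cast<uint16_t>(
        (static_cast<uint32_t>(static_cast<uint16_t>(a)) << k) | ((1u << k) - 1u)));
}
```

With a = -3 this yields -3, -5, -9, -17, -33, -65, -129, -257 for k = 0..7.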
Sandybridge works fine:
$ llc vector_shl.ll -mcpu=sandybridge -o vector_shl.s
$ clang++ driver.cpp vector_shl.s
$ ./a.out
-3 vs -3
-3 vs -3
-6 vs -6
-6 vs -6
-12 vs -12
-12 vs -12
-24 vs -24
-24 vs -24
-48 vs -48
-48 vs -48
-96 vs -96
-96 vs -96
-192 vs -192
-192 vs -192
-384 vs -384
-384 vs -384
Native <8 x i16> vectors also seem to work fine on either target.
Trunk, 3.8, and 3.7 all give the same behavior, so either this is an old bug,
or I'm misunderstanding what counts as UB here.</pre>
</div>
</p>
</body>
</html>