<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Inefficient code generation for v16i8 vertical less-equal"
href="https://bugs.llvm.org/show_bug.cgi?id=38522">38522</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Inefficient code generation for v16i8 vertical less-equal
</td>
</tr>
<tr>
<th>Product</th>
<td>new-bugs
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>new bugs
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>gonzalobg88@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>chandlerc@gmail.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com
</td>
</tr></table>
<p>
<div>
<pre>The following Rust program:
extern crate packed_simd;
use packed_simd::*;
pub fn le_i8x16(x: i8x16, y: i8x16) -> bool {
x.le(y).all()
}
when compiled at O3 with AVX2 enabled (RUSTFLAGS="-C target-feature=+avx2"),
generates the following LLVM IR (<a href="https://godbolt.org/g/R2zra8">https://godbolt.org/g/R2zra8</a>) and assembly:
declare i32 @llvm.x86.sse41.ptestc(<2 x i64>, <2 x i64>);
define zeroext i1 @le_i8x16(<16 x i8>* %x, <16 x i8>* %y) {
start:
%0 = load <16 x i8>, <16 x i8>* %x, align 16
%1 = load <16 x i8>, <16 x i8>* %y, align 16
%2 = icmp sle <16 x i8> %0, %1
%3 = sext <16 x i1> %2 to <16 x i8>
%4 = bitcast <16 x i8> %3 to <2 x i64>
%5 = tail call i32 @llvm.x86.sse41.ptestc(<2 x i64> %4, <2 x i64> <i64 -1, i64 -1>)
%6 = icmp eq i32 %5, 1
ret i1 %6
}
packed_simd_iter::le_i8x16:
push rbp
mov rbp, rsp
vmovdqa xmm0, xmmword ptr [rdi]
vpcmpgtb xmm0, xmm0, xmmword ptr [rsi]
vpcmpeqd xmm1, xmm1, xmm1
vpxor xmm0, xmm0, xmm1
vptest xmm0, xmm1
setb al
pop rbp
ret
Note that Rust emits IR for a vertical (lane-wise) <= comparison of the two
vectors, which produces a mask, followed by IR for the all reduction over that mask.
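As a rough scalar model of those two steps (a sketch only; the helper names
le_mask, all_lanes and le_i8x16_model are hypothetical and not part of
packed_simd), the mask and the reduction can be written as:

```rust
/// Lane-wise `x <= y`. Each lane becomes -1 (all bits set) when true and 0
/// when false, mirroring `icmp sle` followed by `sext <16 x i1> to <16 x i8>`.
fn le_mask(x: [i8; 16], y: [i8; 16]) -> [i8; 16] {
    let mut mask = [0i8; 16];
    for i in 0..16 {
        mask[i] = if x[i] <= y[i] { -1 } else { 0 };
    }
    mask
}

/// `all` reduction over a mask. Models `ptestc(mask, all_ones) == 1`: the
/// carry flag is set when `(!mask & all_ones) == 0`, i.e. every lane is -1.
fn all_lanes(mask: [i8; 16]) -> bool {
    mask.iter().all(|&lane| (!lane & -1i8) == 0)
}

/// Scalar stand-in for `x.le(y).all()` from the report.
fn le_i8x16_model(x: [i8; 16], y: [i8; 16]) -> bool {
    all_lanes(le_mask(x, y))
}
```

For example, le_i8x16_model([1; 16], [1; 16]) is true, and it becomes false as
soon as one lane of y is smaller than the matching lane of x.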
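Every lane satisfies x <= y exactly when no lane satisfies x > y, so the
vpcmpgtb result can be tested for zero directly. The assembly we would expect is: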
pushq %rbp
movq %rsp, %rbp
vmovdqa (%rdi), %xmm0
vpcmpgtb (%rsi), %xmm0, %xmm0
vptest %xmm0, %xmm0
sete %al
popq %rbp
retq</pre>
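<pre>To illustrate why the shorter sequence should be equivalent, here is a scalar
sketch of both instruction sequences. It assumes the documented PTEST flag
semantics (ZF set when (a AND b) == 0, CF set when ((NOT a) AND b) == 0); the
function names are hypothetical and exist only for this illustration.

```rust
/// Models `vpcmpgtb`: byte lane i is 0xFF when x[i] > y[i], else 0x00.
fn gt_mask(x: [i8; 16], y: [i8; 16]) -> u128 {
    let mut mask = 0u128;
    for i in 0..16 {
        if x[i] > y[i] {
            mask |= 0xFFu128 << (8 * i);
        }
    }
    mask
}

/// Current codegen: vpcmpgtb, vpxor with all-ones, vptest against all-ones, setb.
fn all_le_current(x: [i8; 16], y: [i8; 16]) -> bool {
    let all_ones = u128::MAX;
    let not_gt = gt_mask(x, y) ^ all_ones; // vpxor xmm0, xmm0, xmm1
    // setb reads CF, which vptest sets when (!xmm0 & xmm1) == 0.
    (!not_gt & all_ones) == 0
}

/// Expected codegen: vpcmpgtb, vptest of the mask with itself, sete.
fn all_le_expected(x: [i8; 16], y: [i8; 16]) -> bool {
    let gt = gt_mask(x, y);
    // sete reads ZF, which vptest sets when (xmm0 & xmm0) == 0.
    (gt & gt) == 0
}
```

Both functions return true only when no lane has x > y, so the vpxor and the
all-ones constant do not change the result in this model.</pre>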
</div>
</p>
</body>
</html>