<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - [X86] Mask from phi generates redundant pslld"
href="https://bugs.llvm.org/show_bug.cgi?id=40155">40155</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>[X86] Mask from phi generates redundant pslld
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>All
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Backend: X86
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>nikita.ppv@gmail.com
</td>
</tr>
<tr>
<th>CC</th>
<td>craig.topper@gmail.com, llvm-bugs@lists.llvm.org, llvm-dev@redking.me.uk, spatel+llvm@rotateright.com
</td>
</tr></table>
<p>
<div>
<pre>Originally reported at <a href="https://github.com/rust-lang/rust/issues/57110">https://github.com/rust-lang/rust/issues/57110</a>.
The following Rust code:
use std::arch::x86_64::*;

pub unsafe fn foo(y: __m128i, mut x: __m128i) {
    // Per-lane mask: all ones where x is less than y, else zero.
    let mut mask = _mm_cmplt_epi32(x, y);
    while _mm_extract_epi32(mask, 0) == 0 {
        x = _mm_blendv_epi8(y, x, mask);
        mask = _mm_cmplt_epi32(x, y);
    }
}
With this slightly reduced IR:
define <4 x i32> @test(<4 x i32> %x, <4 x i32> %y) #0 {
start:
  %mask = icmp slt <4 x i32> %x, %y
  %elem = extractelement <4 x i1> %mask, i32 0
  br i1 %elem, label %end, label %loop
loop:
  %mask2 = phi <4 x i1> [ %mask3, %loop ], [ %mask, %start ]
  %x2 = phi <4 x i32> [ %x3, %loop ], [ %x, %start ]
  %x3 = select <4 x i1> %mask2, <4 x i32> %x2, <4 x i32> %y
  %mask3 = icmp slt <4 x i32> %x3, %y
  %elem2 = extractelement <4 x i1> %mask3, i32 0
  br i1 %elem2, label %end, label %loop
end:
  %x4 = phi <4 x i32> [ %x3, %loop ], [ %x, %start ]
  ret <4 x i32> %x4
}
attributes #0 = { "target-cpu"="x86-64" "target-features"="+sse4.1" }
Produces the following assembly for the loop BB:
.LBB0_2:                                # %loop
                                        # =>This Inner Loop Header: Depth=1
        pslld   $31, %xmm0
        movdqa  %xmm1, %xmm3
        blendvps        %xmm0, %xmm2, %xmm3
        movdqa  %xmm1, %xmm0
        pcmpgtd %xmm3, %xmm0
        pextrb  $0, %xmm0, %eax
        movaps  %xmm3, %xmm2
        testb   $1, %al
        je      .LBB0_2
The pslld $31, %xmm0 instruction here is unnecessary.
The issue is that the comparison mask reaches the loop through a phi node. In
the entry block the v4i1 mask is type legalized to v4i32. In the loop block it
is truncated back down to v4i1 and used in a vselect. Type legalization then
converts the truncate into a sign_extend_inreg from v4i1 to v4i32, which is
where the shift by 31 comes from. This extension is a no-op for mask values
and would usually get combined away. However, at this point we no longer know
that the value is a mask (both phi operands come from a pcmpgtd).
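
As a rough illustration of why the extension is a no-op for mask values, here
is a per-lane scalar sketch in Rust. The helper names (sign_extend_inreg_i1,
pslld_31, blendv_lane, cmplt_lane) are made up for this example and are not
LLVM or std::arch APIs; each one only models the behavior of a single 32-bit
lane, under the assumption that a blendv-style select looks at the sign bit
of the mask lane only.

```rust
/// Hypothetical scalar model of one lane of sign_extend_inreg from i1:
/// keep only bit 0 and replicate it across the whole 32-bit lane.
fn sign_extend_inreg_i1(lane: i32) -> i32 {
    // Move bit 0 into the sign bit, then shift back arithmetically.
    lane.wrapping_shl(31) >> 31
}

/// Hypothetical per-lane model of `pslld $31`: bit 0 becomes the sign bit.
fn pslld_31(lane: i32) -> i32 {
    lane.wrapping_shl(31)
}

/// Hypothetical per-lane model of a blendv-style select:
/// picks `b` when the sign bit of `mask` is set, otherwise `a`.
fn blendv_lane(a: i32, b: i32, mask: i32) -> i32 {
    if mask.is_negative() { b } else { a }
}

/// Hypothetical per-lane model of a signed less-than compare:
/// all ones (-1) when `x` is less than `y`, otherwise 0.
fn cmplt_lane(x: i32, y: i32) -> i32 {
    if y > x { -1 } else { 0 }
}
```

For a lane produced by a compare (0 or -1), blendv_lane picks the same value
with or without pslld_31 applied to the mask. An arbitrary value such as 2 is
changed by sign_extend_inreg_i1, so dropping the shift is only valid when the
input is known to be a mask.
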
This is somewhat related to <a class="bz_bug_link
bz_status_NEW "
title="NEW - [AVX,SSE] inefficient code generated for vector compares due to sext to i32 moved across phi node"
href="show_bug.cgi?id=11730">https://bugs.llvm.org/show_bug.cgi?id=11730</a>, though
there is no type legalization issue here (the chosen types are correct).</pre>
</div>
</p>
</body>
</html>