[LLVMbugs] [Bug 16695] New: pminuw instruction not generated for <16 x i16> selection encoding a 'min' (SSE4)
bugzilla-daemon at llvm.org
Wed Jul 24 11:37:10 PDT 2013
http://llvm.org/bugs/show_bug.cgi?id=16695
Bug ID: 16695
Summary: pminuw instruction not generated for <16 x i16>
selection encoding a 'min' (SSE4)
Product: new-bugs
Version: unspecified
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: new bugs
Assignee: unassignedbugs at nondot.org
Reporter: matt at pharr.org
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
Created attachment 10920
--> http://llvm.org/bugs/attachment.cgi?id=10920&action=edit
test case
With an SSE4 target, given a vector select of <16 x i16> representing a 'min':
define <16 x i16> @foo16(<16 x i16> %a, <16 x i16> %b, <16 x i8> %__mask) #0 {
allocas:
%less_a_load_b_load.i = icmp ult <16 x i16> %a, %b
%blend.i8 = select <16 x i1> %less_a_load_b_load.i, <16 x i16> %a, <16 x i16> %b
ret <16 x i16> %blend.i8
}
roughly 30 instructions are generated if I run "llc
-mattr=+sse,+sse2,+sse3,+sse41,-sse42,-sse4a,+ssse3,-popcnt,+cmov bug.ll -o -".
In contrast, given a vector select of an <8 x i16> vector:
define <8 x i16> @foo8(<8 x i16> %a, <8 x i16> %b, <8 x i8> %__mask) #0 {
allocas:
%less_a_load_b_load.i = icmp ult <8 x i16> %a, %b
%blend.i8 = select <8 x i1> %less_a_load_b_load.i, <8 x i16> %a, <8 x i16> %b
ret <8 x i16> %blend.i8
}
a single pminuw is generated:
_foo8: ## @foo8
## BB#0: ## %allocas
pminuw %xmm1, %xmm0
ret
It'd be nice if the <16 x i16> case turned into two pminuw instructions,
one for each 128-bit half of the vector.
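For illustration, the hoped-for lowering can be sketched in C. This is only a sketch, not what llc currently emits: the helper name min_u16x16 is hypothetical, and it assumes the 16-lane vector is laid out as two consecutive 8-lane halves in memory. When built with SSE4.1 enabled it uses _mm_min_epu16, the intrinsic for pminuw, once per half; otherwise it falls back to a scalar loop with the same unsigned-compare-and-select semantics as the IR above.

```c
#include <stddef.h>
#include <stdint.h>
#ifdef __SSE4_1__
#include <smmintrin.h> /* SSE4.1: _mm_min_epu16 corresponds to pminuw */
#endif

/* Hypothetical helper: lane-wise unsigned min of two 16 x u16 vectors. */
void min_u16x16(const uint16_t a[16], const uint16_t b[16], uint16_t out[16])
{
#ifdef __SSE4_1__
    /* One pminuw per 128-bit half (8 lanes of 16 bits each). */
    for (size_t half = 0; half < 2; ++half) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + 8 * half));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + 8 * half));
        _mm_storeu_si128((__m128i *)(out + 8 * half), _mm_min_epu16(va, vb));
    }
#else
    /* Scalar fallback: same result as icmp ult followed by select. */
    for (size_t i = 0; i < 16; ++i)
        out[i] = a[i] < b[i] ? a[i] : b[i];
#endif
}
```

Calling min_u16x16(a, b, out) on two 16-element arrays fills out with the per-lane unsigned minimum. The comparison is unsigned, so a lane holding 0x8000 compares greater than one holding 0x7FFF.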