[LLVMbugs] [Bug 22483] New: will it blend? apparently not [SSE, AVX, X86]
bugzilla-daemon at llvm.org
Thu Feb 5 13:38:40 PST 2015
http://llvm.org/bugs/show_bug.cgi?id=22483
Bug ID: 22483
Summary: will it blend? apparently not [SSE, AVX, X86]
Product: libraries
Version: trunk
Hardware: PC
OS: All
Status: NEW
Severity: normal
Priority: P
Component: Backend: X86
Assignee: unassignedbugs at nondot.org
Reporter: spatel+llvm at rotateright.com
CC: llvmbugs at cs.uiuc.edu
Classification: Unclassified
define float @blendv(float %x, float %y) {
  %cmp = fcmp oge float %x, %y
  %sel = select i1 %cmp, float %x, float %y
  ret float %sel
}
Or in C:
float blendv(float x, float y) {
  if (x >= y) return x;
  return y;
}
There are no scalar FP select instructions for xmm registers (at least through
AVX2, from what I can tell), just as there are no scalar FP logical ops (and,
xor, or, andn). Consistent non-orthogonality?
Currently (r228316), we generate:
$ llc -mattr=avx blend.ll -o -
...
vcmpless %xmm0, %xmm1, %xmm2
vandps %xmm0, %xmm2, %xmm0
vandnps %xmm1, %xmm2, %xmm1
vorps %xmm0, %xmm1, %xmm0
retq
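For readers who don't have the mask idiom memorized, here is a rough scalar C
model of what lane 0 of that sequence computes. It is only a sketch: the helper
names (bits_of, float_of, select_and_andn_or) are made up for illustration, and
it says nothing about the upper lanes of the register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: view a float's bits as a 32-bit integer. */
static uint32_t bits_of(float f) {
    uint32_t u;
    assert(sizeof f == sizeof u);
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: view a 32-bit integer's bits as a float. */
static float float_of(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Scalar model of lane 0 of: vcmpless / vandps / vandnps / vorps. */
static float select_and_andn_or(float x, float y) {
    uint32_t mask = (y <= x) ? 0xFFFFFFFFu : 0u; /* vcmpless: all-ones if y <= x */
    uint32_t keep_x = bits_of(x) & mask;          /* vandps:  x where mask is set */
    uint32_t keep_y = bits_of(y) & ~mask;         /* vandnps: y where mask is clear */
    return float_of(keep_x | keep_y);             /* vorps:   merge the two halves */
}
```

If either input is a NaN the compare is false, so the model returns y, which
matches the 'fcmp oge' + 'select' in the IR.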
I think that we'd be better off using 'vblendvps' (the AVX form of 'blendvps',
which was added with SSE4.1):
vcmpless %xmm0, %xmm1, %xmm2
vblendvps %xmm2, %xmm0, %xmm1, %xmm0
retq
I'm not sure what ends up in bits 32:127 of the output reg in either case, but
surely we're no worse off using blendv?
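As a sanity check on the lane 0 result, here is a rough scalar C model of the
compare + blendv pair. Again just a sketch with made-up names (to_bits,
from_bits, blendv_lane, select_blendv), and it deliberately ignores bits
32:127. The one thing it tries to capture is that vblendvps picks each lane by
the sign bit of the mask lane, so an all-ones/all-zeros compare result works as
the mask.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: view a float's bits as a 32-bit integer. */
static uint32_t to_bits(float f) {
    uint32_t u;
    assert(sizeof f == sizeof u);
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical helper: view a 32-bit integer's bits as a float. */
static float from_bits(uint32_t u) {
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Models one 32-bit lane of vblendvps: only the mask's sign bit matters. */
static float blendv_lane(uint32_t mask, float if_set, float if_clear) {
    uint32_t r = (mask & 0x80000000u) ? to_bits(if_set) : to_bits(if_clear);
    return from_bits(r);
}

/* Scalar model of lane 0 of: vcmpless / vblendvps. */
static float select_blendv(float x, float y) {
    uint32_t mask = (y <= x) ? 0xFFFFFFFFu : 0u; /* vcmpless: all-ones if y <= x */
    return blendv_lane(mask, x, y);              /* vblendvps: x if set, else y */
}
```

For lane 0 this gives the same answer as the and/andn/or sequence, including
returning y when either input is a NaN.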
FWIW, icc 15 just does a compare and branch:
vcomiss %xmm1, %xmm0
jae L_L3
vmovaps %xmm1, %xmm0
L_L3:
ret