[LLVMbugs] [Bug 22483] New: will it blend? apparently not [SSE, AVX, X86]

Thu Feb 5 13:38:40 PST 2015

http://llvm.org/bugs/show_bug.cgi?id=22483

            Bug ID: 22483
           Summary: will it blend? apparently not [SSE, AVX, X86]
           Product: libraries
           Version: trunk
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
          Assignee: unassignedbugs at nondot.org
          Reporter: spatel+llvm at rotateright.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

define float @blendv(float %x, float %y) {
  %cmp = fcmp oge float %x, %y
  %sel = select i1 %cmp, float %x, float %y
  ret float %sel
}

Or in C:

float blendv(float x, float y) {
        if (x >= y) return x;
        return y;
}

There are no scalar FP select instructions for xmm registers (at least through
AVX2 from what I can tell)...just like there are no scalar FP logical ops (and,
andn, or, xor). Consistent non-orthogonality?

Currently (r228316), we generate:
$ llc  -mattr=avx blend.ll -o -
...
    vcmpless    %xmm0, %xmm1, %xmm2
    vandps    %xmm0, %xmm2, %xmm0
    vandnps    %xmm1, %xmm2, %xmm1
    vorps    %xmm0, %xmm1, %xmm0
    retq

I think that we'd be better off using 'vblendvps' (the AVX encoding of
'blendvps', which was added with SSE4.1):
    vcmpless    %xmm0, %xmm1, %xmm2
    vblendvps    %xmm2, %xmm0, %xmm1, %xmm0
    retq
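
To make the equivalence concrete, here is a rough scalar model in portable C of what the two sequences compute in lane 0. This is only a sketch: the helper names (bits_of, float_of, cmp_mask, select_and_or, select_blendv) are made up for illustration, and it ignores bits 32:127 of the registers entirely.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: move a float's bit pattern to/from a 32-bit integer. */
static uint32_t bits_of(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float float_of(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Models lane 0 of 'vcmpless %xmm0, %xmm1, %xmm2':
   all-ones if y <= x, otherwise zero (also zero if either input is NaN). */
static uint32_t cmp_mask(float x, float y) { return (y <= x) ? 0xFFFFFFFFu : 0u; }

/* Models lane 0 of the vandps / vandnps / vorps sequence:
   (mask & x) | (~mask & y) on the raw bit patterns. */
static float select_and_or(float x, float y) {
    uint32_t m = cmp_mask(x, y);
    return float_of((m & bits_of(x)) | (~m & bits_of(y)));
}

/* Models lane 0 of vblendvps: only the sign bit of the mask is examined. */
static float select_blendv(float x, float y) {
    uint32_t m = cmp_mask(x, y);
    return (m >> 31) ? x : y;
}
```

Because the compare produces either all-ones or all-zeros, testing just the sign bit (as blendv does) picks the same operand as the full bitwise select, including the NaN case, where both return y.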

I'm not sure what ends up in bits 32:127 of the output reg in either case, but
only the low float is returned, so I don't think we're any worse off using
blendv?

FWIW, icc 15 just does a compare and branch:
        vcomiss   %xmm1, %xmm0
        jae       L_L3
        vmovaps   %xmm1, %xmm0
L_L3:
        ret
