[LLVMbugs] [Bug 12359] New: byte shuffles generate inefficient code for swapping in zeros with sse3

bugzilla-daemon at llvm.org
Mon Mar 26 08:25:44 PDT 2012


http://llvm.org/bugs/show_bug.cgi?id=12359

             Bug #: 12359
           Summary: byte shuffles generate inefficient code for swapping
                    in zeros with sse3
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: Backend: X86
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: sroland at vmware.com
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified


This code:
define <16 x i8> @shuf(<16 x i8> %inval1) {
entry:
  %0 = shufflevector <16 x i8> %inval1, <16 x i8> zeroinitializer,
       <16 x i32> <i32 0, i32 4, i32 3, i32 2, i32 16, i32 16, i32 3, i32 4,
                   i32 0, i32 4, i32 3, i32 2, i32 16, i32 16, i32 3, i32 4>
  ret <16 x i8> %0
}

gets compiled to:
    pxor    %xmm1, %xmm1
    pshufb    .LCPI0_0(%rip), %xmm1
    pshufb    .LCPI0_1(%rip), %xmm0
    por    %xmm1, %xmm0
    ret

(I didn't include the .LCPI constants here, but note that they contain 0x80 in
every byte position whose result comes from the "other" vector. pshufb writes
zero for any control byte with bit 7 set, so the two pshufb results can simply
be ORed together.)

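This is inefficient: every value taken from the zeroinitializer vector is a
zero, and the pshufb on %xmm0 already produces zero in exactly those positions
because of the 0x80 control bytes. The pxor, the second pshufb and the por are
therefore redundant, and the code could just be (with the exact same constant
even):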
    pshufb    .LCPI0_1(%rip), %xmm0
    ret
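
To illustrate why the shorter sequence should be equivalent, here is a small
Python sketch that emulates pshufb and por on 16-byte lists and compares both
sequences against the shufflevector semantics. The mask contents are an
assumption: they are reconstructed from the shuffle indices, since the actual
.LCPI values are not shown above, and all helper names are hypothetical.

```python
def pshufb(src, mask):
    """Emulate pshufb on 16-byte lists: a control byte with bit 7 set
    yields 0, otherwise its low 4 bits select a byte of src."""
    assert len(src) == 16 and len(mask) == 16
    return [0 if m & 0x80 else src[m & 0x0F] for m in mask]


def por(a, b):
    """Emulate por: bytewise OR of two 16-byte lists."""
    return [x | y for x, y in zip(a, b)]


# Shuffle indices from the report; 16..31 select from the all-zero operand.
SHUFFLE = [0, 4, 3, 2, 16, 16, 3, 4, 0, 4, 3, 2, 16, 16, 3, 4]

# Assumed contents of the constants (not taken from real compiler output):
# the mask applied to %xmm0 has 0x80 where the zero vector is selected,
# the mask applied to %xmm1 has 0x80 where %inval1 is selected.
MASK_XMM0 = [i if i < 16 else 0x80 for i in SHUFFLE]
MASK_XMM1 = [i - 16 if i >= 16 else 0x80 for i in SHUFFLE]


def current_codegen(v):
    """pxor + pshufb + pshufb + por, as in the generated code."""
    xmm1 = [0] * 16
    return por(pshufb(xmm1, MASK_XMM1), pshufb(v, MASK_XMM0))


def proposed_codegen(v):
    """A single pshufb with the same %xmm0 mask."""
    return pshufb(v, MASK_XMM0)


def reference(v):
    """shufflevector semantics: index into the concatenation of v and zeros."""
    both = v + [0] * 16
    return [both[i] for i in SHUFFLE]


if __name__ == "__main__":
    v = list(range(1, 17))
    print(current_codegen(v) == proposed_codegen(v) == reference(v))
```

Under these assumptions the pshufb on the zeroed register can only ever
contribute zeros, so ORing it in changes nothing.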
