[LLVMbugs] [Bug 9070] New: Incorrect code generated from shuffles

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Thu Jan 27 08:05:42 PST 2011


http://llvm.org/bugs/show_bug.cgi?id=9070

           Summary: Incorrect code generated from shuffles
           Product: libraries
           Version: trunk
          Platform: PC
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P
         Component: Backend: X86
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: zvi.rackover at intel.com
                CC: llvmbugs at cs.uiuc.edu


Running llc on the following test (also attached) generates incorrect code:

target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-f80:128:128-v64:64:64-v128:128:128-a0:0:64-f80:32:32-n8:16:32"
target triple = "i686-pc-win32"

define void @test1(<8 x i32>* %source, <2 x i32>* %dest) nounwind {

  %a149 = getelementptr inbounds <8 x i32>* %source
  %a150 = load <8 x i32>* %a149, align 32
  %a151 = shufflevector <8 x i32> %a150, <8 x i32> undef, <2 x i32> <i32 0, i32 5>
  %a152 = shufflevector <2 x i32> %a151, <2 x i32> undef, <2 x i32> <i32 1, i32 0>
  %a153 = getelementptr inbounds <2 x i32>* %dest
  store <2 x i32> %a152, <2 x i32>* %a153, align 8
  ret void
}

The test reads an <8 x i32> source vector from memory and writes a <2 x i32>
dest vector to memory.

The two shuffles do:
temp.0 = source.0
temp.1 = source.5

dest.0 = temp.1
dest.1 = temp.0

Which is equivalent to:
dest.0 = source.5
dest.1 = source.0
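
The composition of the two shuffles can be checked with a small model. This is only a sketch in Python, not LLVM code: the helper name shufflevector and the use of None for undef are assumptions made for illustration, and lanes are represented by their names rather than by i32 values.

```python
def shufflevector(v1, v2, mask):
    """Model of shufflevector: mask indices select from v1 followed by v2.

    None stands in for an undef operand (illustrative assumption).
    """
    n = len(v1)
    v2 = [None] * n if v2 is None else list(v2)
    combined = list(v1) + v2
    return [combined[i] for i in mask]


# Name each lane of the <8 x i32> source so the data flow is visible.
source = [f"source.{i}" for i in range(8)]

temp = shufflevector(source, None, [0, 5])  # corresponds to %a151
dest = shufflevector(temp, None, [1, 0])    # corresponds to %a152

print(temp)
print(dest)
```

Under this model dest.0 comes from source.5 and dest.1 from source.0, which matches the equivalence stated above.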


Output:
llc < test-repro.ll
        .def     _test1;
        .scl    2;
        .type   32;
        .endef
        .text
        .globl  _test1
        .align  16, 0x90
_test1:                                 # @test1
# BB#0:
        movl    4(%esp), %eax
        movaps  16(%eax), %xmm0
        movlps  (%eax), %xmm0
        pshufd  $1, %xmm0, %xmm0        # xmm0 = xmm0[1,0,0,0]
        movl    8(%esp), %eax
        pextrd  $1, %xmm0, 4(%eax)
        movd    %xmm0, (%eax)
        ret

After the 'movaps':
 XMM0 = [source.4 source.5 source.6 source.7]

After the 'movlps':
 XMM0 = [source.0 source.1 source.6 source.7]

After the 'pshufd':
 XMM0 = [source.1 source.0 source.0 source.0]

The 'pextrd' writes:
 dest.1 = source.0

The 'movd' writes:
 dest.0 = source.1 <== Incorrect: dest.0 should be source.5 (see the explanation of the test above).
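
The register trace above can be reproduced with a lane-level simulation. This is a rough sketch in Python, not a real emulator: the helper names (movaps_load, movlps_load, pshufd) are made up for illustration, memory is modelled as a list of named 32-bit lanes, and offsets are given in lanes rather than bytes.

```python
# Memory behind %source, one entry per 32-bit lane.
source = [f"source.{i}" for i in range(8)]


def movaps_load(mem, lane_offset):
    """Load four consecutive 32-bit lanes (128 bits) into a register."""
    return mem[lane_offset:lane_offset + 4]


def movlps_load(xmm, mem, lane_offset):
    """Replace the low 64 bits (lanes 0 and 1); keep lanes 2 and 3."""
    return mem[lane_offset:lane_offset + 2] + xmm[2:]


def pshufd(xmm, imm8):
    """Each 2-bit field of imm8 picks the source lane for one result lane."""
    return [xmm[(imm8 >> (2 * i)) & 3] for i in range(4)]


xmm0 = movaps_load(source, 4)        # movaps 16(%eax), %xmm0
xmm0 = movlps_load(xmm0, source, 0)  # movlps (%eax), %xmm0
xmm0 = pshufd(xmm0, 1)               # pshufd $1, %xmm0, %xmm0

dest = [None, None]
dest[1] = xmm0[1]                    # pextrd $1, %xmm0, 4(%eax)
dest[0] = xmm0[0]                    # movd %xmm0, (%eax)

expected = ["source.5", "source.0"]  # what the IR asks for
print("generated:", dest)
print("expected: ", expected)
```

In this model the 'movlps' overwrites lane 1 as well as lane 0, so source.5 is already gone from XMM0 before the 'pshufd' runs. That is consistent with the wrong value stored to dest.0.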


Removing the following pattern from X86InstrSSE.td gives correct (but
inefficient) code:
def : Pat<(X86Movss VR128:$src1,
          (bc_v4i32 (v2i64 (load addr:$src2)))),
          (MOVLPSrm VR128:$src1, addr:$src2)>;



