[LLVMbugs] [Bug 2622] New: movlps does not get selected for 4,5,2,3 shuffle to/from memory

bugzilla-daemon at cs.uiuc.edu bugzilla-daemon at cs.uiuc.edu
Fri Aug 1 01:49:08 PDT 2008


http://llvm.org/bugs/show_bug.cgi?id=2622

           Summary: movlps does not get selected for 4,5,2,3 shuffle to/from
                    memory
           Product: new-bugs
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: new bugs
        AssignedTo: unassignedbugs at nondot.org
        ReportedBy: nicolas at capens.net
                CC: llvmbugs at cs.uiuc.edu


The following LLVM IR is not generating optimal x86 code:

external global <4 x float>, align 16           ; <<4 x float>*>:0 [#uses=1]
external global <4 x float>, align 16           ; <<4 x float>*>:1 [#uses=1]
external global <4 x float>, align 1            ; <<4 x float>*>:2 [#uses=2]

define internal void @""() {
        load <4 x float>* @0, align 16          ; <<4 x float>>:1 [#uses=1]
        load <4 x float>* @1, align 16          ; <<4 x float>>:2 [#uses=1]
        add <4 x float> %1, %2          ; <<4 x float>>:3 [#uses=1]
        load <4 x float>* @2, align 1           ; <<4 x float>>:4 [#uses=1]
        shufflevector <4 x float> %4, <4 x float> %3, <4 x i32> < i32 4, i32 5, i32 2, i32 3 >          ; <<4 x float>>:5 [#uses=1]
        store <4 x float> %5, <4 x float>* @2, align 1
        ret void
}
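
For readers less familiar with shufflevector masks, here is a minimal scalar sketch in C of what the 4,5,2,3 mask selects. The helper names shuffle4 and shuffle_4523 are hypothetical and exist only for illustration; they are not part of LLVM. Mask indices 0..3 pick lanes from the first operand and 4..7 pick lanes from the second.

```c
#include <stddef.h>

/* Hypothetical helper (not an LLVM API): models shufflevector on two
   4-lane float vectors. Indices 0..3 select from a, 4..7 select from b. */
static void shuffle4(const float a[4], const float b[4],
                     const int mask[4], float out[4]) {
    for (size_t i = 0; i < 4; ++i)
        out[i] = mask[i] < 4 ? a[mask[i]] : b[mask[i] - 4];
}

/* The mask from the report: lanes 0..1 come from b (the sum),
   lanes 2..3 come from a (the value loaded from memory). */
static void shuffle_4523(const float a[4], const float b[4], float out[4]) {
    static const int mask[4] = {4, 5, 2, 3};
    shuffle4(a, b, mask, out);
}
```

In the IR above, a corresponds to %4 (the unaligned load from @2) and b to %3 (the sum), so only the low two lanes of the stored value differ from what was already in memory.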

What I'm getting is:

  push        ebp  
  mov         ebp,esp 
  and         esp,0FFFFFFF0h 
  movaps      xmm0,xmmword ptr ds:[1757E80h] 
  addps       xmm0,xmmword ptr ds:[1757E90h] 
  movups      xmm1,xmmword ptr ds:[1757E88h] 
  movsd       xmm1,xmm0 
  movups      xmmword ptr ds:[1757E88h],xmm1 
  mov         esp,ebp 
  pop         ebp  
  ret              

But I was rather hoping to see something like:

  push        ebp  
  mov         ebp,esp 
  and         esp,0FFFFFFF0h 
  movaps      xmm0,xmmword ptr ds:[1757E80h] 
  addps       xmm0,xmmword ptr ds:[1757E90h] 
  movlps      qword ptr ds:[1757E88h],xmm0 
  mov         esp,ebp 
  pop         ebp  
  ret      
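
The reason the shorter sequence is legal can be sketched with a scalar model in C. This is an illustration under the assumption that nothing else touches the memory in between; the function names update_full and update_low_half are hypothetical. A movlps store writes only the low 64 bits of the source register, so the load, low-half replace, and full store collapse into one 8-byte store.

```c
#include <string.h>

/* Models the sequence currently generated: load all 16 bytes,
   overwrite the low two lanes, store all 16 bytes back. */
static void update_full(float mem[4], const float sum[4]) {
    float tmp[4];
    memcpy(tmp, mem, sizeof tmp);   /* movups xmm1, [mem] */
    tmp[0] = sum[0];                /* movsd  xmm1, xmm0  */
    tmp[1] = sum[1];                /*   (copies the low 64 bits) */
    memcpy(mem, tmp, sizeof tmp);   /* movups [mem], xmm1 */
}

/* Models the hoped-for code: write only the low 8 bytes, leaving
   lanes 2..3 of the destination untouched. */
static void update_low_half(float mem[4], const float sum[4]) {
    memcpy(mem, sum, 2 * sizeof(float));  /* movlps store */
}
```

Both functions leave the same contents in mem, but the second needs no load and writes half as many bytes.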

Curiously, it looks like instruction selection already has a pattern to select
movlps for this situation, but for some reason it doesn't actually end up in
the result. Same for movhps and a 0,1,6,7 shuffle.

Additionally, it looks like a pattern could be added for using movlps/movhps
as an insert from memory (a pattern for extract already seems present, though I
haven't verified yet that it gets selected).
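
As a rough illustration of the insert and extract forms mentioned above, here is a scalar model in C of the 64-bit load and store variants of movlps and movhps. The function names are hypothetical, and the sketch only models the lane semantics of the instructions; it says nothing about how the LLVM patterns are written.

```c
#include <string.h>

/* movlps xmm, m64 (load form): replace lanes 0..1, keep lanes 2..3. */
static void insert_low_from_mem(float reg[4], const float mem[2]) {
    memcpy(&reg[0], mem, 2 * sizeof(float));
}

/* movhps xmm, m64 (load form): replace lanes 2..3, keep lanes 0..1. */
static void insert_high_from_mem(float reg[4], const float mem[2]) {
    memcpy(&reg[2], mem, 2 * sizeof(float));
}

/* movlps m64, xmm (store form): extract lanes 0..1 to memory. */
static void extract_low_to_mem(float mem[2], const float reg[4]) {
    memcpy(mem, &reg[0], 2 * sizeof(float));
}

/* movhps m64, xmm (store form): extract lanes 2..3 to memory. */
static void extract_high_to_mem(float mem[2], const float reg[4]) {
    memcpy(mem, &reg[2], 2 * sizeof(float));
}
```

An insert from memory would therefore be a single instruction rather than a full vector load followed by a register shuffle.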

