<html>
    <head>
      <base href="http://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - having a select instruction doesn't mean we need to use it (x86 - blend)"
   href="http://llvm.org/bugs/show_bug.cgi?id=20648">20648</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>having a select instruction doesn't mean we need to use it (x86 - blend)
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>All
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>spatel+llvm@rotateright.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvmbugs@cs.uiuc.edu
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Better hardware (SSE4.1) earned us more instructions, more data, and worse performance
for a select whose mask and operands are all constants.

Using llc with x86-64 target built from r215293:

$ cat blend.ll 
define void @foo(<4 x i32>* %f) {
  %select = select <4 x i1> <i1 1, i1 0, i1 0, i1 1>,
                   <4 x i32> <i32 1, i32 1, i32 1, i32 1>,
                   <4 x i32> <i32 2, i32 2, i32 2, i32 2>
  store <4 x i32> %select, <4 x i32>* %f, align 4
  ret void
}

$ ./llc blend.ll --mattr="-sse4.1" -o -
...
LCPI0_0:
    .long    1                       ## 0x1
    .long    2                       ## 0x2
    .long    2                       ## 0x2
    .long    1                       ## 0x1
    .section    __TEXT,__text,regular,pure_instructions
    .globl    _foo
    .align    4, 0x90
_foo:                                   ## @foo
    .cfi_startproc
## BB#0:
    movaps    LCPI0_0(%rip), %xmm0  <--- precomputed select data loaded
    movups    %xmm0, (%rdi)
    retq

$ ./llc blend.ll --mattr="+sse4.1" -o -
...
LCPI0_0:
    .long    2                       ## 0x2
    .long    2                       ## 0x2
    .long    2                       ## 0x2
    .long    2                       ## 0x2
LCPI0_1:                               <--- 16 extra bytes of constant pool
    .long    1                       ## 0x1
    .long    1                       ## 0x1
    .long    1                       ## 0x1
    .long    1                       ## 0x1
    .section    __TEXT,__text,regular,pure_instructions
    .globl    _foo
    .align    4, 0x90
_foo:                                   ## @foo
    .cfi_startproc
## BB#0:
    movaps    LCPI0_1(%rip), %xmm0        <--- 2 loads and a select
    blendps    $6, LCPI0_0(%rip), %xmm0
    movups    %xmm0, (%rdi)
    retq</pre>
        </div>
      </p>
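      <p>As a rough illustration of why this select can be precomputed, here is a minimal Python sketch. It is not LLVM code: the helper names <code>fold_select</code> and <code>blend_imm</code> are hypothetical, and <code>blend_imm</code> only models the lane selection of an immediate blend, assuming bit i of the immediate picks lane i from the second operand. It folds the constant select lane by lane and checks that the result matches both the constant emitted without SSE4.1 and the value the two-load blend sequence with immediate 6 would produce.</p>

```python
def fold_select(mask, if_true, if_false):
    """Lane-wise select on constant vectors: lane i takes if_true[i] when mask[i] is set."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def blend_imm(first, second, imm):
    """Model of an immediate blend: if bit i of imm is set, lane i comes from `second`."""
    return [s if (imm >> i) & 1 else f
            for i, (f, s) in enumerate(zip(first, second))]


mask = [1, 0, 0, 1]   # <i1 1, i1 0, i1 0, i1 1>
ones = [1, 1, 1, 1]   # true operand of the select
twos = [2, 2, 2, 2]   # false operand of the select

folded = fold_select(mask, ones, twos)   # what a single constant-pool entry would hold
blended = blend_imm(ones, twos, 6)       # what the load + blend sequence computes

print(folded)
print(blended)
```

      <p>Both lists come out equal, which is the point of the report: when every input is a constant, one 16-byte constant and one load suffice, with or without a blend instruction available.</p>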
    </body>
</html>