<html>
    <head>
      <base href="https://llvm.org/bugs/" />
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW --- - Inverted masks can be used for better codegen"
   href="https://llvm.org/bugs/show_bug.cgi?id=27780">27780</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Inverted masks can be used for better codegen
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: X86
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>mkuper@google.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr>

        <tr>
          <th>Classification</th>
          <td>Unclassified
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Given this IR:

define <32 x i8> @constant_pblendvb_avx2(<32 x i8> %xyzw, <32 x i8> %abcd) {
entry:
  %select = select <32 x i1> <i1 false, i1 false, i1 true, i1 false, i1 true,
i1 true, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 true, i1
true, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 true, i1
true, i1 true, i1 false, i1 false, i1 false, i1 true, i1 false, i1 true, i1
true, i1 true, i1 false>, <32 x i8> %xyzw, <32 x i8> %abcd
  ret <32 x i8> %select
}

Up until r269676, with SSE4.1, we used to lower to:
movdqa %xmm0, %xmm4
movaps {{.*#+}} xmm0 = [255,255,0,255,0,0,0,255,255,255,0,255,0,0,0,255]
pblendvb %xmm2, %xmm4
pblendvb %xmm3, %xmm1
movdqa %xmm4, %xmm0

Now, we lower it to:
movdqa %xmm0, %xmm4
movaps {{.*#+}} xmm0 = [0,0,255,0,255,255,255,0,0,0,255,0,255,255,255,0]
pblendvb %xmm4, %xmm2
pblendvb %xmm1, %xmm3
movdqa %xmm2, %xmm0
movdqa %xmm3, %xmm1

This isn't directly related to r269676; the code generator just got lucky
before. The underlying issue is that when the output of the blend is
constrained (in this case, because the function's return value must live in
xmm0 and xmm1), we could invert the mask and swap the blend operands so that
the result lands in the required registers, avoiding a copy.</pre>
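        <pre>A minimal sketch of the identity behind this suggestion, written as a Python
model rather than compiler code. The helper names (pblendvb, invert) and the
sample byte values are illustrative assumptions; only the mask constant is
taken from the listings above. It models one 128-bit half: blending with the
mask into the abcd register gives the same bytes as blending with the inverted
mask into the xyzw register, so the destination can be chosen freely.

```python
def pblendvb(mask, src, dst):
    """Model of SSE4.1 PBLENDVB on one 16-byte register.

    For each byte, take src when the top bit of the mask byte is set,
    otherwise keep dst. The real instruction overwrites dst with the result.
    """
    return [s if m & 0x80 else d for m, s, d in zip(mask, src, dst)]


def invert(mask):
    """Flip every bit of every mask byte (0 becomes 255 and vice versa)."""
    return [m ^ 0xFF for m in mask]


# Mask constant from the current lowering shown in the report.
MASK = [0, 0, 255, 0, 255, 255, 255, 0, 0, 0, 255, 0, 255, 255, 255, 0]

# Hypothetical register contents, chosen only so the two inputs are distinct.
xyzw = list(range(16))        # low half of %xyzw, arrives in xmm0
abcd = list(range(100, 116))  # low half of %abcd, arrives in xmm2

# Current lowering: the result overwrites the abcd register (needs a copy back).
result_in_abcd_reg = pblendvb(MASK, src=xyzw, dst=abcd)

# Inverted mask with swapped operands: the result overwrites the xyzw register.
result_in_xyzw_reg = pblendvb(invert(MASK), src=abcd, dst=xyzw)

assert result_in_abcd_reg == result_in_xyzw_reg
```

Note that invert(MASK) is exactly the constant that appears in the older
lowering, which is why that sequence needed one copy fewer.</pre>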
        </div>
      </p>
    </body>
</html>