<html>

    <head>

      <base href="https://bugs.llvm.org/">

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW - Improve lshr on <16 x i8> with variable shift amounts on X86"

   href="https://bugs.llvm.org/show_bug.cgi?id=34694">34694</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>Improve lshr on <16 x i8> with variable shift amounts on X86

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>libraries

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>5.0

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>enhancement

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>Backend: X86

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>llvm@henning-thielemann.de

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvm-bugs@lists.llvm.org

          </td>

        </tr></table>

      <p>

        <div>

        <pre>I want to shift byte/character vectors by variable amounts.
In the first case all bytes are shifted by the same amount;
in the second case each byte is shifted by its own amount:

define <16 x i8> @shift_vector_uniform(<16 x i8>, i8) {
_L1:
  ; Splat the scalar shift amount %1 across all 16 lanes.
  %nv = insertelement <1 x i8> undef, i8 %1, i32 0
  %shift = shufflevector <1 x i8> %nv, <1 x i8> undef, <16 x i32> zeroinitializer
  %v = lshr <16 x i8> %0, %shift
  ret <16 x i8> %v
}

define <16 x i8> @shift_vector_mixed(<16 x i8>, <16 x i8>) {
_L1:
  %v = lshr <16 x i8> %0, %1
  ret <16 x i8> %v
}

X86 has no instruction for shifting byte vectors, so LLVM's backend has to
work with shifts of i16 vectors. It does so using three shifts:
psrlw $4, %xmm0; psrlw $2, %xmm0; psrlw $1, %xmm0.

The generated code is rather long. I have not tried it, but I think we can do
better.

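The first case should need only one psrlw plus a mask that clears the bits
shifted into each byte from its neighbouring byte. We could create the
required mask with a scalar shift (e.g. 0xFF shifted right by the same amount)
and then broadcast it to all vector elements.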
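
To make the uniform case concrete, here is a small Python model of the idea.
It is only a sketch, not backend code: the helper names psrlw and
lshr_bytes_uniform are made up for this example, a list of ints stands in for
an XMM register, and it assumes little-endian byte pairs and shift amounts in
the range 0..7.

```python
def psrlw(words, count):
    # Model of psrlw: every 16-bit lane is shifted right by the same count.
    return [w >> count for w in words]


def lshr_bytes_uniform(data, count):
    # data: 16 byte values (0..255); count: one shift amount for all bytes.
    # Pack neighbouring bytes into little-endian 16-bit lanes.
    words = [data[i] | (data[i + 1] * 256) for i in range(0, 16, 2)]
    shifted = psrlw(words, count)
    # A scalar shift builds the byte mask; applying it to every byte below
    # plays the role of the broadcast.
    mask = 0xFF >> count
    out = []
    for w in shifted:
        out.append(w & 0xFF & mask)   # low byte: drop bits leaked from the high byte
        out.append((w >> 8) & mask)   # high byte: zeros were shifted in already
    return out
```

The mask is needed only for the low byte of each lane, but applying the same
broadcast mask to both bytes keeps it a single vector AND.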

We could implement the second case using two psrlw's: one applied to the even
and one to the odd vector elements, with the two results then blended.</pre>
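<pre>
The even/odd idea for the mixed case can be modelled the same way in Python.
This is a sketch under a loud assumption: psrlw applies one count to all
lanes, so the model uses a hypothetical per-lane 16-bit shift
(shift_words_per_lane) in its place. Both helper names are made up for this
example, and shift amounts are assumed to be in the range 0..7.

```python
def shift_words_per_lane(words, counts):
    # Hypothetical per-lane shift: lane i is shifted right by counts[i].
    # This is an assumption; psrlw itself uses a single count for all lanes.
    return [w >> c for w, c in zip(words, counts)]


def lshr_bytes_mixed(data, counts):
    # data, counts: 16 byte values each; byte i is shifted by counts[i].
    words = [data[i] | (data[i + 1] * 256) for i in range(0, 16, 2)]
    # Even bytes: clear the odd byte first so none of its bits shift in.
    even = shift_words_per_lane([w & 0x00FF for w in words], counts[0::2])
    # Odd bytes: zeros enter from the top, so the high byte is already right.
    odd = shift_words_per_lane(words, counts[1::2])
    out = []
    for e, o in zip(even, odd):
        # Blend: low byte from the even result, high byte from the odd result.
        out.append(e & 0xFF)
        out.append((o >> 8) & 0xFF)
    return out
```

Whether this beats the current sequence depends on how cheaply the target can
provide the per-lane counts and the blend.
</pre>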

        </div>

      </p>


    </body>

</html>