<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - [SIMD] __builtin_shufflevector to 64-bit vector then extending not vectorized"
   href="https://bugs.llvm.org/show_bug.cgi?id=50807">50807</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>[SIMD] __builtin_shufflevector to 64-bit vector then extending not vectorized
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Windows NT
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Backend: WebAssembly
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>clang@evan.coeusgroup.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>With -msimd128 -O3, I would expect a __builtin_shufflevector that selects half
the elements, followed by a __builtin_convertvector that widens each element
(giving a 128-bit result), to compile to an i8x16.shuffle and an
i16x8.extend_low_i8x16_s.  Instead, it generates a series of extract_lane and
replace_lane instructions.

Here are a couple of quick examples (Compiler Explorer:
<a href="https://godbolt.org/z/EjbMqPhx1">https://godbolt.org/z/EjbMqPhx1</a>):


#include &lt;wasm_simd128.h&gt;

#pragma clang diagnostic ignored "-Wmissing-prototypes"

typedef   int8_t i8x16 __attribute__((__vector_size__(16)));
typedef  int16_t i16x8 __attribute__((__vector_size__(16)));
typedef  int32_t i32x4 __attribute__((__vector_size__(16)));
typedef  uint8_t u8x16 __attribute__((__vector_size__(16)));
typedef uint16_t u16x8 __attribute__((__vector_size__(16)));
typedef uint32_t u32x4 __attribute__((__vector_size__(16)));

i16x8
foo(i8x16 a) {
    return __builtin_convertvector(
        __builtin_shufflevector(a, a,
            0, 2, 4, 6, 8, 10, 12, 14
        ),
        i16x8
    );
}

v128_t
foo_intrin(v128_t a) {
    return
        wasm_i16x8_extend_low_i8x16(
            wasm_i8x16_shuffle(a, a,
                0, 2, 4, 6, 8, 10, 12, 14,
                1, 3, 5, 7, 9, 11, 13, 15)
        );
}

i16x8
bar(i8x16 a) {
    return
        __builtin_convertvector(
            __builtin_shufflevector(
                a, a,
                0, 2, 4, 6, 8, 10, 12, 14
            ),
            i16x8
        )
        -
        __builtin_convertvector(
            __builtin_shufflevector(
                a, a,
                1, 3, 5, 7, 9, 11, 13, 15
            ),
            i16x8
        );
}

i16x8
bar_intrin(v128_t a) {
    v128_t shuffled = wasm_i8x16_shuffle(
        a, a,
        0, 2, 4, 6, 8, 10, 12, 14,
        1, 3, 5, 7, 9, 11, 13, 15
    );
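    /* v128_t is a vector of 32-bit lanes, so a plain `-` would subtract per
       32-bit lane; wasm_i16x8_sub subtracts per 16-bit lane, matching bar. */
    return (i16x8)
        wasm_i16x8_sub(
            wasm_i16x8_extend_low_i8x16(shuffled),
            wasm_i16x8_extend_high_i8x16(shuffled));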
}



I think it's reasonable to expect foo and foo_intrin to generate roughly the
same code (the upper half of the shuffle result is never read by the extend,
so its indices could be anything, e.g. all zeros).
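
To make that equivalence concrete, here is a minimal scalar sketch in plain C.
It is not part of the original test case, and the names lane8, lane16, foo_ref,
foo_intrin_ref and foo_models_agree are hypothetical helpers introduced only
for illustration.  It assumes the shuffle indices are non-negative and that
both shuffle operands are the same vector, as in the examples above.

```c
/* Scalar stand-ins for the vector lanes (hypothetical names). */
typedef signed char lane8;
typedef short lane16;

/* Model of foo: take the even-indexed bytes, sign-extend each to 16 bits. */
static void foo_ref(const lane8 a[16], lane16 out[8]) {
    for (int i = 0; i != 8; ++i)
        out[i] = (lane16) a[2 * i];
}

/* Model of foo_intrin: a 16-lane shuffle of (a, a) followed by a
   sign-extension of the low 8 lanes.  `upper` supplies the shuffle
   indices for lanes 8..15, which the extension never reads.
   Assumption: every index in `upper` is non-negative. */
static void foo_intrin_ref(const lane8 a[16], const int upper[8],
                           lane16 out[8]) {
    lane8 shuffled[16];
    for (int i = 0; i != 8; ++i) {
        shuffled[i] = a[2 * i];              /* lanes 0..7: 0, 2, ..., 14 */
        shuffled[8 + i] = a[upper[i] % 16];  /* lanes 8..15: arbitrary */
    }
    for (int i = 0; i != 8; ++i)
        out[i] = (lane16) shuffled[i];
}

/* Returns 1 when both models produce the same eight 16-bit lanes. */
static int foo_models_agree(const lane8 a[16], const int upper[8]) {
    lane16 x[8], y[8];
    foo_ref(a, x);
    foo_intrin_ref(a, upper, y);
    for (int i = 0; i != 8; ++i)
        if (x[i] != y[i])
            return 0;
    return 1;
}
```

Under these assumptions, foo_models_agree should return 1 whether `upper`
holds the odd indices used in foo_intrin or all zeros, which is why the
backend is free to pick whatever upper-half indices are cheapest.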

I'd be very impressed, OTOH, if bar and bar_intrin generated the same code. 
I'm not sure how feasible that is, though.</pre>
        </div>
      </p>


    </body>
</html>