<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/57182">57182</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            WebAssembly: Suboptimal codegen of combined SIMD operations
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          alexcrichton
      </td>
    </tr>
</table>

<pre>
    Originally reported at https://github.com/rust-lang/stdarch/issues/1322. It appears that chaining these three wasm SIMD instructions produces a scalarized lowering instead of codegen that uses the instructions themselves:

* `i16x8.extend_low_i8x16_u`
* `i32x4.extend_low_i16x8_u`
* `f32x4.convert_i32x4_u`

[This godbolt example](https://rust.godbolt.org/z/Wb5EcbqMa) has the Rust source code, the generated WebAssembly today, and the corresponding optimized LLVM IR that is being lowered.
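
For readers who don't want to open the link, here is a rough sketch of the kind of chain being described. It is not copied from the godbolt example: the function names `low4_u8_to_f32_scalar` and `low4_u8_to_f32_simd` are made up for illustration, and the mapping of `core::arch::wasm32` intrinsics to the three instructions above is my assumption. The scalar function only models the expected per-lane result; the `cfg`-gated function is the intrinsic chain one would hope lowers to exactly three SIMD instructions.

```rust
/// Portable scalar model of the chain (illustrative only): take the low
/// four u8 lanes, zero-extend to u16, zero-extend to u32, convert to f32.
pub fn low4_u8_to_f32_scalar(bytes: [u8; 16]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        let wide16 = bytes[i] as u16; // models i16x8.extend_low_i8x16_u
        let wide32 = wide16 as u32; // models i32x4.extend_low_i16x8_u
        out[i] = wide32 as f32; // models f32x4.convert_i32x4_u
    }
    out
}

/// The same chain written with wasm intrinsics. Only compiled when targeting
/// wasm32 with simd128 enabled; the function name is hypothetical.
#[cfg(all(target_arch = "wasm32", target_feature = "simd128"))]
pub fn low4_u8_to_f32_simd(v: core::arch::wasm32::v128) -> core::arch::wasm32::v128 {
    use core::arch::wasm32::*;
    let wide16 = u16x8_extend_low_u8x16(v);
    let wide32 = u32x4_extend_low_u16x8(wide16);
    f32x4_convert_u32x4(wide32)
}
```

The scalar version is there only so the expected lane values are unambiguous; the codegen question is about the intrinsic version.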

cc @tlively 
</pre>