<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/129276>129276</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] Suboptimal codegen for broadcasting a 16-bit vector element
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          dzaima
      </td>
    </tr>
</table>

<pre>
This LLVM IR:
```llvm
define <16 x i16> @broadcast_sel15(<16 x i16> noundef %x) {
  %r = shufflevector <16 x i16> %x, <16 x i16> poison, <16 x i32> splat(i32 15)
  ret <16 x i16> %r
}
```
could be compiled to:
```asm
        vpshufhw        ymm0, ymm0, 255
        vpermq  ymm0, ymm0, 255
```

but LLVM produces:

```asm
        vpshufhw        ymm0, ymm0, 255
        vpbroadcastd    ymm1, dword ptr [rip + .LCPI15_0] ; 6
        vpermd  ymm0, ymm1, ymm0
```
See [all 16 cases written with C intrinsics, with GCC output for comparison](https://c.godbolt.org/z/r6ssxno7z) and [the same cases as direct LLVM IR](https://llvm.godbolt.org/z/eYvbrjvME).

Indices 12..15 are the most problematic, but index 8, which currently compiles to `vextracti128`+`vpbroadcastw`, would likely also be better off as `vpshuflw`+`vpermq`, so that only one of the two ops crosses lanes instead of both. The rest are fine either way, I think.
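
As a sanity check on the shuffle+`vpermq` idea, here is a minimal scalar model in Python of the three instructions involved, treating a ymm register as a list of 16 words. The function names simply mirror the mnemonics, and `broadcast_word` is a hypothetical helper, not anything from LLVM. The immediates it derives for indices other than 15 and 8 are an extrapolation of the same pattern and are not a claim about what the backend should pick in those cases.

```python
# Scalar model of a 256-bit register as 16 words (element 0 is the lowest word).
# Illustration only: names mirror the mnemonics, nothing here is LLVM code.

def vpshuflw(words, imm):
    """Per 128-bit lane: permute the low four words by imm, keep the high four."""
    out = list(words)
    for lane in (0, 8):
        for j in range(4):
            out[lane + j] = words[lane + ((imm >> (2 * j)) & 3)]
    return out

def vpshufhw(words, imm):
    """Per 128-bit lane: permute the high four words by imm, keep the low four."""
    out = list(words)
    for lane in (0, 8):
        for j in range(4):
            out[lane + 4 + j] = words[lane + 4 + ((imm >> (2 * j)) & 3)]
    return out

def vpermq(words, imm):
    """Cross-lane: each output qword is selected by a 2-bit field of imm."""
    out = []
    for j in range(4):
        src = (imm >> (2 * j)) & 3
        out.extend(words[4 * src : 4 * src + 4])
    return out

def broadcast_word(words, idx):
    """Hypothetical helper: splat words[idx] with one in-lane shuffle plus vpermq."""
    qword = idx // 4                     # which 64-bit chunk holds the element
    in_lane_imm = (idx % 4) * 0x55       # same 2-bit selector in all four fields
    shuffle = vpshufhw if qword % 2 else vpshuflw
    return vpermq(shuffle(words, in_lane_imm), qword * 0x55)

x = list(range(100, 116))
assert broadcast_word(x, 15) == [115] * 16   # vpshufhw 255, then vpermq 255
assert broadcast_word(x, 8) == [108] * 16    # vpshuflw 0, then vpermq 170
```

In this model the in-lane shuffle first fills the qword containing the wanted element with copies of it, and `vpermq` then replicates that qword across the register, so the only cross-lane step is the final `vpermq`.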
</pre>