<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/55646>55646</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            _mm256_permute4x64_epi64 generates vpermpd instead of vpermq
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          moon-chilled
      </td>
    </tr>
</table>

<pre>
The following function:

```c
__m256i f(__m256i x) { return _mm256_permute4x64_epi64(x,0x4e); }
```

Compiles to:

```asm
vpermpd ymm0,ymm0,0x4e
```

It should generate vpermq instead. The __m256i type is a hint that 'x' is being used for integer operations, so a floating-point instruction is more likely to incur a domain-crossing (bypass) penalty when the surrounding code is integer.

GCC generates vpermq for this snippet, and will generate vpermpd instead if __m256d and _mm256_permute4x64_pd are used; this is ideal behaviour.
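
For reference, here is a minimal scalar sketch of what the immediate 0x4e selects. It is only a model of the lane selection (the helper name `permute4x64_model` is hypothetical, not an intrinsic), assuming the usual encoding where each 2-bit field of the immediate picks the source lane for one destination lane. Under that assumption vpermq and vpermpd move the same bits, so the choice between them should only matter for the execution domain:

```c
#include <stdint.h>

/* Hypothetical scalar model of a 4 x 64-bit lane permute driven by an
   8-bit immediate. Bits [2*i+1 : 2*i] of imm select the source lane for
   destination lane i. dst must not alias src. */
static void permute4x64_model(const uint64_t src[4], uint8_t imm,
                              uint64_t dst[4])
{
    for (int i = 0; i < 4; ++i) {
        dst[i] = src[(imm >> (2 * i)) & 3];
    }
}
```

With imm = 0x4e (binary 01 00 11 10) the selected source lanes are 2, 3, 0, 1, i.e. the two 128-bit halves are swapped.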
</pre>