<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/57966>57966</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [X86] Failure to merge cvtdq2ps nodes
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            backend:X86,
            llvm:SLPVectorizer
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          RKSimon
      </td>
    </tr>
</table>

<pre>
Pulled out of issue #54630.

We can probably get further with this in SLP once it has non-pow2 support. Separately, we should investigate whether the backend can easily merge cvtdq2ps nodes back together: we already generate cvtdq2ps for scalar conversions when the i32 originated on the FPU, so multiple scalar conversions may well have come from the same source vector.

```
#include <x86intrin.h>

__v4sf _mm_cvtepi32_ps(__v4si a)
{
    __v4sf res;
    for(int i = 0; i != 4; ++i)
        res[i] = a[i];
    return res;
}

void save_f4( float *arr, __v4si i4)
{
    __v4sf f4 = _mm_cvtepi32_ps(i4);
    f4 *= 1.23f;
    for(int i = 0; i != 3; ++i)
      arr[i] = f4[i];
}
```
Compiled with clang -g0 -O3:
```
_mm_cvtepi32_ps(int __vector(4)):
  cvtdq2ps %xmm0, %xmm0
  retq

save_f4(float*, int __vector(4)):
  pshufd $238, %xmm0, %xmm1 # xmm1 = xmm0[2,3,2,3]
  cvtdq2ps %xmm1, %xmm1
  cvtdq2ps %xmm0, %xmm0
  mulps .LCPI1_0(%rip), %xmm0
  movlps %xmm0, (%rdi)
  mulss .LCPI1_1(%rip), %xmm1
  movss %xmm1, 8(%rdi)
  retq
```

https://godbolt.org/z/84817sx63
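
For comparison, here is a hand-written sketch of the code we would presumably like to end up with: one cvtdq2ps and one mulps over the whole vector, followed by a split store of the low three lanes. The name save_f4_merged is hypothetical, and the instruction noted in each comment is what the intrinsic typically lowers to, not a guarantee. This only illustrates the target; it is not a proposed fix in either SLP or the backend.

```cpp
#include <emmintrin.h> // SSE2 intrinsics (also pulls in the SSE ones)

// Hypothetical hand-merged equivalent of save_f4 (illustration only).
void save_f4_merged(float *arr, __m128i i4)
{
    __m128 f4 = _mm_cvtepi32_ps(i4);              // one cvtdq2ps for all four lanes
    f4 = _mm_mul_ps(f4, _mm_set1_ps(1.23f));      // one mulps
    _mm_storel_pi((__m64 *)arr, f4);              // movlps: arr[0], arr[1]
    _mm_store_ss(arr + 2, _mm_movehl_ps(f4, f4)); // movhlps + movss: arr[2]
}
```

Lane 3 is converted and multiplied but never stored, so the function writes the same three values as the original loop.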
</pre>