<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/63946>63946</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Questionable codegen for shuffles + combinations (x86)
</td>
</tr>
<tr>
<th>Labels</th>
<td>
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
DenisYaroshevskiy
</td>
</tr>
</table>
<pre>
Hi!
The code: https://godbolt.org/z/8n64TrnqK
Clang introduced a lot of instruction pairs like
```
vextracti128 xmm6, ymm3, 1
vpackssdw xmm3, xmm3, xmm6
```
and other extra instructions. It is possible that I don't understand why this is an optimization, but it looks suspicious.
The code, pasted here for reference:
```
#include "immintrin.h"
__m256i has_equal_in_u32(__m256i a, __m256i b0) {
__m256i b1 = _mm256_shuffle_epi32(b0, 57); // [1,2,3,0,5,6,7,4]
__m256i b2 = _mm256_shuffle_epi32(b0, 78); // [2,3,0,1,6,7,4,5]
__m256i b3 = _mm256_shuffle_epi32(b1, 78);
__m256i b4 = _mm256_permute4x64_epi64(b0, 78); // [2,3,0,1]
__m256i b5 = _mm256_permute4x64_epi64(b1, 78);
__m256i b6 = _mm256_permute4x64_epi64(b2, 78);
__m256i b7 = _mm256_permute4x64_epi64(b3, 78);
b0 = _mm256_cmpeq_epi32(a, b0);
b1 = _mm256_cmpeq_epi32(a, b1);
b2 = _mm256_cmpeq_epi32(a, b2);
b3 = _mm256_cmpeq_epi32(a, b3);
b4 = _mm256_cmpeq_epi32(a, b4);
b5 = _mm256_cmpeq_epi32(a, b5);
b6 = _mm256_cmpeq_epi32(a, b6);
b7 = _mm256_cmpeq_epi32(a, b7);
b0 = _mm256_or_si256(b0, b1);
b1 = _mm256_or_si256(b2, b3);
b2 = _mm256_or_si256(b4, b5);
b3 = _mm256_or_si256(b6, b7);
b0 = _mm256_or_si256(b0, b1);
b1 = _mm256_or_si256(b2, b3);
return _mm256_or_si256(b0, b1);
}
```
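For comparison, here is a minimal scalar sketch of what the intrinsics above appear to compute: the in-lane rotations (b0..b3) together with the half-swapped copies (b4..b7) compare every element of a against every element of b, so each output lane is all-ones when the matching lane of a occurs anywhere in b. The name has_equal_in_u32_scalar and the array-based signature are hypothetical, chosen only for illustration; they are not part of the original report.
```
#include "stdint.h"

/* Hypothetical scalar model of has_equal_in_u32 (illustration only).
   out[i] is all-ones if a[i] equals any of the 8 elements of b,
   and zero otherwise. */
void has_equal_in_u32_scalar(const uint32_t a[8], const uint32_t b[8],
                             uint32_t out[8]) {
  for (int i = 0; i != 8; ++i) {
    uint32_t mask = 0;
    for (int j = 0; j != 8; ++j) {
      if (a[i] == b[j]) {
        mask = UINT32_MAX; /* same value a true cmpeq_epi32 lane holds */
      }
    }
    out[i] = mask;
  }
}
```
A model like this can be used to check that any alternative codegen still produces the same per-lane masks.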
</pre>