<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/88137">88137</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Missed vectorization in OpenCL
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
natanelh-mobileye
</td>
</tr>
</table>
<pre>
A lane-by-lane conversion in OpenCL (`my_convert_char16`) is not being vectorized, and the cause looks odd.
[reproducer](https://godbolt.org/z/8W3sqnWsq)
[how it should behave](https://godbolt.org/z/q6xsrMPY4)
After running `clang` with `--print-after-all`, it looks like InstCombine is what breaks it. Here is the diff:
[godbolt](https://godbolt.org/z/eKGM9We41)
```
; *** IR Dump After PromotePass on my_convert_char16 ***
; Function Attrs: convergent norecurse nounwind optsize
define dso_local <16 x i8> @my_convert_char16(<16 x i16> noundef %0) local_unnamed_addr #0 {
%2 = extractelement <16 x i16> %0, i32 0
%3 = trunc i16 %2 to i8
%4 = insertelement <16 x i8> undef, i8 %3, i32 0
%5 = extractelement <16 x i16> %0, i32 1
%6 = trunc i16 %5 to i8
%7 = insertelement <16 x i8> %4, i8 %6, i32 1
%8 = extractelement <16 x i16> %0, i32 2
%9 = trunc i16 %8 to i8
%10 = insertelement <16 x i8> %7, i8 %9, i32 2
{ ... }
}
; *** IR Dump After InstCombinePass on my_convert_char16 ***
; Function Attrs: convergent norecurse nounwind optsize
define dso_local <16 x i8> @my_convert_char16(<16 x i16> noundef %0) local_unnamed_addr #0 {
%2 = bitcast <16 x i16> %0 to <32 x i8>
%3 = extractelement <32 x i8> %2, i64 0
%4 = insertelement <16 x i8> <i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef>, i8 %3, i64 0
%5 = bitcast <16 x i16> %0 to <32 x i8>
%6 = extractelement <32 x i8> %5, i64 2
%7 = insertelement <16 x i8> %4, i8 %6, i64 1
%8 = bitcast <16 x i16> %0 to <32 x i8>
%9 = extractelement <32 x i8> %8, i64 4
%10 = insertelement <16 x i8> %7, i8 %9, i64 2
{ ... }
ret <16 x i8> %49
}
```
(Note the difference in `%4`.)
`InstCombinePass` rewrites `insertelement <16 x i8> undef, i8 %3, i32 0` into `insertelement <16 x i8> <i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef, i8 undef>, i8 %3, i64 0`, i.e. the base vector becomes 10 poison lanes followed by 6 undef lanes, for no apparent reason.
This seems to hinder the vectorization that should follow.
(Unfortunately, godbolt does not offer any stable version above 15.)
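
For anyone reading the second dump: the byte indices 0, 2, 4, ... after the `bitcast` to `<32 x i8>` pick the low byte of each `i16` lane, so the values computed still match the original per-lane `trunc`. Below is a small Python model of that equivalence. It is only a sketch and not part of the reproducer: the function names are made up for illustration, and a little-endian target is assumed.

```python
import struct

LANES = 16  # matches <16 x i16> in the dumps above


def trunc_lanes(values):
    """Model `trunc i16 ... to i8` applied per lane (keep the low 8 bits)."""
    return [v & 0xFF for v in values]


def bitcast_even_bytes(values):
    """Model `bitcast <16 x i16> ... to <32 x i8>` followed by extracting
    byte lanes 0, 2, 4, ... (ASSUMPTION: little-endian memory layout)."""
    raw = struct.pack("<%dH" % len(values), *values)  # 2 bytes per i16 lane
    return [raw[2 * i] for i in range(len(values))]


# Arbitrary 16-bit sample values, one per lane.
sample = [(0x1234 + 0x0101 * i) & 0xFFFF for i in range(LANES)]
assert trunc_lanes(sample) == bitcast_even_bytes(sample)
```

If that model is right, the InstCombine output is still semantically a lane-by-lane truncation, so the missed vectorization would come from the changed shape of the chain (including the mixed poison/undef base vector in `%4`) rather than from different values.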
</pre>