<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/87189>87189</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Miscompilation of lbzip2 after loop-vectorize pass for avx512
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
AngryLoki
</td>
</tr>
</table>
<pre>
Hi, it was originally reported in https://bugs.gentoo.org/910438 that lbzip2 compiled by clang for AVX-512 platforms produces corrupted bz2 files.
I investigated, with the following results:
1) After adding `-fsanitize=undefined,address`, the issue disappears entirely (i.e. the produced files are correct and the sanitizers report nothing).
2) The minimal flags to reproduce with clang are `-mavx512f -fvectorize -O1`.
3) With OptBisect I see that the pass that breaks the code is LoopVectorizePass. If I compile all C files to LLVM bitcode with `-O0` and apply `opt -passes=loop-vectorize -mcpu=znver4` to [encode.c](https://github.com/kjn/lbzip2/blame/master/src/encode.c), the resulting bz2 files are corrupted:
```bash
clang -c -emit-llvm -mavx512f -O1 -DHAVE_CONFIG_H -Isrc -Ilib src/encode.c -o src/encode.o
opt -passes=loop-vectorize -mcpu=znver4 src/encode.o -o src/encode.opt.o
```
4) If I add `__attribute__ ((optnone))` to all functions in [encode.c](https://github.com/kjn/lbzip2/blame/master/src/encode.c), the code is correct. To reproduce the issue, it is enough to enable optimizations only for `assign_codes` and `generate_prefix_code`.
5) No issues with clang/avx256 or gcc/avx512.
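One way to narrow a miscompile like the one in item 4 down to a single function is to keep an `optnone` copy next to the optimized one and compare their results. The following is only a minimal sketch, not lbzip2 code: `NO_OPT`, `sum_lengths` and `sum_lengths_reference` are hypothetical names, and the loop is just a stand-in for a vectorizable loop.

```c
#include <stdint.h>

/* optnone is a Clang attribute; fall back to an empty macro elsewhere.
 * NO_OPT is a hypothetical helper name, not something lbzip2 defines. */
#if defined(__has_attribute)
#  if __has_attribute(optnone)
#    define NO_OPT __attribute__((optnone))
#  endif
#endif
#ifndef NO_OPT
#  define NO_OPT
#endif

/* Reference copy: excluded from optimization (and so from loop-vectorize)
 * when the compiler supports optnone. */
NO_OPT uint32_t sum_lengths_reference(const uint8_t *len, unsigned n)
{
    uint32_t total = 0;
    for (unsigned i = 0; i < n; i++)
        total += len[i];
    return total;
}

/* Same loop, left open to the optimizer. */
uint32_t sum_lengths(const uint8_t *len, unsigned n)
{
    uint32_t total = 0;
    for (unsigned i = 0; i < n; i++)
        total += len[i];
    return total;
}
```

If the two functions disagree on the same input when built with `-mavx512f -fvectorize -O1`, the optimized copy is the one affected.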
I attach a partially compiled lbzip2.bc and a test file `compressme` here: [lbzip2.zip](https://github.com/llvm/llvm-project/files/14815150/lbzip2.zip). The resulting binary breaks after loop-vectorize with almost any input file:
```bash
# with llvm release 17 or 18
opt -passes=loop-vectorize -mcpu=znver4 lbzip2.bc -o lbzip2.opt.bc # <--
clang -O0 lbzip2.opt.bc -o lbzip2
./lbzip2 -z -k compressme
bzip2 -d -t compressme.bz2
> bzip2: ../environment.bz2: data integrity (CRC) error in data
> You can use the `bzip2recover' program to attempt to recover
> data from undamaged sections of corrupted files.
```
</pre>