<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/60857">60857</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[LoopVectorize] vectorizer generates unusual/unprofitable max iteration count check for masked load index
</td>
</tr>
<tr>
<th>Labels</th>
<td>
performance,
vectorization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
erikdesjardins
</td>
</tr>
</table>
<pre>
For the following IR:
```llvm
define void @xor_loop(ptr noalias %data, i64 %len, ptr noalias %key) {
start:
br label %body1
body1:
%i = phi i64 [ 0, %start ], [ %next, %body2 ]
%exit = icmp eq i64 %i, %len
br i1 %exit, label %ret, label %body2
body2:
%next = add i64 %i, 1
%masked = and i64 %i, 4095
%key_ptr = getelementptr inbounds [4096 x i8], ptr %key, i64 0, i64 %masked
%key_elem = load i8, ptr %key_ptr
%data_ptr = getelementptr inbounds [0 x i8], ptr %data, i64 0, i64 %i
%data_elem = load i8, ptr %data_ptr
%xor = xor i8 %key_elem, %data_elem
store i8 %xor, ptr %data_ptr
br label %body1
ret:
ret void
}
```
which is roughly equivalent to the following C:
```c
for (size_t i = 0; i < len; ++i)
data[i] ^= key[i % 4096];
```
The vectorizer (https://godbolt.org/z/4933MGM8f) generates the following check, which skips the vector loop whenever the iteration count is greater than 4096:
```llvm
vector.scevcheck: ; preds = %body2.preheader
%0 = add i64 %len, -1
%1 = icmp ugt i64 %0, 4095
br i1 %1, label %scalar.ph, label %vector.ph
```
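For reference, here is a minimal C model of that guard. The function name `takes_scalar_path` is made up for illustration, and `unsigned long long` stands in for `i64`. The `len == 0` case is included only to show the unsigned wraparound; judging by the `preds` comment, the check is only reached from `%body2.preheader`, i.e. when the loop runs at least once.
```c
/* Hypothetical helper modelling vector.scevcheck.
   Returns 1 when the branch goes to %scalar.ph, 0 when it goes to %vector.ph. */
static int takes_scalar_path(unsigned long long len) {
    /* %0 = add i64 %len, -1   (wraps to the maximum value when len == 0) */
    unsigned long long len_minus_one = len - 1;
    /* %1 = icmp ugt i64 %0, 4095 */
    return len_minus_one > 4095ULL;
}
```
So `len` in 1..4096 reaches the vector loop, and every larger `len` is routed to the scalar loop.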
While this technically lets the vectorizer save one instruction in the vector loop, it means that any call with more than 4096 iterations is handled entirely by the scalar loop, which seems suboptimal.
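As a possible source-level workaround (a sketch only; I have not checked what the vectorizer does with it), the loop can be split into chunks of at most 4096 elements so that the key index never wraps inside the inner loop. The name `xor_loop_chunked` and the `KEY_LEN` constant are made up for illustration, and `unsigned long long` is used in place of `size_t` to keep the snippet free of includes.
```c
#define KEY_LEN 4096ULL

/* Hypothetical rewrite; computes the same result as data[i] ^= key[i % 4096]. */
void xor_loop_chunked(unsigned char *restrict data, unsigned long long len,
                      const unsigned char *restrict key) {
    unsigned long long base = 0;
    while (base < len) {
        /* Process at most KEY_LEN elements per chunk. */
        unsigned long long remaining = len - base;
        unsigned long long n = remaining < KEY_LEN ? remaining : KEY_LEN;
        /* base is a multiple of 4096, so (base + j) % 4096 == j for j < n. */
        for (unsigned long long j = 0; j < n; ++j)
            data[base + j] ^= key[j];
        base += n;
    }
}
```
The inner loop indexes `key` directly with its own induction variable, so it should not need a wrap check on the masked index, although each chunk still pays the vector loop's setup cost.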
</pre>