<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/57476>57476</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Unprofitable ctpop vectorization
</td>
</tr>
<tr>
<th>Labels</th>
<td>
vectorization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
nikic
</td>
</tr>
</table>
<pre>
Example from https://github.com/rust-lang/rust/issues/101060:
```
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
define i64 @test(ptr %arr) {
entry:
br label %loop
loop:
%accum = phi i64 [ %accum.next, %loop ], [ 0, %entry ]
%iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
%iv.next = add nuw i64 %iv, 1
%gep = getelementptr inbounds i64, ptr %arr, i64 %iv
%value = load i64, ptr %gep, align 8
%ctpop = tail call i64 @llvm.ctpop.i64(i64 %value)
%accum.next = add i64 %accum, %ctpop
%exitcond = icmp eq i64 %iv.next, 2
br i1 %exitcond, label %exit, label %loop
exit:
%lcssa = phi i64 [ %accum.next, %loop ]
ret i64 %lcssa
}
declare i64 @llvm.ctpop.i64(i64)
```
This two-iteration loop gets vectorized by `opt -loop-vectorize -mcpu=znver2` (https://llvm.godbolt.org/z/avnh9KTh3): the cost model assigns cost 1 to the scalar ctpop and cost 3 to the vector ctpop, so vectorization is nominally "profitable". At least for low iteration counts, this is not actually the case.
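
For context, a loop of this shape can come from Rust code that sums `count_ones` over a two-element array. The following is only a minimal sketch and is not taken from the linked Rust issue: the function name `popcount_sum` is hypothetical, and the IR that rustc actually emits for it may differ from the reduced example above.
```rust
/// Hypothetical source-level counterpart of the reduced IR above:
/// sums the population counts of a fixed two-element array.
pub fn popcount_sum(arr: &[u64; 2]) -> u64 {
    let mut accum = 0u64;
    for value in arr {
        // `count_ones` returns u32; widen to u64 before accumulating.
        accum += u64::from(value.count_ones());
    }
    accum
}
```
With a trip count fixed at 2, the scalar form needs only two popcounts and one add, which is the case the report argues should not be vectorized.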
</pre>