<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/57476">57476</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Unprofitable ctpop vectorization
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            vectorization
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          nikic
      </td>
    </tr>
</table>

<pre>
    Example from https://github.com/rust-lang/rust/issues/101060:
```
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i64 @test(ptr %arr) {
entry:
  br label %loop

loop:
  %accum = phi i64 [ %accum.next, %loop ], [ 0, %entry ]
  %iv = phi i64 [ %iv.next, %loop ], [ 0, %entry ]
  %iv.next = add nuw i64 %iv, 1
  %gep = getelementptr inbounds i64, ptr %arr, i64 %iv
  %value = load i64, ptr %gep, align 8
  %ctpop = tail call i64 @llvm.ctpop.i64(i64 %value)
  %accum.next = add i64 %accum, %ctpop
  %exitcond = icmp eq i64 %iv.next, 2
  br i1 %exitcond, label %exit, label %loop

exit:
  %lcssa = phi i64 [ %accum.next, %loop ]
  ret i64 %lcssa
}

declare i64 @llvm.ctpop.i64(i64)
```

This two-iteration loop gets vectorized by `opt -loop-vectorize -mcpu=znver2` (https://llvm.godbolt.org/z/avnh9KTh3), because we assign a cost of 1 to the scalar ctpop and a cost of 3 to the vector ctpop, so vectorization is nominally "profitable". At least for low iteration counts, this is not actually the case.
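
For reference, a source-level loop of roughly this shape can be sketched in Rust. This is an assumption for illustration, not the code from the linked Rust issue: `popcount_sum` is a hypothetical name, and the sketch relies on `u64::count_ones` being lowered to `llvm.ctpop.i64`, with the fixed array length of 2 giving the two-iteration trip count.

```rust
// Hypothetical source-level counterpart of @test above (illustration only).
// The array length of 2 mirrors the loop's exit condition `%iv.next == 2`.
fn popcount_sum(arr: [u64; 2]) -> u64 {
    let mut accum: u64 = 0;
    for value in arr {
        // count_ones() returns u32; widen it to match the i64 accumulator.
        accum += u64::from(value.count_ones());
    }
    accum
}

fn main() {
    // 0b1011 has 3 set bits, u64::MAX has 64.
    println!("{}", popcount_sum([0b1011, u64::MAX]));
}
```

With only two elements, a scalar version is two popcounts and one add, which is the baseline any vectorized form has to beat.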
</pre>