<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/72803>72803</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Unnecessary roundtrip through avx512 vector registers for integer mask
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
jhorstmann
</td>
</tr>
</table>
<pre>
I have the following Rust code and was hoping that, for an `avx512` target, the autovectorizer could use the integer mask directly by moving it into a `k` register.
```rust
pub fn add_masked(a: [i32; 8], m: u64, b: [i32; 8]) -> [i32; 8] {
    let mut res = [0; 8];
    let mut bit = 1;
    let mut i = 0;
    while i < 8 {
        let x = if m & bit != 0 {
            b[i]
        } else {
            0
        };
        res[i] = a[i].wrapping_add(x);
        i += 1;
        bit <<= 1;
    }
    res
}
```
The generated code is not bad, but it takes an unnecessary roundtrip through vector registers, broadcasting the mask and testing it against constants:
```asm
.LCPI0_0:
        .quad   1
        .quad   2
        .quad   4
        .quad   8
        .quad   16
        .quad   32
        .quad   64
        .quad   128
example::add_masked:
        mov     rax, rdi
        vpbroadcastq    zmm0, rdx
        vptestmq        k1, zmm0, zmmword ptr [rip + .LCPI0_0]
        vmovdqu ymm0, ymmword ptr [rsi]
        vpaddd  ymm0 {k1}, ymm0, ymmword ptr [rcx]
        vmovdqu ymmword ptr [rdi], ymm0
        vzeroupper
        ret
```
[Compiler Explorer](https://rust.godbolt.org/z/5Pz1e35eq)
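To illustrate why the roundtrip looks redundant, here is a minimal scalar sketch of what the `vpbroadcastq` + `vptestmq` pair computes, assuming the lane constants from `.LCPI0_0` above. The function names `k1_via_broadcast_and_test` and `k1_via_direct_move` are hypothetical and only for illustration. This is a model of the instruction semantics, not what LLVM does internally.

```rust
/// Lane constants from `.LCPI0_0`: one distinct bit per 64-bit lane.
const LANE_BITS: [u64; 8] = [1, 2, 4, 8, 16, 32, 64, 128];

/// Scalar model of `vpbroadcastq zmm0, rdx` followed by
/// `vptestmq k1, zmm0, [.LCPI0_0]`: bit `i` of the result is set
/// when `(lane[i] & LANE_BITS[i]) != 0`.
pub fn k1_via_broadcast_and_test(m: u64) -> u8 {
    let broadcast = [m; 8]; // every lane holds the full mask
    let mut k1 = 0u8;
    for (i, (lane, bit)) in broadcast.iter().zip(LANE_BITS.iter()).enumerate() {
        if lane & bit != 0 {
            k1 |= 1 << i;
        }
    }
    k1
}

/// Hypothetical direct alternative: keep only the low 8 bits of the
/// integer mask, as a plain move into a `k` register would.
pub fn k1_via_direct_move(m: u64) -> u8 {
    m as u8
}
```

In this model both functions agree for every `m`, because lane `i` is tested against bit `i` only, which is the basis for expecting a plain move into `k1` to be sufficient.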
I also tried to reproduce the issue with `clang`, but it does not vectorize the C code. GCC manages to vectorize it, but the result is slightly worse than the Rust version.
</pre>