<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/69895">69895</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Missed optimization: duplicate vector register initialization for __builtin_memset_inline
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          dvyukov
      </td>
    </tr>
</table>

<pre>
    The code is:
```
void mymemset(char* dst, int c, unsigned long n) {
  __builtin_memset_inline(dst, c, 32);
  for (unsigned long i = 0; i < n; i += 32)
    __builtin_memset_inline(dst + 32, c, 32);
  __builtin_memset_inline(dst + n - 32, c, 32);
}
```

[Clang generates](https://godbolt.org/z/oeeozvs1q):
```
mymemset(char*, int, unsigned long):
        vmovd   %esi, %xmm0
        vpbroadcastb %xmm0, %ymm0
        vmovdqu %ymm0, (%rdi)
        testq   %rdx, %rdx
        je      .LBB0_3
        leaq    32(%rdi), %rax
        xorl    %ecx, %ecx
        vmovd   %esi, %xmm0
        vpbroadcastb %xmm0, %ymm0
.LBB0_2:                                # =>This Inner Loop Header: Depth=1
        vmovdqu %ymm0, (%rax)
        addq    $32, %rcx
        cmpq    %rdx, %rcx
        jb      .LBB0_2
.LBB0_3:
        vmovd   %esi, %xmm0
        vpbroadcastb %xmm0, %ymm0
        vmovdqu %ymm0, -32(%rdi,%rdx)
        vzeroupper
        retq
```

The second and third vmovd/vpbroadcastb pairs are redundant: ymm0 already holds the broadcast value from the first pair, and nothing in between overwrites it.
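
For comparison, here is a minimal sketch of a source-level variant that builds the 32-byte splat once and reuses it for all three stores. It assumes the GCC/Clang `vector_size` extension; the names `v32u8` and `mymemset_hoisted` are hypothetical and not part of the report. Whether Clang then emits a single vmovd/vpbroadcastb pair has not been verified here, so treat it as a candidate workaround to try, not a confirmed one.
```
#include <string.h>

/* 32-byte vector type via the GCC/Clang vector_size extension.
   Assumption: the original report only uses __builtin_memset_inline. */
typedef unsigned char v32u8 __attribute__((vector_size(32)));

/* Hypothetical variant of mymemset: the splat is materialized once. */
void mymemset_hoisted(char *dst, int c, unsigned long n) {
  v32u8 v;
  memset(&v, c, sizeof v);              /* build the byte splat a single time */
  memcpy(dst, &v, sizeof v);            /* head store */
  for (unsigned long i = 0; i < n; i += 32)
    memcpy(dst + 32, &v, sizeof v);     /* same address as the report's loop */
  memcpy(dst + n - 32, &v, sizeof v);   /* tail store */
}
```
The loop deliberately mirrors the reproducer, including the store to `dst + 32` on every iteration, so the generated code stays comparable with the listing above.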
</pre>