<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/86499>86499</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
`memmove` is emitted where `memcpy` is valid
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
folkertdev
</td>
</tr>
</table>
<pre>
minimal example:
https://godbolt.org/z/baK6b13Ga
```c
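#include "inttypes.h"

typedef struct Foo {
    uint8_t x1[127];
} Foo;

// Copies through a local temporary: this version ends up as a memmove call.
void bad(Foo* ptr, int dst, int src) {
    Foo x = ptr[src];
    ptr[dst] = x;
}

// Direct element assignment: the clang frontend emits a memcpy here.
void good(Foo* ptr, int dst, int src) {
    ptr[dst] = ptr[src];
}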
```
The `bad` function in this example emits a memmove instead of the more optimal memcpy.
```asm
bad: # @bad
movsxd rcx, edx
mov rax, rcx
shl rax, 7
sub rax, rcx
add rax, rdi
movsxd rcx, esi
mov rdx, rcx
shl rdx, 7
sub rdx, rcx
add rdi, rdx
mov edx, 127
mov rsi, rax
jmp memmove@PLT
```
On x86_64 this is merely inefficient, but on embedded targets, emitting a `memmove` call means that `memmove` must be linked into the binary. It can constitute a significant percentage of the total binary size, so we'd prefer not to emit it. `memcpy` is generally unavoidable, and therefore already part of the binary.
On x86_64 a large type (127 bytes) is required for this issue to become visible; for smaller types the `memmove` is inlined. But on 32-bit targets without vector instructions, even a 3-byte value will get `memmove`d: https://godbolt.org/z/3vhK34bvo.
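For illustration, a small-type variant of `bad` could look roughly like the sketch below. This is not the code behind the Godbolt link above; the names `Small` and `copy_small` are made up here, and whether a `memmove` call is actually emitted depends on the target and optimization flags.

```c
#include "stdint.h"

/* Hypothetical 3-byte type; the name is illustrative only. */
typedef struct Small {
    uint8_t bytes[3];
} Small;

/* Same shape as `bad`: copy through a local temporary. */
void copy_small(Small *ptr, int dst, int src) {
    Small x = ptr[src];
    ptr[dst] = x;
}
```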
The `good` function avoids this issue because the clang frontend emits a memcpy. In this example the memcpy is eventually inlined as a series of SSE instructions.
I'd argue that in the `bad` function, because the `src` and `dst` pointers are both `getelementptr`'d from the same pointer with the same element type, they either refer to the same element or to two disjoint elements. That satisfies the conditions of LLVM's memcpy ([the pointers don't overlap OR src == dst](https://llvm.org/docs/LangRef.html#int-memcpy)), hence LLVM should in theory be able to emit a memcpy here.
Recognizing this pattern is more important for languages like Rust, where array access (by default) involves a bounds check. `rustc` is not currently capable of emitting just a single memcpy like clang.
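The equal-or-disjoint argument can be sketched in C as follows. Both `equal_or_disjoint` and `copy_elem` are hypothetical helpers written for this illustration. `copy_elem` is one possible source-level workaround: it skips the `dst == src` case, because C's own `memcpy` (unlike the LLVM intrinsic) does not permit fully overlapping arguments. It has not been checked on any particular embedded target.

```c
#include "stdint.h"
#include "string.h"

typedef struct Foo {
    uint8_t x1[127];
} Foo;

/* Hypothetical helper: returns 1 when the two objects are the same object
   or their byte ranges do not overlap, 0 otherwise. */
int equal_or_disjoint(const Foo *a, const Foo *b) {
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return pa == pb || pb >= pa + sizeof(Foo) || pa >= pb + sizeof(Foo);
}

/* Hypothetical workaround: two distinct elements of one array never
   overlap, so an explicit memcpy is valid once dst != src is known. */
void copy_elem(Foo *ptr, int dst, int src) {
    if (dst != src) {
        memcpy(ptr + dst, ptr + src, sizeof(Foo));
    }
}
```

For any two valid indices into the same array, `equal_or_disjoint(ptr + dst, ptr + src)` holds, which is the property the optimizer would need to establish on its own.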
</pre>