<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/54535>54535</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
avoid libcall to memcpy harder
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:X86,
llvm:optimizations,
missed-optimization
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
nickdesaulniers
</td>
</tr>
</table>
<pre>
Via [this thread](https://lore.kernel.org/lkml/YjxTt3pFIcV3lt8I@zn.tnic/):
Consider the following example:
```c
struct foo {
    unsigned long x0;
    unsigned long x1;
    unsigned long x2;
    unsigned long x3;
    unsigned long x4;
    unsigned long x5;
    unsigned long x6;
    unsigned long x7;
    unsigned long x8;
    unsigned long x9;
    unsigned long x10;
    unsigned long x11;
    unsigned long x12;
    unsigned long x13;
    unsigned long x14;
    unsigned long x15;
    // Comment out the members below to shrink the struct to 16 members.
    unsigned long x16;
    unsigned long x17;
    unsigned long x18;
    unsigned long x19;
} *x, *y;
struct foo* get_x(void);
struct foo* cpy(struct foo *y) {
    struct foo *x = get_x();
    if (y != x)
        *x = *y;
    return x;
}
```
Compiled with `-O2 -mno-sse` (as the Linux kernel does), we get:
```asm
cpy:
    ...
    movl $160, %edx
    movq %rbx, %rdi
    movq %r14, %rsi
    callq memcpy@PLT
    ...
```
But if we reduce the number of members in `struct foo` to 16, we can get:
```asm
cpy:
    ...
    movl $16, %ecx
    movq %rax, %rdi
    movq %rbx, %rsi
    rep;movsq (%rsi), %es:(%rdi)
    ...
```
which is going to be way faster. FWICT, isel is choosing whether to lower `@llvm.memcpy.p0i8.p0i8.i64()` to a libcall to `memcpy` or to an inline `rep;movsq` copy.
I assume there's some limit on how many bytes `rep;movsq` can copy, but surely it's much larger than 16 x 8 B?
cc @phoebewang
</pre>