<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/120339">120339</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Redundant Copying of Large Struct Parameter to Stack When Passed to Another Function
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
jonathan-gruber-jg
</td>
</tr>
</table>
<pre>
When a function receives a large struct by value and passes that same struct by value to another function, Clang redundantly copies the struct parameter to the stack before the call.
A minimal test case is in the attached file test.c.txt (GitHub would not allow me to upload it with the .c extension, sadly), reproduced below for your convenience:
```
struct S {
    void *x, *y, *z, *w;
};

extern int extern_func(struct S);

int tail_call(struct S x) {
    return extern_func(x);
}

int non_tail_call(struct S x) {
    return ~extern_func(x);
}
```
I only tested the target architectures x86_64, aarch64, and riscv64, but I would not be surprised if other target architectures exhibit the same inefficiency.
Host system: Arch Linux, x86_64.
Clang version: official Arch Linux package of clang, version 18.1.8-4.
Command line to reproduce results: clang -c test.c --target=&lt;arch&gt; -O&lt;opt-level&gt;
x86_64 assembly (Intel syntax), with -Oz, -Os, -O2, or -O3
```
tail_call:
push rbp
mov rbp,rsp
pop rbp
jmp extern_func
non_tail_call:
push rbp
mov rbp,rsp
sub rsp,0x20
movaps xmm0,XMMWORD PTR [rbp+0x10]
movaps xmm1,XMMWORD PTR [rbp+0x20]
movups XMMWORD PTR [rsp+0x10],xmm1
movups XMMWORD PTR [rsp],xmm0
call extern_func
not eax
add rsp,0x20
pop rbp
ret
```
aarch64 assembly, with -Oz, -Os, -O2, or -O3
```
tail_call:
sub sp, sp, #0x30
stp x29, x30, [sp, #32]
add x29, sp, #0x20
ldp q0, q1, [x0]
mov x0, sp
stp q0, q1, [sp]
bl extern_func
ldp x29, x30, [sp, #32]
add sp, sp, #0x30
ret
non_tail_call:
sub sp, sp, #0x30
stp x29, x30, [sp, #32]
add x29, sp, #0x20
ldp q0, q1, [x0]
mov x0, sp
stp q0, q1, [sp]
bl extern_func
mvn w0, w0
ldp x29, x30, [sp, #32]
add sp, sp, #0x30
ret
```
riscv64 assembly, with -Oz, -Os, -O2, or -O3
```
tail_call:
addi sp,sp,-48
sd ra,40(sp)
ld a1,24(a0)
ld a2,16(a0)
ld a3,8(a0)
ld a0,0(a0)
sd a1,32(sp)
sd a2,24(sp)
sd a3,16(sp)
sd a0,8(sp)
addi a0,sp,8
auipc ra,0x0
jalr ra # extern_func
ld ra,40(sp)
addi sp,sp,48
ret
non_tail_call:
addi sp,sp,-48
sd ra,40(sp)
ld a1,24(a0)
ld a2,16(a0)
ld a3,8(a0)
ld a0,0(a0)
sd a1,32(sp)
sd a2,24(sp)
sd a3,16(sp)
sd a0,8(sp)
addi a0,sp,8
auipc ra,0x0
jalr ra # extern_func
not a0,a0
ld ra,40(sp)
addi sp,sp,48
ret
```
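Only the x86_64 tail call is optimized more or less correctly, and even there the frame-pointer setup and teardown (push rbp / mov rbp,rsp / pop rbp) before the unconditional jump to extern_func is pointless. Every other case copies the struct to the stack before calling extern_func.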
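To make the expected behaviour concrete, here is a rough C-level sketch of the aarch64 and riscv64 cases. In both listings the struct arrives as a pointer (in x0 and a0 respectively). The sketch models that with an explicit pointer parameter. All three function names are hypothetical and not part of the test case. It also assumes the callee is allowed to overwrite the copy it is handed, which is my understanding of these calling conventions rather than something verified here.

```c
struct S {
    void *x, *y, *z, *w;
};

/* Hypothetical stand-in for extern_func, taking an explicit pointer to
 * model indirect passing of a large struct. It counts the non-null
 * fields, then overwrites one, as a callee is (by assumption) allowed
 * to do with its own by-value parameter. */
static int callee_indirect(struct S *p) {
    int r = (p->x != 0) + (p->y != 0) + (p->z != 0) + (p->w != 0);
    p->x = 0;
    return r;
}

/* Models the reported code: copy the incoming struct to a fresh stack
 * slot and pass the address of that copy. */
static int forward_with_copy(struct S *incoming) {
    struct S tmp = *incoming;
    return callee_indirect(&tmp);
}

/* Models the requested code: the parameter is not used again after the
 * call, so the incoming storage is forwarded without a copy. */
static int forward_without_copy(struct S *incoming) {
    return callee_indirect(incoming);
}
```

Both forwarders return the same value. The only observable difference is that the second lets the callee clobber the incoming storage, which should be harmless in tail_call and non_tail_call because neither reads x after the call.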
Please let me know if I should include anything else in this bug report.
[test.c.txt](https://github.com/user-attachments/files/18172937/test.c.txt)
</pre>