<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/76017>76017</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Clang generates incorrect code for unions of types with padding
</td>
</tr>
<tr>
<th>Labels</th>
<td>
clang
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
ivafanas
</td>
</tr>
</table>
<pre>
Reproducer:
```c++
#include <cstdio>
#include <cstdint>
struct my_struct_1 {
float a;
float b;
};
struct my_struct_2 {
float x;
double y;
};
union my_union {
my_struct_1 s1;
my_struct_2 s2;
};
my_union my_func() {
my_union u;
u.s1 = my_struct_1{100.f, 200.f};
return u;
}
int main() {
my_union u = my_func();
if (u.s1.a != 100.f)
std::puts("a ooops");
if (u.s1.b != 200.f)
std::puts("b ooops");
return 0;
}
```
Note that `my_struct_1::b` occupies bytes that are padding in `my_struct_2`, and that `sizeof(my_struct_2) > sizeof(my_struct_1)`.
The bug reproduces on x86 with the clang 16 release and with the latest main branch (clang++-17 was not tested, but I assume it behaves the same):
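The layout claim above can be checked at compile time. The following is a minimal sketch, not part of the reproducer; the helper names (`b_begin`, `pad_begin`, `b_in_padding`, ...) are made up for illustration, and `b_in_padding` is only expected to be true on ABIs where `double` is 8-byte aligned (for example x86-64 System V):
```c++
#include <cstddef>

struct my_struct_1 { float a; float b; };
struct my_struct_2 { float x; double y; };
union my_union { my_struct_1 s1; my_struct_2 s2; };

// Byte range [b_begin, b_end) occupied by my_struct_1::b.
constexpr std::size_t b_begin = offsetof(my_struct_1, b);
constexpr std::size_t b_end = b_begin + sizeof(float);

// Padding of my_struct_2: the bytes between the end of x and the start of y.
constexpr std::size_t pad_begin = offsetof(my_struct_2, x) + sizeof(float);
constexpr std::size_t pad_end = offsetof(my_struct_2, y);

// True when b is stored entirely inside the padding of my_struct_2.
// This depends on the target ABI, so it is not asserted here.
constexpr bool b_in_padding = pad_begin <= b_begin && b_end <= pad_end;

// These two properties do not depend on the alignment of double.
static_assert(sizeof(my_union) >= sizeof(my_struct_2), "union holds its largest member");
static_assert(sizeof(my_struct_2) > sizeof(my_struct_1), "my_struct_2 is the larger member");
```
If `b_in_padding` is true for the target, the second float of `s1` lives in bytes that no field of `my_struct_2` describes.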
```sh
# clang 16 release
clang++-16 -O1 example.cpp && ./a.out
b ooops
# clang main 5caae72d1a4f
clang++ -O1 example.cpp && ./a.out
b ooops
```
The problem is that C++ unions are represented at the IR level as a struct wrapping the member with the largest size:
```
%union.my_union = type { %struct.my_struct_2 }
%struct.my_struct_2 = type { float, double }
%struct.my_struct_1 = type { float, float }
```
Information about the layout of `my_struct_1` is lost when the `llvm::StructType` for the union is constructed. The SROA pass operates on the `my_struct_2` layout only.
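To illustrate what "lost" means here, below is a toy model, not LLVM code and not how SROA is implemented. The `field` type, `covered_by_fields`, and the hard-coded offsets are assumptions for illustration; the offsets assume the x86-64 layout, where the `double` starts at byte 8:
```c++
#include <cstddef>
#include <vector>

// One scalar field of a struct type: byte offset and size in bytes.
struct field { std::size_t offset; std::size_t size; };

// Returns true when every byte in [begin, end) belongs to some field.
bool covered_by_fields(const std::vector<field>& fields,
                       std::size_t begin, std::size_t end) {
    for (std::size_t byte = begin; byte < end; ++byte) {
        bool hit = false;
        for (const field& f : fields)
            if (f.offset <= byte && byte < f.offset + f.size) hit = true;
        if (!hit) return false;
    }
    return true;
}

// Hypothetical field lists mirroring the IR types above (x86-64 offsets assumed).
const std::vector<field> struct_2_fields = {{0, 4}, {8, 8}};  // { float, double }
const std::vector<field> struct_1_fields = {{0, 4}, {4, 4}};  // { float, float }
```
In this model, `covered_by_fields(struct_1_fields, 4, 8)` is true while `covered_by_fields(struct_2_fields, 4, 8)` is false: a view of the union that only knows the `my_struct_2` field list has no field that accounts for the bytes holding `b`.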
`my_func` IR before SROA:
```
*** IR Dump After SimplifyCFGPass on _Z7my_funcv ***
; Function Attrs: mustprogress nounwind uwtable
define dso_local { float, double } @_Z7my_funcv() #0 {
entry:
%retval = alloca %union.my_union, align 8
%ref.tmp = alloca %struct.my_struct_1, align 4
call void @llvm.lifetime.start.p0(i64 8, ptr %ref.tmp) #5
%a = getelementptr inbounds %struct.my_struct_1, ptr %ref.tmp, i32 0, i32 0
store float 1.000000e+02, ptr %a, align 4, !tbaa !5
%b = getelementptr inbounds %struct.my_struct_1, ptr %ref.tmp, i32 0, i32 1
store float 2.000000e+02, ptr %b, align 4, !tbaa !10
call void @llvm.memcpy.p0.p0.i64(ptr align 8 %retval, ptr align 4 %ref.tmp, i64 8, i1 false), !tbaa.struct !11
call void @llvm.lifetime.end.p0(i64 8, ptr %ref.tmp) #5
%coerce.dive = getelementptr inbounds %union.my_union, ptr %retval, i32 0, i32 0
%0 = load { float, double }, ptr %coerce.dive, align 8
ret { float, double } %0
}
```
`my_func` IR after SROA:
```
*** IR Dump After SROAPass on _Z7my_funcv ***
; Function Attrs: mustprogress nounwind uwtable
define dso_local { float, double } @_Z7my_funcv() #0 {
entry:
%.fca.0.insert = insertvalue { float, double } poison, float 1.000000e+02, 0
%.fca.1.insert = insertvalue { float, double } %.fca.0.insert, double undef, 1
ret { float, double } %.fca.1.insert
}
```
As you can see, the `200.f` value is lost and an `undef` value is inserted instead.
On the other hand, the IR before SROA also looks suspicious, because 32 bits of the double value are undef, which makes the whole double undef.
`gcc` works well on this example.
Possibly related discussion:
https://discourse.llvm.org/t/struct-copy/11330
Possibly related issue:
https://github.com/llvm/llvm-project/issues/53710
</pre>