<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/57693>57693</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Bug in the ICF in lld
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
mraleph
</td>
</tr>
</table>
<pre>
`lld` appears to apply ICF (identical code folding) to two functions while ignoring their relocations. For example:
```cpp
#include <cstdio>

struct B {
  virtual const char* getStr() {
    return "one string";
  }
};

struct C : public B {
  virtual const char* getStr() {
    return "another string";
  }
};

[[clang::noinline]] static int getInt() {
  int value;
  asm("mov $0x7, %0" : "=r" (value));
  return value;
}

int main(int argc, char* argv[]) {
  B* b = argc == 2 ? new C() : new B();
  printf("%s %d\n", b->getStr(), getInt());
  return 0;
}
```
When compiled like so:
```console
$ clang++ -O3 -o /tmp/test -m32 -ffunction-sections -fdata-sections -Wl,--icf=all /tmp/test.cc
```
with a specific version of `clang`:
```
Fuchsia clang version 15.0.0 (https://llvm.googlesource.com/a/llvm-project c2592c374e469f343ecea82d6728609650924259)
Target: x86_64-unknown-linux-gnu
```
The resulting binary segfaults because `B::getStr` is folded with `getInt`, so `printf` tries to print `0x7` as a string:
```
* thread #1, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0x7)
frame #0: 0xf7d4fb76 libc.so.6`___lldb_unnamed_symbol3153 + 38
libc.so.6`___lldb_unnamed_symbol3153:
-> 0xf7d4fb76 <+38>: cmp byte ptr [eax], dh
0xf7d4fb78 <+40>: je 0xf7d4fc06 ; <+182>
0xf7d4fb7e <+46>: inc eax
0xf7d4fb7f <+47>: xor edx, edx
(lldb) bt
* thread #1, name = 'test', stop reason = signal SIGSEGV: invalid address (fault address: 0x7)
* frame #0: 0xf7d4fb76 libc.so.6`___lldb_unnamed_symbol3153 + 38
frame #1: 0xf7d12f39 libc.so.6`___lldb_unnamed_symbol2903 + 7145
frame #2: 0xf7d00f35 libc.so.6`_IO_printf + 37
frame #3: 0x0040ba99 test`main + 73
frame #4: 0xf7ccb905 libc.so.6`__libc_start_main + 229
frame #5: 0x0040b922 test`_start + 50
```
The constant `0x7` in `getInt` is chosen so that its machine code matches the bytes of `B::getStr`. If the program does not crash for you, adjust the constant to match the immediate shown in the disassembly of `B::getStr`:
```console
$ buildtools/linux-x64/clang/bin/clang++ -O3 -c -o /tmp/test.o -m32 -ffunction-sections -fdata-sections /tmp/test.cc
$ objdump -rD /tmp/test.o | grep -A2 \<_ZN1B6getStrEv
00000000 <_ZN1B6getStrEv>:
0: b8 07 00 00 00 mov $0x7,%eax
1: R_386_32 .rodata.str1.1
```
I have not tried debugging it, but from my reading of the code the problem is here:
```cpp
bool ICF<ELFT>::equalsConstant(const InputSection *a, const InputSection *b) {
  // ...
  const RelsOrRelas<ELFT> ra = a->template relsOrRelas<ELFT>();
  const RelsOrRelas<ELFT> rb = b->template relsOrRelas<ELFT>();
  return ra.areRelocsRel() ? constantEq(a, ra.rels, b, rb.rels)
                           : constantEq(a, ra.relas, b, rb.relas);
}
```
Here `ra.areRelocsRel()` is defined as `ra.rels.size()`, so it is false whenever `a` has no REL relocations. If `a` has no relocations at all but `b` has some in `rels`, we end up comparing `ra.relas` with `rb.relas` by mistake. Both are empty, so the comparison succeeds and `a` and `b` get folded. Instead, we should have compared `ra.rels` (empty) against the non-empty `rb.rels`.
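To make the suspected failure mode concrete, here is a small standalone model of the dispatch. It is only a sketch: `Reloc`, `ToyRelocs`, `sameRelocs`, `buggyEquals`, `guardedEquals` and `runDemo` are hypothetical names invented for illustration, not lld's real types or functions, and the guard shown is just one possible way to avoid the mismatch, not necessarily how lld should fix it.
```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a relocation record (hypothetical, not lld's type).
struct Reloc {
  unsigned type;
  unsigned symbol;
};

// Toy stand-in for RelsOrRelas: a section has REL or RELA relocations, or none.
struct ToyRelocs {
  std::vector<Reloc> rels;
  std::vector<Reloc> relas;
  // Same idea as the definition described above: true only if `rels` is non-empty.
  bool areRelocsRel() const { return !rels.empty(); }
};

// Element-wise comparison of two relocation lists.
static bool sameRelocs(const std::vector<Reloc> &x, const std::vector<Reloc> &y) {
  if (x.size() != y.size())
    return false;
  for (std::size_t i = 0; i < x.size(); ++i)
    if (x[i].type != y[i].type || x[i].symbol != y[i].symbol)
      return false;
  return true;
}

// Mirrors the quoted dispatch: only `ra` decides which list gets compared.
bool buggyEquals(const ToyRelocs &ra, const ToyRelocs &rb) {
  return ra.areRelocsRel() ? sameRelocs(ra.rels, rb.rels)
                           : sameRelocs(ra.relas, rb.relas);
}

// One possible guard (an assumption, not lld's fix): compare both lists.
bool guardedEquals(const ToyRelocs &ra, const ToyRelocs &rb) {
  return sameRelocs(ra.rels, rb.rels) && sameRelocs(ra.relas, rb.relas);
}

void runDemo() {
  ToyRelocs getInt;                // like getInt: no relocations at all
  ToyRelocs getStr;                // like B::getStr: one REL relocation
  getStr.rels.push_back({1, 42});  // 1 is R_386_32; symbol index 42 is made up

  assert(buggyEquals(getInt, getStr));     // wrongly reported as equal
  assert(!buggyEquals(getStr, getInt));    // the result depends on argument order
  assert(!guardedEquals(getInt, getStr));  // the guard tells them apart
}
```
In this model the buggy comparison is also asymmetric: it reports a match only when the section without relocations is passed first.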
/cc @MaskRay
</pre>