<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/61402>61402</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
RuntimeDyldELF doesn't clear `GOTOffsetMap` in `finalizeLoad()`, leading to invalid GOT relocations on AArch64
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
gmarkall
</td>
</tr>
</table>
<pre>
# Problem
(This was originally discovered in numba/numba#8738 and is reduced down from there)
`finalizeLoad()` resets `GOTSectionID` and `CurrentGOTIndex` to 0, with the intention of resetting all GOT-related state before the next object is loaded, but it does not clear `GOTOffsetMap`:
https://github.com/llvm/llvm-project/blob/ff11d6b6f6e27f5de389002b8f6102b6cf3ed474/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp#L2409-L2410
This leaves stale entries in the map, which can prevent the creation of a GOT section and/or GOT entries for subsequent objects. If a GOT relocation in a later object happens to match a stale entry when `findOrAllocateGOTEntry()` checks whether an entry already exists:
https://github.com/llvm/llvm-project/blob/ff11d6b6f6e27f5de389002b8f6102b6cf3ed474/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp#L2259-L2261
then no GOT entry is allocated for the new relocation (and no GOT section is allocated at all if, for example, that was the object's only GOT relocation). Because `GOTSectionID` is still 0 at that point, the GOT relocation is resolved to an address in the first section (section 0), which is always invalid.
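
To make the sequence concrete, here is a minimal standalone model of the bookkeeping described above. It is a sketch, not the LLVM code: only the names `GOTSectionID`, `CurrentGOTIndex`, `GOTOffsetMap`, `findOrAllocateGOTEntry` and `finalizeLoad` come from RuntimeDyldELF. The `GOTModel` struct, the `SectionCount` counter, the string keys, the one-section-per-object layout and the `ClearMap` parameter are all assumptions made for illustration. `ClearMap` exists only to contrast the current behaviour with a variant that also empties the map.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical, heavily simplified stand-in for the GOT state in RuntimeDyldELF.
struct GOTModel {
  unsigned SectionCount = 0;    // assumption: section IDs are handed out in load order
  unsigned GOTSectionID = 0;    // 0 doubles as "no GOT section reserved yet"
  unsigned CurrentGOTIndex = 0; // GOT entries allocated for the current object
  std::map<std::string, uint64_t> GOTOffsetMap; // symbol -> offset within the GOT

  // Model an object contributing one ordinary section (e.g. its .text).
  void addSection() { ++SectionCount; }

  // Returns the (section ID, offset) that a GOT relocation against Symbol uses.
  std::pair<unsigned, uint64_t> findOrAllocateGOTEntry(const std::string &Symbol) {
    auto Inserted = GOTOffsetMap.emplace(Symbol, 0);
    if (Inserted.second) {
      // New symbol: reserve a GOT section if needed, then allocate an 8-byte slot.
      if (GOTSectionID == 0)
        GOTSectionID = SectionCount++;
      Inserted.first->second = 8 * static_cast<uint64_t>(CurrentGOTIndex++);
    }
    // Existing symbol: nothing is allocated, the cached offset is reused as-is.
    return {GOTSectionID, Inserted.first->second};
  }

  void finalizeLoad(bool ClearMap) {
    GOTSectionID = 0;
    CurrentGOTIndex = 0;
    if (ClearMap)
      GOTOffsetMap.clear(); // only in the hypothetical "cleared" variant
  }
};

// Load three objects in order; return where the last object's GOT relocation
// against "f1" ends up pointing.
std::pair<unsigned, uint64_t> loadThreeObjects(bool ClearMap) {
  GOTModel M;
  M.addSection();                 // object 1: section 0, no GOT relocations
  M.finalizeLoad(ClearMap);
  M.addSection();                 // object 2: section 1
  M.findOrAllocateGOTEntry("f1"); // reserves section 2 as the GOT, offset 0
  M.finalizeLoad(ClearMap);
  M.addSection();                 // object 3: section 3
  return M.findOrAllocateGOTEntry("f1");
}
```

In this model, `loadThreeObjects(false)` yields section 0, offset 0 (the stale hit), while `loadThreeObjects(true)` yields a freshly reserved GOT section for the third object.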
# Reproducer
This can manifest on AArch64 with the following example:
accept_pointer.c:
```c
void accept_pointer(void (*f)(void)) { f(); }
```
send_pointer1.c:
```c
void accept_pointer(void *p);
void f1();
void send_pointer1() {
accept_pointer((void*)f1);
}
void f1() {}
```
send_pointer2.c:
```c
void accept_pointer(void *p);
void f1();
void send_pointer2() {
accept_pointer((void*)f1);
}
```
All compiled with:
```
gcc -fPIC -c accept_pointer.c
gcc -fPIC -c send_pointer1.c
gcc -fPIC -c send_pointer2.c
```
(`gcc --version` is `gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0` on my system, but I don't think it matters too much for these simple files).
Both `send_pointer1()` and `send_pointer2()` contain GOT relocations against `f1()`:
```
$ objdump -dr send_pointer1.o
send_pointer1.o: file format elf64-littleaarch64
Disassembly of section .text:
0000000000000000 <send_pointer1>:
0: a9bf7bfd stp x29, x30, [sp, #-16]!
4: 910003fd mov x29, sp
8: 90000000 adrp x0, 20 <f1>
8: R_AARCH64_ADR_GOT_PAGE f1
c: f9400000 ldr x0, [x0]
c: R_AARCH64_LD64_GOT_LO12_NC f1
10: 94000000 bl 0 <accept_pointer>
10: R_AARCH64_CALL26 accept_pointer
14: d503201f nop
18: a8c17bfd ldp x29, x30, [sp], #16
1c: d65f03c0 ret
0000000000000020 <f1>:
20: d503201f nop
24: d65f03c0 ret
```
```
$ objdump -dr send_pointer2.o
send_pointer2.o: file format elf64-littleaarch64
Disassembly of section .text:
0000000000000000 <send_pointer2>:
0: a9bf7bfd stp x29, x30, [sp, #-16]!
4: 910003fd mov x29, sp
8: 90000000 adrp x0, 0 <f1>
8: R_AARCH64_ADR_GOT_PAGE f1
c: f9400000 ldr x0, [x0]
c: R_AARCH64_LD64_GOT_LO12_NC f1
10: 94000000 bl 0 <accept_pointer>
10: R_AARCH64_CALL26 accept_pointer
14: d503201f nop
18: a8c17bfd ldp x29, x30, [sp], #16
1c: d65f03c0 ret
```
Loading these objects with `llvm-rtdyld` and entering `send_pointer2` results in a segfault:
```
$ llvm-rtdyld --execute --entry send_pointer2 accept_pointer.o send_pointer1.o send_pointer2.o
loaded 'send_pointer2' at: 0xffffb18ac000
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: /home/gmarkall/numbadev/install-llvm/main-20230310/bin/llvm-rtdyld --execute --entry send_pointer2 c/accept_pointer.o c/send_pointer1.o c/send_pointer2.o
Segmentation fault (core dumped)
```
When I attached with GDB and looked at the `send_pointer2` function, it looked like:
```assembly
ffffafb6a078: a9bf7bfd stp x29, x30, [sp, #-16]!
ffffafb6a07c: 910003fd mov x29, sp
ffffafb6a080: 90000000 adrp x0, 0xffffafb6a000
ffffafb6a084: f9400000 ldr x0, [x0]
ffffafb6a088: 97ffffde bl 0xffffafb6a000
ffffafb6a08c: d503201f nop
ffffafb6a090: a8c17bfd ldp x29, x30, [sp], #16
ffffafb6a094: d65f03c0 ret
ffffafb6a098: d503201f nop
ffffafb6a09c: d65f03c0 ret
```
The address in the `adrp` instruction, `0xffffafb6a000`, is the beginning of the first section, the `.text` section of `accept_pointer.o`, where `accept_pointer()` starts:
```
(gdb) x/2x 0xffffafb6a000
0xffffafb6a000 <accept_pointer>: 0xd10043ff 0xf90007e0
```
i.e. offset 0 in section 0, which was mistaken for an existing GOT entry because of the stale entry in the `GOTOffsetMap`.
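
The patched instructions are consistent with that reading. The sketch below reworks the arithmetic using the AArch64 ELF relocation formulas (`R_AARCH64_ADR_GOT_PAGE` is `Page(G) - Page(P)` and `R_AARCH64_LD64_GOT_LO12_NC` takes the low 12 bits of `G`, where `G` is the GOT entry address and `P` is the address of the instruction being patched). The helper names are made up for illustration, and the addresses are the ones from the GDB session above.

```cpp
#include <cstdint>

// Page(x) in the AArch64 ELF ABI: the address with its low 12 bits cleared.
constexpr uint64_t page(uint64_t Addr) { return Addr & ~UINT64_C(0xfff); }

// R_AARCH64_ADR_GOT_PAGE: Page(GOT entry) - Page(place), patched into adrp.
constexpr int64_t adrGotPageDelta(uint64_t GotEntry, uint64_t Place) {
  return static_cast<int64_t>(page(GotEntry) - page(Place));
}

// R_AARCH64_LD64_GOT_LO12_NC: low 12 bits of the GOT entry address, patched into ldr.
constexpr uint64_t ld64GotLo12(uint64_t GotEntry) { return GotEntry & 0xfff; }

// A little-endian 64-bit load seen as two consecutive 32-bit words.
constexpr uint64_t loadLE64(uint32_t LowWord, uint32_t HighWord) {
  return (static_cast<uint64_t>(HighWord) << 32) | LowWord;
}
```

With the bogus "GOT entry" at `0xffffafb6a000` and the `adrp` at `0xffffafb6a080`, `adrGotPageDelta` and `ld64GotLo12` both come out as 0, which matches the unchanged encodings `90000000` and `f9400000` in the dump. The subsequent `ldr` then reads the first two instruction words of `accept_pointer()` as if they were a pointer: `loadLE64(0xd10043ff, 0xf90007e0)` is `0xf90007e0d10043ff`, which is presumably the value that ends up being called and causes the segfault.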
## Files
I couldn't attach the objects here because GitHub doesn't support attaching them, but the object files can be obtained from https://github.com/gmarkall/numba-issue-8738/tree/main/c for convenience:
- [accept_pointer.o](https://github.com/gmarkall/numba-issue-8738/raw/main/c/accept_pointer.o)
- [send_pointer1.o](https://github.com/gmarkall/numba-issue-8738/raw/main/c/send_pointer1.o)
- [send_pointer2.o](https://github.com/gmarkall/numba-issue-8738/raw/main/c/send_pointer2.o)
# Patch / fix
I think the way to resolve this is to clear the `GOTOffsetMap` in `finalizeLoad()`:
```diff
diff --git a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
index 3c7f4ec47eb8..205ee5273b27 100644
--- a/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
+++ b/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
@@ -2407,6 +2407,7 @@ Error RuntimeDyldELF::finalizeLoad(const ObjectFile &Obj,
}
GOTSectionID = 0;
+ GOTOffsetMap.clear();
CurrentGOTIndex = 0;
return Error::success();
```
Making this change and rebuilding results in:
- The reproducer above running to completion as expected on AArch64.
- All the failing tests identified in numba/numba#8738 running successfully to completion on AArch64.
- All unit and regression tests (`ninja check-llvm` and `ninja check-llvm-unit`) green (no unexpected passes / fails) on AArch64 and x86_64.
I had planned to submit a patch, but I'm not sure how to add a test case for this scenario given that it involves multiple objects. Can you provide any guidance here, please?
</pre>