<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/147963>147963</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Investigating llvm-objcopy --add-symbol with High 64-bit Addresses?
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
maskhero0308
</td>
</tr>
</table>
<pre>
LLVM version: 21.0.0git (optimized build)
Adding a symbol at a very high virtual address (in the 0xFFFFFFFF80xxxxxx range) appears to misbehave in llvm-objcopy 21.0.0git. In testing, llvm-objcopy does not preserve the full 64-bit value: it effectively drops the 0x80000000 bit (bit 31) of the address. For example, a symbol intended at 0xFFFFFFFF80C9F0BC might end up recorded as 0xFFFFFFFF00C9F0BC, and similarly 0xFFFFFFFF8170E2B0 might become 0xFFFFFFFF0170E2B0. In other words, the recorded address is lower than the intended one by exactly 0x8000_0000 (2^31), which points to incorrect handling of the sign bit of a 32-bit value within the 64-bit address. This suggests an internal parsing or casting bug in which addresses with bit 31 set are misinterpreted.
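The arithmetic behind the reported symptom can be checked directly. The Python sketch below only reproduces the numbers quoted in this report; it does not model llvm-objcopy's actual parsing code, and the helper name clear_bit31 is made up for illustration.

```python
BIT31 = 2**31  # 0x80000000, the bit that appears to be lost


def clear_bit31(addr: int) -> int:
    """Return addr with bit 31 cleared (hypothetical helper mirroring the observed symptom)."""
    return addr & ~BIT31


# Intended addresses taken from the report.
for intended in (0xFFFFFFFF80C9F0BC, 0xFFFFFFFF8170E2B0):
    observed = clear_bit31(intended)
    print(f"{intended:#018x} -> {observed:#018x} (difference {intended - observed:#x})")
```

Both examples differ from the intended value by exactly 2^31, which matches a single cleared bit and not a truncation of the whole upper half.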
By contrast, GNU binutils objcopy is generally more reliable for injecting 64-bit addresses, but it has its own quirk. GNU objcopy will accept a hex value like 0xFFFFFFFF80C9F0BC, but internally it may parse the constant as a signed 64-bit number. If the value exceeds 0x7FFFFFFFFFFFFFFF (the maximum signed 64-bit value, 2^63 - 1), GNU's parser saturates it to 0x7FFFFFFFFFFFFFFF. In practice, 0xFFFFFFFF80000000, which lies well above that limit, gets clamped to 0x7FFFFFFFFFFFFFFF. Thus GNU objcopy might not report an error, yet it can silently produce an incorrect symbol value (the maximum positive signed 64-bit value) for addresses in the high-half kernel range. This is less obvious but equally problematic. In summary, neither tool appears to handle an absolute 64-bit address above 2^63 properly by default: LLVM drops the 0x8000_0000 bit, and GNU (at least in recent versions) saturates to the maximum signed value.
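The saturation described above is what C's strtoll does on overflow (it returns LLONG_MAX and sets errno to ERANGE). The sketch below emulates that clamping in Python next to a plain unsigned parse. Whether objcopy really goes through such a routine is this report's hypothesis and is not verified here; both function names are illustrative.

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)
UINT64_MAX = 2**64 - 1


def parse_like_strtoll(text: str) -> int:
    """Parse an integer literal and clamp it to the signed 64-bit range, as strtoll does."""
    value = int(text, 0)  # base 0 accepts 0x-prefixed hex
    return max(INT64_MIN, min(INT64_MAX, value))


def parse_unsigned64(text: str) -> int:
    """Parse an integer literal as an unsigned 64-bit value, rejecting out-of-range input."""
    value = int(text, 0)
    if value > UINT64_MAX or 0 > value:
        raise ValueError(f"{text} does not fit in 64 bits")
    return value


addr = "0xFFFFFFFF80000000"
print(f"signed parse:   {parse_like_strtoll(addr):#x}")
print(f"unsigned parse: {parse_unsigned64(addr):#x}")
```

Only the unsigned parse keeps the high-half address intact; the signed parse collapses every value above 2^63 - 1 to the same number.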
This issue with --add-symbol is not widely documented in official LLVM release notes or bug trackers. There is no explicit mention in the LLVM 21.x changelogs of fixing high-address symbol handling, and a search of LLVM's issue tracker did not reveal an open bug specifically for 64-bit address truncation. However, the odd behavior has been observed by users. For example, one Stack Overflow report shows a symbol added by llvm-objcopy ending up with an SHN_ABS (absolute) section index and the wrong value, whereas GNU objcopy correctly placed the symbol in the section with the right address. That was a different scenario (combining --rename-section and --add-symbol), but it shows that LLVM's implementation of --add-symbol can handle symbol addresses differently from GNU's, and sometimes incorrectly.
Internally, there are hints that LLVM's tools treat certain addresses as "sign-extended 32-bit". For other output formats (e.g. Intel HEX), the LLVM code explicitly checks for addresses above 0xFFFFFFFF and accepts those whose high 32 bits are all 1s (i.e. sign extensions of negative 32-bit addresses). This implies LLVM is aware that addresses like 0xFFFFFFFF80000000 are tricky. Unfortunately, in the ELF --add-symbol path, the logic does not appear to preserve the full 64-bit value. The symptom, a cleared 0x80000000 bit, suggests that the tool converts the address through a 32-bit signed type or otherwise zeroes that bit.
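To make the "sign-extended 32-bit" idea concrete, here is a minimal Python sketch of such a check. It illustrates the concept only and is not a copy of the LLVM source; the function name is an assumption.

```python
def fits_32_or_sign_extended(addr: int) -> bool:
    """True if addr fits in 32 bits or is a negative 32-bit value sign-extended to 64 bits."""
    if 0xFFFFFFFF >= addr:
        return True
    high_all_ones = (addr >> 32) == 0xFFFFFFFF
    bit31_set = ((addr >> 31) & 1) == 1
    return high_all_ones and bit31_set


for addr in (0xFFFFFFFF80C9F0BC, 0xFFFFFFFF00C9F0BC):
    print(f"{addr:#018x}: {fits_32_or_sign_extended(addr)}")
```

Under this definition the intended address 0xFFFFFFFF80C9F0BC is a valid sign extension, while the value reported after --add-symbol (0xFFFFFFFF00C9F0BC) is not, since its upper 32 bits are all ones but bit 31 is clear.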
Comparison to GNU objcopy and Workarounds
GNU objcopy tends to follow the documented behavior: if you specify a section and an address, it treats the number as a section-relative offset (and, in an executable, adds the section's base address to it). If you specify no section, it creates an absolute symbol with the value as given. In practice, to avoid parsing issues, you can use the same form with LLVM's tool: provide the address as a section-relative offset instead of a raw absolute value. For example, if you know the symbol is in .text and the section's virtual address starts at 0xFFFFFFFF80000000, you can pass:
--add-symbol myfunc=.text:0xC9F0BC,function
instead of myfunc=0xFFFFFFFF80C9F0BC. This way, the value 0xC9F0BC is well within the 32-bit range. In testing, this produces a correct symbol at 0xFFFFFFFF80C9F0BC in the output (the tool adds the section base) and avoids the truncation. In essence, section:offset notation works around the bug by keeping the numeric value small and letting the tool compute the full address. The section name must exist in the ELF file, and the offset should fall within that section's size.
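The offset computation can be scripted so that the absolute address never reaches the command line. The following is a sketch under the assumptions stated above (symbol in .text, section base 0xFFFFFFFF80000000); the helper name and its parameters are illustrative, and the section base has to be read from the actual ELF file.

```python
def add_symbol_arg(name: str, address: int, section: str, section_base: int,
                   flags: str = "function") -> str:
    """Build the name=section:offset,flags value for --add-symbol from an absolute address."""
    offset = address - section_base
    if 0 > offset:
        raise ValueError("address lies below the section base")
    return f"{name}={section}:0x{offset:X},{flags}"


arg = add_symbol_arg("myfunc", 0xFFFFFFFF80C9F0BC, ".text", 0xFFFFFFFF80000000)
print("--add-symbol", arg)
```

The helper does not check the offset against the section size, so that bound still has to be verified separately.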
If LLVM's current version is giving you trouble, the safest course is to use GNU binutils objcopy for this task. GNU objcopy has long been used to embed symbols at high addresses (e.g. kernel symbols) and is less likely to mishandle typical kernel address ranges. Do double-check the symbol values afterward, since, as noted above, very large values might still be parsed in a signed context by binutils, depending on the version. Addresses in the 0xFFFFFFFF80xxxxxx range (sign-extended negative 32-bit values) are common for 64-bit kernels, and GNU objcopy is expected to support them. No specific bug in GNU objcopy is known for those ranges in released versions; the parsing quirk only shows up at the absolute 64-bit limits.
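Checking the symbol value afterward is easy to automate. The sketch below parses the default three-column output of nm (value, type, name) and looks up one symbol. The sample text is made up for illustration and is not output from a real run.

```python
from typing import Optional


def symbol_value_from_nm(nm_output: str, name: str) -> Optional[int]:
    """Return the value of the named symbol from default nm output, or None if it is absent."""
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols have three columns: hex value, type letter, name.
        if len(parts) == 3 and parts[2] == name:
            return int(parts[0], 16)
    return None


# Hypothetical nm output, showing the truncated value described in this report.
sample = "ffffffff80000000 T _start\nffffffff00c9f0bc T myfunc\n"
expected = 0xFFFFFFFF80C9F0BC
actual = symbol_value_from_nm(sample, "myfunc")
print("ok" if actual == expected else f"mismatch: got {actual:#x}, expected {expected:#x}")
```

In a build script the same comparison can fail the build when the recorded value differs from the intended address, whichever objcopy produced the file.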
Downgrading LLVM is not known to help here: llvm-objcopy had limited ELF maturity in early versions, and nothing indicates that, say, LLVM 12 or 13 handled this better. The issue seems to lie in the fundamental parsing of --add-symbol values, which likely hasn't changed (the Stack Overflow report dates from the LLVM 16 or 17 timeframe, and 21.0.0git still exhibits it). Unless LLVM's issue tracker mentions a specific patch addressing 64-bit symbol addresses, switching to an older LLVM probably won't solve it.
In summary, llvm-objcopy 21.0.0git does not correctly support adding symbols at very high addresses: it truncates or misinterprets those addresses due to a bug. There is no official documentation of this limitation aside from user bug reports. Until this is fixed in LLVM, you have a few options for reliability:
1. Use GNU objcopy for adding the symbols. It more faithfully handles ELF64 symbol addresses (ensuring the symbol ends up with the correct 0xFFFFFFFF8... value). This is the simplest workaround if GNU binutils is available in your build environment.
2. Use section-relative offsets with llvm-objcopy's --add-symbol. Specifying the symbol's section and an offset (the address minus the section's start) avoids giving LLVM a huge integer to parse, and in tests this yields the correct address in the output.
3. If neither of the above is viable and you must use LLVM tools, consider generating the symbol via the linker or assembler (for example, by linking a tiny object that defines the symbol at that address) rather than using --add-symbol. This bypasses objcopy entirely.
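For the assembler route, one possible shape of the "tiny object" is a source file that defines the symbol as an absolute value with the .globl and .set directives. The Python sketch below only generates that text; the function name is hypothetical, the result has not been tested against the case in this report, and a symbol defined this way is absolute instead of being tied to a section.

```python
def abs_symbol_asm(name: str, address: int) -> str:
    """Return assembler source that defines name as a global absolute symbol."""
    if address > 2**64 - 1 or 0 > address:
        raise ValueError("address does not fit in 64 bits")
    return f"    .globl {name}\n    .set {name}, 0x{address:X}\n"


print(abs_symbol_asm("myfunc", 0xFFFFFFFF80C9F0BC), end="")
```

The generated file would then be assembled and linked into the final image like any other object.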
Unless and until LLVM’s developers address this in llvm-objcopy, the most reliable solution is to use GNU binutils objcopy for high-address symbols or ensure the value passed doesn’t trigger the bug. In critical use (like kernel symbol tables), sticking with GNU objcopy or the linker is advised to guarantee the 64-bit addresses are preserved accurately.
</pre>