<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/108946>108946</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
llvm-objcopy does not seem to handle `-O binary` correctly, adds PE header to the output
</td>
</tr>
<tr>
<th>Labels</th>
<td>
tools:llvm-objcopy/strip
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
mgorny
</td>
</tr>
</table>
<pre>
In Gentoo, we're using `objcopy` to extract the Linux kernel image from a UKI image.
Reproducer:
```
wget https://distfiles.gentoo.org/distfiles/1d/gentoo-kernel-6.10.9-1.amd64.gpkg.tar
tar -xf gentoo-kernel-6.10.9-1.amd64.gpkg.tar gentoo-kernel-6.10.9-1/image.tar.xz
tar -xf gentoo-kernel-6.10.9-1/image.tar.xz image/usr/src/linux-6.10.9-gentoo-dist/arch/x86/boot/uki.efi
objcopy -O binary -j.linux image/usr/src/linux-6.10.9-gentoo-dist/arch/x86/boot/uki.efi bzImage
```
Comparing the files created by GNU objcopy and LLVM objcopy:
```
-rwxr-xr-x 1 mgorny mgorny 19606512 09-17 11:15 bzImage.gnu
-rwxr-xr-x 1 mgorny mgorny 19607040 09-17 11:15 bzImage.llvm
```
The LLVM file has an additional 512 bytes at the front:
```
00000000 4d 5a 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |MZ..............|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000030 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00 |............@...|
00000040 50 45 00 00 64 86 01 00 de 9a d0 66 00 00 00 00 |PE..d......f....|
00000050 00 00 00 00 f0 00 2e 02 0b 02 00 00 ae 87 00 00 |................|
00000060 00 00 00 00 00 00 00 00 e0 96 00 00 00 10 00 00 |................|
00000070 00 00 f9 4d 01 00 00 00 00 10 00 00 00 02 00 00 |...M............|
00000080 00 00 00 00 00 01 05 00 01 00 01 00 00 00 00 00 |................|
00000090 00 c0 dd 05 00 02 00 00 00 00 00 00 0a 00 60 01 |..............`.|
000000a0 00 00 10 00 00 00 00 00 00 10 00 00 00 00 00 00 |................|
*
000000c0 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00 |................|
000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000e0 00 00 00 00 00 00 00 00 00 28 dd 05 f0 09 00 00 |.........(......|
000000f0 00 10 01 00 84 00 00 00 00 00 00 00 00 00 00 00 |................|
00000100 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000140 00 00 00 00 00 00 00 00 2e 6c 69 6e 75 78 00 00 |.........linux..|
00000150 f0 2b 2b 01 00 90 b2 04 00 2c 2b 01 00 02 00 00 |.++......,+.....|
00000160 00 00 00 00 00 00 00 00 00 00 00 00 20 00 00 40 |............ ..@|
00000170 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
```
And 16 bytes (of padding?) at the end:
```
012b2df0 cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc |................|
```
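The size difference matches the `.linux` section header visible at offset 0x148 in the dump above (0x40 for the start of the PE signature, plus 4 signature bytes, a 20-byte COFF header, and the 0xf0-byte optional header). A minimal Python sketch that decodes those 40 bytes, assuming the bytes were copied correctly by hand from the dump and using the standard PE/COFF section header layout; the helper names are only illustrative:

```python
# Bytes 0x148..0x16f of bzImage.llvm, copied by hand from the hex dump.
SECTION_HEADER = bytes.fromhex(
    "2e6c696e75780000"          # Name
    "f02b2b01"                  # VirtualSize
    "0090b204"                  # VirtualAddress
    "002c2b01"                  # SizeOfRawData
    "00020000"                  # PointerToRawData
    "000000000000000000000000"  # relocation / line-number fields
    "20000040"                  # Characteristics
)


def u32(raw, offset):
    """Read a little-endian 32-bit unsigned integer."""
    return int.from_bytes(raw[offset:offset + 4], "little")


def parse_section_header(raw):
    """Decode the size/offset fields of a 40-byte PE/COFF section header."""
    return {
        "name": raw[:8].rstrip(b"\x00").decode("ascii"),
        "virtual_size": u32(raw, 8),
        "virtual_address": u32(raw, 12),
        "size_of_raw_data": u32(raw, 16),
        "pointer_to_raw_data": u32(raw, 20),
    }


hdr = parse_section_header(SECTION_HEADER)
print(hdr["name"], hdr["virtual_size"], hdr["size_of_raw_data"], hdr["pointer_to_raw_data"])
```

`PointerToRawData + SizeOfRawData` comes to 19607040, the size of `bzImage.llvm`, while `VirtualSize` is 19606512, the size of `bzImage.gnu`. So the trailing 16 bytes look like the section's raw data being padded up to the 0x200 file alignment, not something added on top.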
In fact, if I run GNU objcopy without `-O binary`, I get roughly the same format as LLVM gives. This leads me to conclude that LLVM objcopy does not implement `-O binary` correctly for this input and instead writes PE output, the same format as the original file.
```
$ file bzImage.*
bzImage.gnu: Linux kernel x86 boot executable bzImage, version 6.10.9-gentoo-dist (root@devbox) #1 SMP PREEMPT_DYNAMIC Sun Sep 8 11:45:05 -00 2024, RO-rootFS, swap_dev 0X12, Normal VGA
bzImage.gnu-without-O-binary: PE32+ executable (EFI application) x86-64 (stripped to external PDB), for MS Windows
bzImage.llvm: PE32+ executable (EFI application) x86-64 (stripped to external PDB), for MS Windows
```
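One way to test that conclusion is to cut the section payload out of the LLVM output by hand and compare it with the GNU output. A minimal sketch, assuming the payload starts at file offset 0x200 and is 0x012b2bf0 bytes long (both values read off the `.linux` section header at 0x148 in the hex dump above); `section_payload` is a hypothetical helper, not part of any tool:

```python
def section_payload(pe_bytes, raw_ptr=0x200, virtual_size=0x012B2BF0):
    """Return the bytes a raw-binary dump of the single section would hold.

    The defaults are assumptions taken from the hex dump in this report.
    """
    if pe_bytes[:2] != b"MZ":
        raise ValueError("input does not start with the MZ magic")
    return pe_bytes[raw_ptr:raw_ptr + virtual_size]


def same_payload(llvm_path, gnu_path):
    """Compare the sliced LLVM output with the GNU output byte for byte."""
    with open(llvm_path, "rb") as f:
        llvm = f.read()
    with open(gnu_path, "rb") as f:
        gnu = f.read()
    return section_payload(llvm) == gnu
```

If the conclusion holds, `same_payload("bzImage.llvm", "bzImage.gnu")` should return `True`; a `False` result would mean the two files differ in more than the header and the padding.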
</pre>