[llvm-dev] LLD support for ld64 mach-o linker synthesised symbols

Michael Clark via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 7 17:08:50 PDT 2017

> On 8 Jun 2017, at 10:39 AM, Michael Clark <michaeljclark at mac.com> wrote:
> In any case, I have the Mach-O linker invoking now. I will do some testing…

So the linker synthesised symbols don’t exist in the current Mach-O LLD linker. They are currently used in Apple code, so they’ll eventually be necessary. They are also critical for solving problems like finding the mach headers from early CRT code if you are not using dyld, because of the kernel start protocol, i.e. they are necessary to support static PIE + ASLR, which I have got working, btw.

- segment$start$__SEGMENT
- segment$end$__SEGMENT
- section$start$__SEGMENT$__section
- section$end$__SEGMENT$__section
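
To make the naming scheme above concrete, here is a minimal sketch in Python (not LLD code) of how a resolver might recognise these names. The function and type names are made up for illustration, and the assumption that segment and section names never contain '$' is mine.

```python
import re
from typing import NamedTuple, Optional


class SynthSymbol(NamedTuple):
    """Decoded form of a linker synthesised symbol name (hypothetical type)."""
    kind: str               # "segment" or "section"
    boundary: str           # "start" or "end"
    segment: str            # e.g. "__TEXT"
    section: Optional[str]  # e.g. "__text", or None for segment symbols


# Assumption: segment and section names contain no '$' characters.
_PATTERN = re.compile(r"^(segment|section)\$(start|end)\$([^$]+)(?:\$([^$]+))?$")


def parse_synthesised(name: str) -> Optional[SynthSymbol]:
    """Return the decoded symbol, or None if the name is not one of the four forms."""
    m = _PATTERN.match(name)
    if m is None:
        return None
    kind, boundary, segment, section = m.groups()
    # segment$... takes exactly one name, section$... takes exactly two.
    if (kind == "segment") != (section is None):
        return None
    return SynthSymbol(kind, boundary, segment, section)


if __name__ == "__main__":
    for n in ("segment$start$__TEXT", "section$end$__DATA$__data", "_main"):
        print(n, "->", parse_synthesised(n))
```

Anything that does not match (an ordinary symbol such as _main) returns None and would fall through to normal resolution.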

I can see a problem with the design of the Mach-O linker.

Based on the design it appears that the Resolver is supposedly generic code but it is in fact practically just the Mach-O resolver.

We would need the context of the Mach-O linked file to synthesise the Mach-O specific symbols inside of Resolver. In a generic or abstract design we could create an interface, implemented by the Mach-O driver, that lets it register with the Resolver and resolve these synthesised symbols for the “generic” core. However, if this is really just the Mach-O resolver, then it would be substantially simpler to give the resolver the context of the Mach-O file so that Mach-O-isms could be used directly. It seems this was the direction of the ELF and PE/COFF linkers. It might make things much simpler to remove generic layers that hinder object file specific quirks. From reading about the ELF and PE/COFF linkers, it seems this allows them to optimise more easily for their target. Generic abstractions likely get in the way.
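
For illustration only, the "generic" option could look roughly like the Python sketch below: the core resolver knows nothing about Mach-O and only calls back into whatever the driver registered. All class and method names here are hypothetical and are not LLD's actual Resolver API; only segment symbols are handled to keep it short.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A synthesiser maps a symbol name to an address, or None if it does not apply.
Synthesiser = Callable[[str], Optional[int]]


class Resolver:
    """Stand-in for a format-agnostic resolver (hypothetical, not LLD's class)."""

    def __init__(self) -> None:
        self._defined: Dict[str, int] = {}
        self._synthesisers: List[Synthesiser] = []

    def define(self, name: str, address: int) -> None:
        self._defined[name] = address

    def register_synthesiser(self, fn: Synthesiser) -> None:
        self._synthesisers.append(fn)

    def resolve(self, name: str) -> Optional[int]:
        # Ordinary definitions win; synthesised symbols are a fallback.
        if name in self._defined:
            return self._defined[name]
        for fn in self._synthesisers:
            address = fn(name)
            if address is not None:
                return address
        return None


class MachOLayout:
    """Stand-in for the Mach-O side: segment name -> (start, end) addresses."""

    def __init__(self, segments: Dict[str, Tuple[int, int]]) -> None:
        self.segments = segments

    def synthesise(self, name: str) -> Optional[int]:
        parts = name.split("$")
        if len(parts) != 3 or parts[0] != "segment" or parts[1] not in ("start", "end"):
            return None
        bounds = self.segments.get(parts[2])
        if bounds is None:
            return None
        return bounds[0] if parts[1] == "start" else bounds[1]


if __name__ == "__main__":
    resolver = Resolver()
    layout = MachOLayout({"__TEXT": (0x1000, 0x3000)})
    resolver.register_synthesiser(layout.synthesise)
    print(hex(resolver.resolve("segment$start$__TEXT")))
```

The simpler alternative is to drop the registration step and let the resolver hold the Mach-O layout directly, which is the trade-off discussed above.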

These are the issues I found:

- unimplemented pagezero option argument e.g. -pagezero_size 0x1000
- unimplemented segaddr option argument e.g. -segaddr __TEXT 0x7ffe00000000
- looking for dyld_stub_binder when static linking
- unimplemented synthesised symbols segment$(start|end)$__SEGMENT
- unimplemented synthesised symbols section$(start|end)$__SEGMENT$__section

I can work around the inability to set explicit segment addresses by using -image_base. I had used explicit segment addresses due to a bug with -image_base in ld64, where the -pagezero_size value is erroneously added to the image base, i.e. -pagezero_size 0x1000 -image_base 0x7ffe00000000 would give 0x7ffe00000000 to __PAGEZERO and 0x7ffe00001000 to __TEXT. This seems to be related to the issue where the zero page is incorrectly assigned the virtual address of the image base instead of 0, which my post-link tool addresses. I would have tried to fix ld64 but the upstream sources would not build; they likely depend on some internal Apple build setup. So this is how I solved it:

- https://gist.github.com/michaeljclark/0a805652ec4be987a782afb902f06a99
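
The address arithmetic behind the -image_base problem described above can be written out as a small Python sketch. It only models the numbers quoted in this mail; it is not taken from ld64 or from the gist, and the function names are made up.

```python
from typing import Dict


def observed_layout(image_base: int, pagezero_size: int) -> Dict[str, int]:
    """Segment addresses as described in this mail for ld64 with both options set."""
    return {
        "__PAGEZERO": image_base,             # zero page lands at the image base
        "__TEXT": image_base + pagezero_size, # __TEXT is pushed up by the zero page size
    }


def intended_layout(image_base: int, pagezero_size: int) -> Dict[str, int]:
    """Segment addresses one would expect: zero page at 0, __TEXT at the image base."""
    # pagezero_size only sets the size of __PAGEZERO here, not any address.
    return {"__PAGEZERO": 0, "__TEXT": image_base}


if __name__ == "__main__":
    base, pz = 0x7FFE00000000, 0x1000
    for label, layout in (("observed", observed_layout(base, pz)),
                          ("intended", intended_layout(base, pz))):
        print(label, {seg: hex(addr) for seg, addr in layout.items()})
```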

In any case, a working linker is a good starting point…