Trying to understand some relocation handling on Mach-O X86-64

Rafael Espíndola rafael.espindola at gmail.com
Sun Feb 9 08:46:50 PST 2014


>> What is the expected meaning with Mach-O? That the relocation always
>> points 2 bytes after the start of a given string (for L1+2)?
> Yes. The user wants the end of “f” - not the start of “b”.

Awesome, same as ELF.
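
To make the L1+2 case concrete, here is a minimal Python sketch of "address of symbol plus addend". The layouts, the label names L1/L2, and the `resolve` helper are hypothetical, chosen only for illustration; this is not how MC or the linker actually represents relocations.

```python
# Hypothetical model: a relocation is (symbol, addend), and the final address
# is the symbol's address after layout plus the addend.

def resolve(symbol_addresses, symbol, addend):
    """Return the address a symbol-based relocation refers to."""
    return symbol_addresses[symbol] + addend

# Layout as assembled: "f\0" at offset 0 (L1), "b\0" at offset 2 (L2).
assembled = {"L1": 0, "L2": 2}

# Layout after the linker moved the two strings (assumed for illustration).
relinked = {"L2": 0, "L1": 16}

# L1+2 is the end of "f\0" in both layouts.
before = resolve(assembled, "L1", 2)  # 2
after = resolve(relinked, "L1", 2)    # 18

# A relocation that only recorded section offset 2 would be indistinguishable
# from "start of b" and would follow "b" to address 0 after relinking.
```

The point of the sketch is only that keeping L1 as a symbol lets the result track "f" rather than whatever happened to sit 2 bytes later in the original layout.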

>> Now, this is in no way specific to C strings. In the attached test2.s
>> the pointer in D always ends up pointing 4 bytes past the number 42.
>>
>> Why then the special case for C strings on Mach-O? Couldn't we use the
>> same logic as ELF?
> Are all local labels kept if they are referenced?  Or just ones used with an addend?
> If all are kept, then every function would have tons of local labels in the .o file
> for the target of every branch internal to the function.

No, only those used by relocations are kept. If it is fully resolved
(like internal branches in a function), then they are not needed and
are not output.
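
As a rough sketch of that rule (not MC's actual code; the data shapes and the `labels_to_emit` name are made up for illustration), the decision amounts to filtering local labels by whether any remaining relocation mentions them:

```python
# Hypothetical model: labels are names, relocations are dicts that name the
# symbol they are relative to. Fully resolved references leave no relocation.

def labels_to_emit(labels, relocations):
    """Keep only the local labels that some relocation still refers to."""
    referenced = {reloc["symbol"] for reloc in relocations}
    return [name for name in labels if name in referenced]

labels = ["Lloop", "Lexit", "L1"]

# Branches to Lloop and Lexit were resolved at assembly time, so the only
# relocation left is the data reference L1+2.
relocations = [{"symbol": "L1", "addend": 2}]

kept = labels_to_emit(labels, relocations)  # ["L1"]
```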

> The constraint is that we need a symbol based relocation for anything with
> an addend.  Most sections already have symbols that can be used.  But
> some sections are made up of unnamed entities, so naturally have no
> symbol name, so the compiler invents some local name.
>
> The cctools ‘as’ tool hack was to just preserve all ‘L’ labels in c-string sections.
> If MC already has a way to determine which ‘L’ labels are referenced using
> an addend and just those can be preserved into the .o file, that would work too.

Cool. I will give that a try once PR18743 is fixed.

>> BTW, in PR18743, is there a section we could put the constant and use
>> an L prefix or do all sections with an unknown datatype get atomized
>> using symbols?
> There are a few sections that the linker implicitly knows how to atomize
> (e.g. __literal8 (all 8-bytes), __cstring (zero terminated), __eh_frame (CFI chunks), etc).
> All other sections are atomized at (non-L) symbols.
>
> So the trick here is knowing that when an unnamed value gets put into a
> section atomized by symbols, that the unnamed value needs an ‘l’ symbol.

Thanks. I have just emailed a patch implementing exactly that.
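
For reference, my understanding of the atomization rules Nick describes, as a small Python sketch. The function names and the sample data are hypothetical, and real linker behavior has more cases (__eh_frame is not modeled here); this only shows the three splitting strategies side by side.

```python
def atomize_cstrings(data):
    """Split a __cstring-like section after every zero terminator."""
    atoms, start = [], 0
    for i, byte in enumerate(data):
        if byte == 0:
            atoms.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        atoms.append(data[start:])  # unterminated tail, kept as-is
    return atoms

def atomize_literal8(data):
    """Split a __literal8-like section into fixed 8-byte chunks."""
    return [data[i:i + 8] for i in range(0, len(data), 8)]

def atomize_at_symbols(data, symbols):
    """Split any other section at symbols whose names do not start with 'L'."""
    cuts = sorted((off, name) for name, off in symbols.items()
                  if not name.startswith("L"))
    atoms = {}
    for idx, (off, name) in enumerate(cuts):
        end = cuts[idx + 1][0] if idx + 1 < len(cuts) else len(data)
        atoms[name] = data[off:end]
    return atoms

strings = atomize_cstrings(b"f\0b\0")
literals = atomize_literal8(bytes(16))

# An 'L' label at offset 4 does not start a new atom; an 'l' symbol at 8 does.
section = bytes(range(12))
atoms = atomize_at_symbols(section, {"_a": 0, "L1": 4, "l_c": 8})
```

In the last call the value at offset 4 is folded into the `_a` atom because its label starts with 'L', while the value at offset 8 gets its own atom because it carries an 'l' symbol.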

> -Nick

Cheers,
Rafael



