[LLVMdev] [lld] Linker script findings.

Thu Jan 10 18:59:16 PST 2013

On Jan 10, 2013, at 4:09 PM, Sean Silva wrote:

> On Thu, Jan 10, 2013 at 1:55 PM, Nick Kledzik <kledzik at apple.com> wrote:
>> The reality is that some content is tied to the format.  That is, Writers do not
>> just lay down the content of the atoms supplied and then put some file format
>> wrapper around that content.  Some of the content exists because of
>> the file format, for instance GOT and PLT entries.
>> 
>> Now in the case of PLT entries the overall algorithm is similar to what
>> darwin needs for "stubs".  So the abstract algorithm is in a Pass and
>> each Writer supplies additions to that Pass which generate platform/Writer
>> specific atoms as needed.
>> 
>> Perhaps the name "Writer" is causing confusion.  In a previous incarnation
>> we called it "Platform".  But that caused confusion: does "platform"
>> mean the OS or the file format?  And LLVM uses "target" to mean many
>> things too.
>> 
>> My pragmatic approach is that any (non-driver) code that only mach-o/darwin
>> will need should go in lib/ReaderWriter/MachO.  Similarly, any code that
>> is only needed by some platform that uses ELF should go in
>> lib/ReaderWriter/ELF.  Any common processing of atoms should be done
>> in a Pass which has hooks allowing specialization by Writers (aka platforms).
>> 
>> Another way to look at the current design is that WriterELF needs to support
>> every processor/platform ELF output.  Exactly what it does is controlled
>> (configured) by WriterOptionsELF.   The driver's job is to produce the right
>> WriterOptionsELF settings.  For library based linking (no command line)
>> you just write code to instantiate an appropriate WriterOptionsELF.
> 
> Interesting. As you have pointed out, currently the format and the os
> are conflated. Would it make sense to have Reader/Writer literally be
> concerned *only* with turning an atom graph into an output object file
> in a specific format? (each format may have format-specific atoms).
> Then the OS layer (+ architecture?) is implemented in a separate
> library that is a client of Reader/Writer.
This conflation problem is really only an issue with ELF (because it is
so widely used and because ELF was designed as a container format).  
In fact, the "conflation" actually makes MachO and PE/COFF code
structure in lld easier because everything is in one place.  In other words, 
for mach-o I would not know what to put in the "OS" layer vs the "format" layer.

> How do you expect WriterOptionsELF to scale with respect to the number
> of OS's and architectures supported? E.g. for each architecture, will
> we need to add to WriterOptionsELF? For each OS, will we need to add
> to it?
I've imagined that WriterELF would need (at least internally) a big set of
fine-grained options (e.g. which symbol table hash function, add special section
foo, etc).  Then the question is whether to express all the fine-grained options
in the WriterOptionsELF, or instead have high-level settings in WriterOptionsELF
and then have WriterELF translate the high-level settings into the
fine-grained settings.
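
To make the second alternative concrete, here is a minimal sketch of a
writer expanding high-level settings into fine-grained ones.  All of the
names (HighLevelOptionsELF, FineGrainOptionsELF, expand, the section name)
are hypothetical and are not lld's actual API, and the per-OS choices are
made up purely for illustration.  Only the e_machine numbers are the real
ELF values.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not lld's real interfaces.
enum class Machine { X86, X86_64, ARM };
enum class TargetOS { Linux, FreeBSD };
enum class SymbolHashStyle { SysV, Gnu };

// High-level settings that a driver (or a library client) fills in.
struct HighLevelOptionsELF {
  Machine machine = Machine::X86_64;
  TargetOS os = TargetOS::Linux;
};

// Fine-grained settings the writer consults while laying out the file.
struct FineGrainOptionsELF {
  std::uint16_t elfMachine = 0;                      // ELF header e_machine
  SymbolHashStyle hashStyle = SymbolHashStyle::SysV; // symbol table hash
  std::vector<std::string> extraSections;            // "magic" sections
};

// Translate high-level settings into fine-grained ones inside the writer.
FineGrainOptionsELF expand(const HighLevelOptionsELF &high) {
  FineGrainOptionsELF fine;

  // The architecture contributes the "machine" field.
  switch (high.machine) {
  case Machine::X86:    fine.elfMachine = 3;  break; // EM_386
  case Machine::X86_64: fine.elfMachine = 62; break; // EM_X86_64
  case Machine::ARM:    fine.elfMachine = 40; break; // EM_ARM
  }

  // The OS contributes policy.  These choices are assumptions made for
  // the sake of the example, not a statement of what each OS requires.
  switch (high.os) {
  case TargetOS::Linux:
    fine.hashStyle = SymbolHashStyle::Gnu;
    fine.extraSections.push_back(".note.hypothetical");
    break;
  case TargetOS::FreeBSD:
    fine.hashStyle = SymbolHashStyle::SysV;
    break;
  }
  return fine;
}
```

With this shape the driver only ever sets the two high-level fields, and
adding a new OS or architecture means adding a case inside expand() rather
than a new public option.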

I'd like to hear some examples of OS vs. architecture differences needed in ELF.
I (naïvely) think of architectures as just being a "machine" field in the
WriterOptionsELF.  And OS options as just being "add this magic section", or
"put this section here".

-Nick