[llvm-commits] [lld] r156100 - in /lld/trunk: include/lld/Reader/ include/lld/Reader/Reader.h lib/CMakeLists.txt lib/Reader/ lib/Reader/CMakeLists.txt lib/Reader/COFFReader.cpp tools/lld-core/CMakeLists.txt tools/lld-core/lld-core.cpp

Nick Kledzik kledzik at apple.com
Wed May 9 15:05:17 PDT 2012


On May 8, 2012, at 2:01 PM, Michael Spencer wrote:

> On Fri, May 4, 2012 at 2:56 PM, Nick Kledzik <kledzik at apple.com> wrote:
>> On May 4, 2012, at 1:32 PM, Michael Spencer wrote:
>>> On Thu, May 3, 2012 at 2:27 PM, Nick Kledzik <kledzik at apple.com> wrote:
>>>> 
>>>> On May 3, 2012, at 1:52 PM, Michael J. Spencer wrote:
>>>>> --- lld/trunk/lib/Reader/COFFReader.cpp (added)
>>>> 
>>>> I would have imagined this file going in:
>>>>        lld/trunk/lib/Platforms/Windows/COFFReader.cpp
>>>> 
>>>> In my mind there are three (initial) platforms:  Darwin, Windows, Unix*.  So, we have:
>>>> 
>>>>        lld/trunk/lib/Platforms/Darwin/
>>>>        lld/trunk/lib/Platforms/Windows/
>>>>        lld/trunk/lib/Platforms/Unix/
>>>> 
>>>> And code only needed by one platform goes under its Platform directory.
>>>> 
>>>> The interface that all Platforms must provide is defined by Platform.h.  So one way to manage this would be to add:
>>>>    bool isObjectFile(llvm::MemoryBuffer*);
>>>>    File * parseObjectFile(llvm::MemoryBuffer*);
>>>> to the Platform class, and wire up your COFF reader behind that in the Windows Platform.
>>>> 
>>>> My assumption is that even though the lld internal model is platform independent, no one is going to want to use other platform object files (e.g. win32 in mach-o).  There are just too many ABI and other conventions implicit in the file format.
>>>> 
>>>> * By Unix I mean ELF-based Linux, Solaris, etc. I also imagine that the Unix platform will be much more customizable to support all the *nix variants.
>>>> 
>>> I believe there are platforms that are sufficiently different to need
>>> a different Platform class, but still use ELF. For example, Haiku and
>>> PlayStation 3.
>> What sort of differences do you see driving the need for different Platform classes for these ELF-based systems?  I just see these as minor variants on the Unix platform.  As I said, I imagine the Unix Platform would have a lot of customization points.  That may mean there are many subclasses of PlatformUnix, or that PlatformUnix has a bunch of knobs which can be set differently to support different OSs.  But in either case, that is all under lib/Platforms/Unix.
> 
> In the case of the PS3 and VITA, they are not Unix like at all, and
> have very different ways of representing things like thread local
> storage and shared objects.
This sounds like the overall structure is ELF but different OSs use some different special sections.  

> 
>> As an example of variants under a Platform, on Darwin most of the Platform class methods will operate very differently depending on the architecture (e.g. i386, x86_64, armv7).   I have not decided whether to have a different PlatformDarwin subclass for each arch, or to have an arch-specific delegate object underneath PlatformDarwin.  But that choice does not affect Core linking or any other platform, so it will all be in lib/Platforms/Darwin.
>> 
>> 
>>> PE/COFF is also used in EFI, which is very very
>>> different from Windows.
>> On Darwin, we use EFI to boot Macs.  Our tool chain is all mach-o based, and there is a final step to convert the generated mach-o executable to PE/COFF.
>> 
>> 
>> Perhaps using OS names (Darwin, Windows, Unix) is confusing.  What if the file format names were used instead (MachO, PE-COFF, ELF).   Would it then  make more sense to put the Reader in the Platform (Format) directory?
> 
> Depends on how it is laid out.
> 
>> A related issue is that in "ld -r" mode, the linker merges object files and creates a new object file.  Where should that Writer live? Given that Object File Readers and Writers need to be kept in sync (because they must remain inverses of each other for every feature), it would make sense to keep that code close together.  My preference is that they all live in the Platform (or file format dir).
>> 
>> -Nick
> 
> I agree that the reader/writer/exewriter should all be in the same directory.
> 
> One of the major reasons I want to have the formats in a different
> directory is layering. The readers and writers shouldn't depend on the
> platform at all. Any non-standard extensions (like all the .gnu
> weirdness) can be handled by the platforms while reading.
Huh?  If readers cannot depend on platforms, how can platforms handle reader extensions (like .gnu sections)?  


> So I propose:
> lib/Format/{COFF,ELF,Mach-O}/{Reader/ObjectWriter/ExecutableWriter}.cpp
> lib/Platform/{Unix,Darwin,Windows,etc...}

It is still not clear to me what code goes in /Format/ vs /Platform/.   If a particular OS does thread local variables in a unique way, that information needs to be encoded in an OS-specific way in the file format.  Does that code belong in /Platform/ because it's "OS specific", or does it belong in /Format/ because it is encoding a chunk of the file format?

In the Darwin/Mach-O world, there are huge differences in the executable file format depending on which OS version you are targeting.  The target version is a command line option to the linker, so it is a runtime test, not a build time test.  I imagine there being code to create each of the chunks of the various file format variations, and at runtime something picks the appropriate pieces of code and runs them to create the executable.  From that mindset, I see the various ELF extensions in a similar light, although they might be selected at build time (of the linker) instead of at runtime.

Currently the Platform class is about providing platform-specific atoms to core linking and the various passes (e.g. synthesizing PLT entries).  The one other thing it does is provide a number<->name translation for References (fixups) for reading/writing YAML.
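
To illustrate the number<->name translation for Reference kinds, here is a minimal sketch.  It is not lld's actual interface: KindMapping, kKindTable, kindToString, kindFromString and the kind names/values are all made up for illustration, and it assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Hypothetical fixup kinds; names and values are illustrative only.
struct KindMapping {
  std::uint32_t value;
  std::string_view name;
};

constexpr KindMapping kKindTable[] = {
    {0, "none"},
    {1, "call32"},
    {2, "pointer64"},
};

// Number -> name, as needed when writing YAML.
std::optional<std::string_view> kindToString(std::uint32_t value) {
  for (const KindMapping &m : kKindTable)
    if (m.value == value)
      return m.name;
  return std::nullopt;
}

// Name -> number, as needed when reading YAML.
std::optional<std::uint32_t> kindFromString(std::string_view name) {
  for (const KindMapping &m : kKindTable)
    if (m.name == name)
      return m.value;
  return std::nullopt;
}

// Debug-build check that every table entry survives a name -> number lookup.
void checkKindTable() {
  for (const KindMapping &m : kKindTable)
    assert(kindFromString(m.name) == m.value);
}
```

Keeping both directions driven by one table means the YAML reader and writer cannot drift apart for a given kind.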

I could see renaming the Platform class as FileFormat class, and then having:

  lib/FileFormat/{COFF,ELF,Mach-O}/{ObjectReader/ObjectWriter/ExecutableWriter}.cpp

In each FileFormat subdirectory there are files which contain the code for reading/writing various extensions.  The job of the main file (e.g. ExecutableWriter.cpp) is to select the right mixture of code.  For ELF this might be done at build time, or at runtime based on the target triple.
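
As a rough sketch of what "select the right mixture of code" could look like when done at runtime: the Target and Chunk types, selectElfChunks, and every chunk name below are hypothetical, and each chunk is reduced to a label standing in for the code that would emit it.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical link target; a real linker would derive this from the
// target triple or from command line options.
struct Target {
  std::string os;
  bool gnuExtensions;
};

// Stand-in for one unit of format-specific writer code.
struct Chunk {
  std::string name;
};

// Picks the chunks an ELF executable writer would emit for this target.
std::vector<Chunk> selectElfChunks(const Target &target) {
  assert(!target.os.empty() && "target OS must be known before selection");

  // Chunks common to every ELF variant.
  std::vector<Chunk> chunks = {Chunk{"elf-header"}, Chunk{"text"},
                               Chunk{"data"}};

  // OS-specific extensions live next to the format code and are chosen here.
  if (target.gnuExtensions)
    chunks.push_back(Chunk{"gnu-extensions"});

  if (target.os == "ps3")
    chunks.push_back(Chunk{"ps3-tls"}); // hypothetical OS-specific TLS layout
  else
    chunks.push_back(Chunk{"generic-tls"});

  return chunks;
}
```

A build time variant would replace the runtime checks with preprocessor conditionals or separate source files per OS, leaving the selection function as the single place where the choice is made.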

In summary, yes, we need a design with layers and clean places to put OS-specific file format extensions. But the kinds of extensions are file format dependent, and so should be organized under a particular FileFormat directory.


-Nick