[PATCH] Add DarwinInputGraph

Mon Oct 27 04:50:16 PDT 2014

On Thu Oct 23 2014 at 6:48:46 PM Nick Kledzik <kledzik at apple.com> wrote:

>
> On Oct 22, 2014, at 6:16 PM, Shankar Easwaran <shankare at codeaurora.org>
> wrote:
>
> > On 10/22/2014 8:01 PM, Rui Ueyama wrote:
> >> On Tue Oct 21 2014 at 9:38:57 PM Nick Kledzik <kledzik at apple.com>
> >>  wrote:
> >>
> >>
> >>> On Oct 21, 2014, at 6:31 PM, Rui Ueyama <ruiu at google.com>
> >>>  wrote:
> >>>
> >>> Let's not add a back reference to the input element. It will make the
> >>> complicated data structure even worse and would raise new issues (for
> >>> example, I could imagine that it's not clear whether the back reference
> >>> should point to a parsed file in an archive file or should point to the
> >>> entire archive file.)
> >>>
> >>> Before discussing possible alternatives, it's worth discussing the
> issues
> >>> that the current InputGraph has. Here's my take.
> >>>
> >>> 1. We have too many classes to represent the concept of "file".
> >>>
> >>> FileNode in the InputGraph is different from the File class that
> >>> represents parsed file. And the InputGraph has various kinds of
> FileNodes
> >>> -- FileNode, SimpleFileNode, <flavor>FileNode, etc. That needless
> >>> distinction makes hard to handle files.
> >>>
> >>> FileNode needs to be merged with the File class. Or if we are going to
> >>> replace the InputGraph with a simpler data structure, it should be
> removed.
> >>>
> >>> 2. We have too many (other) classes.
> >>>
> >>> We have too many classes for the InputGraph that are simply
> overdesigned.
> >>> Command line argument is not an easy stuff, but it shouldn't need that
> >>> amount of code and number of classes. We have created too deep class
> >>> hierarchy there.
> >>>
> >>> 3. Making the core linker platform neutral is not an ultimate goal, but
> >>> the InputGraph is designed for that.
> >>>
> >>> The InputGraph made made the core linker more or less platform neutral.
> >>> That's basically a good thing. We'd like to keep the core part as
> platform
> >>> neutral as possible because it would help us maintain the code.
> However,
> >>> neutrality is not a goal that we need to achieve at any cost. If
> separating
> >>> platform neutral code from dependent code will mess up the entire code
> >>> base, we simply shouldn't do that. (It's worth noting that the
> InputGraph
> >>> failed to hide everything already, one example is the ELF section
> group.)
> >>>
> >>> We should teach the core linker a little bit more about platform
> specific
> >>> features and removes complexity from the InputGraph.
> >>>
> >>> I agree on all the above.
> >>>
> >>> Looking at how core linking uses InputGraph, it is always through the
> >>> LinkingContext and:
> >>>   Resolver.cpp needs an iterator (getNextFile()) and a way to notify
> when
> >>> it used a file (notifyProgress())
> >>>   Driver.cpp needs a way to iterate files (inputElements) to parse
> them in
> >>> parallel
> >>>
> >>> So this could be simplified to hide/implement the “graph” in the
> >>> LinkingContext.  That is, the LinkingContext subclasses each use the
> data
> >>> structures natural for their needs.  The drivers just call addFile()
> on the
> >>> LinkingContext, and the gnu driver calls startGroup()/endGroup() on the
> >>> LinkingContext.  The Resolver could call a forEachFile() method on
> >>> LinkingContext which takes a lambda on what to do with each file.  The
> >>> lambda can return if the Resolver used the file (to track whether the
> >>> current group should be scanned again). The parallel parsing currently
> done
> >>> in the Driver could be a utility method in the base LinkingContext
> class
> >>> that each concrete LinkingContext uses.
> >>>
> >>>
> >> Why does the GNU ld driver have to ask the LinkingContext to construct a
> >> startGroup/endGroup? The GNU driver should be capable of understanding
> all
> >> the command line options for GNU ld, so it can construct a data
> structure
> >> directly.
>
> The driver and LinkingContext need to work together to build some data
> structure which is simple for the Resolver to iterate.  There are
> potentially other clients of ELFLinkingContext (such as JITs) that will
> need to create the data structures.   But the interface to *LinkContext
> should be whatever makes sense for that platform.
>
>
> >>
> >> forEachFile() doesn't seem different from what we have now. It's a
> >> getNextFile(). I don't think it will improve the situation.
> It will improve the situation for mach-o which needs two passes across the
> input files whereas the Resolver sees only one.
>
>
> >>
> >> It feels to me that we don't need a sophisticated technique here. I'd
> >> suggest making the Resolver to handle ELF groups to eliminate
> >> notifyProgress. I'd even remove InputGraph entirely and replace it with
> a
> >> std::vector<FileOrGroup> or something like that where FileOrGroup is a
> >> class representing a file or a ELF group.
> That abstraction won’t work for mach-o which does not use groups, but
> needs to first process relocatable object files, then keep looping over all
> libraries until no more undefines are removed.  It cannot be wedged into
> the group model because the which files are libraries is not known until
> parsing is done.
>

ELF group was just an example. What I was suggesting is, rather than
designing a complex communication mechanism between the core linker and the
input file manager (whichever it is, InputGraph or the Linking Context),
implement the behavior directly to the core linker.

In order to implement what you need for Mach-O, we could

1. parse all input files
2. put all libraries into a Mach-O library group object, which doesn't
currently exist, and add that to the end of the input file list.
3. the core linker directly handles the library group -- it iterates over
the library group until it finds no new symbols.

Note that this is another example. My point is to implement the feature to
the core linker. The core linker is currently overly simplified.

>
> > As this is one of the major things that we need to handle, I think We
> could move this discusssion to a hallway discussion during the LLVM
> developers meeting, if you are all ok.
> Unfortunately, I won’t be at the conference this year.
>
> -Nick
>
>
>
> >>
> >>>
> >>> On Tue, Oct 21, 2014 at 5:33 PM, Shankar Easwaram
> >>> <shankarke at gmail.com>
> >>>
> >>> wrote:
> >>>
> >>>
> >>>> If we could have lld::file have a back reference to the input element
> >>>> (or) use the file ordinal to access the input element in the input
> graph
> >>>> you could just add inputs on a need basis ?
> >>>>
> >>>> Let's discuss alternative proposals as part of this thread ?
> >>>>
> >>>> I am also interested in moving the input file which could be a linker
> >>>> script from the driver to the registry too in addition.
> >>>>
> >>>> We still need to have the getNextFile as it made the resolver more
> flavor
> >>>> neutral.
> >>>>
> >>>> Shankar easwaran
> >>>>
> >>>>
> >>>>
> >>>> On Oct 21, 2014, at 17:58, Rui Ueyama
> >>>> <ruiu at google.com>
> >>>>  wrote:
> >>>>
> >>>> On Tue, Oct 21, 2014 at 3:41 PM, Michael Spencer
> >>>> <bigcheesegs at gmail.com>
> >>>>
> >>>> wrote:
> >>>>
> >>>>
> >>>>> On Mon, Oct 20, 2014 at 9:21 PM, Rui Ueyama <ruiu at google.com>
> >>>>>  wrote:
> >>>>>
> >>>>>> I'm not still sure whether or not "InputGraph" thing was after all
> the
> >>>>>>
> >>>>> right way to abstract the input file list. That often makes easy
> things
> >>>>> hard. In this case, sorting a list of input files became that hard.
> When I
> >>>>> proposed the idea I was thinking that would make this kind of things
> easier
> >>>>> to handle. I don't think the current shape of the API is not
> desirable at
> >>>>> least. Or the fundamental idea was not very good.
> >>>>>
> >>>>>> One thing we probably should consider is, in my opinion, we are
> >>>>>>
> >>>>> pushing too hard to separate all ports. We are trying to write any
> >>>>> architecture-dependent code into architecture-specific file. But the
> >>>>> natural border of API doesn't always fit to the architecture
> >>>>> dependent/independent border. We should probably relax that
> constraint a
> >>>>> bit where that makes sense and write code that's short and easier to
> >>>>> understand.
> >>>>>
> >>>>> I agree. The current InputGraph code makes modifying the input very
> >>>>> difficult to do. Part of the reason for the model in the first place
> >>>>> was to make this easy. I'm currently working on adding #pragma lib
> >>>>> like support for ELF (PS4 specific stuff) and have found it basically
> >>>>> impossible to add a new input file in the correct position in the
> >>>>> graph with the current model.
> >>>>>
> >>>>>
> >>>> To deal with the issue of the InputGraph I have a few ugly hacks in
> >>>> PECOFF/LinkerGeneratedSymbolFile.h. There are a few virtual .a files
> there
> >>>> that generate magical symbols on the fly to control the core linker
> in an
> >>>> obscure way. These hacks need to be removed, but because the
> InputFile API
> >>>> exists in between, it's impossible to do.
> >>>>
> >>>> The InputGraph API design needs to be revisited and probably be
> rewritten.
> >>>>
> >>>>
> >>>>
> >>
> >>
> >> _______________________________________________
> >> llvm-commits mailing list
> >>
> >> llvm-commits at cs.uiuc.edu
> >> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
> >
> > --
> > Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> hosted by the Linux Foundation
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141027/ead33ded/attachment.html>