[LLVMdev] Advanced Command line processing with lld

Wed Apr 17 14:34:34 PDT 2013

On 4/17/2013 4:28 PM, Michael Spencer wrote:
> On Tue, Apr 16, 2013 at 10:25 AM, Shankar Easwaran
> <shankare at codeaurora.org> wrote:
>
>>   Hi Nick, Michael,
>>
>> I was trying to envision an advanced command line processing framework
>> for lld, which would essentially do the following:
>>
>> a) Create nodes from command line options
>> b) Validate the command line options
>> c) Rearrange command line options by running passes!
>>
>> *Creating nodes from command line options*
>>
>> For the linker to support various flavors, we wouldn't want to treat
>> command line options as just options. On ELF, for example, we have
>>
>> a) positional command line options
>> b) group command line options
>> c) non-positional command line options
>>
>> To accommodate all the options, the command line options that the linker
>> gets could be treated as a graph where the nodes represent
>>
>> * input files
>> * grouped commands
>> * positional commands
>>
>> All other inputs that don't fall into these categories are also
>> represented by nodes, which are stored in a vector.
>>
>> *Advantages*
>>
>> a) no need for advanced logic to detect a file being loaded twice
>> b) much cleaner code, because fewer singleton patterns are used
>> c) easy to debug and to represent
>>
>> *Validate the command line options*
>>
>> It would be nice to have Traits registered for each type of node to
>> check for valid options in them. The actual command line parsing
>> library will not be responsible for validating the options.
>>
>> *Rearrange command line options*
>>
>> Certain users keep playing with the linker command line options until
>> they figure out a set of options that works for them.
>>
>> Essentially I have seen people playing around with
>>
>> a) --start-group, --end-group (traverses the whole list of files in the
>> group many times until no new symbol is added)
>> b) --force-load-archive, --no-force-load-archive (force load all the
>> symbols from the archive library)
>>
>> The disadvantage of using (a) or (b) is that they make the linker slower.
>>
>> It would be useful to have a Pass Manager that changes the order of the
>> files seen on the command line, for the following reasons:
>>
>> a) Input file positioning, which positions the file depending on the
>> number of calls from one file to another file.
>> b) Improve locality of reference by positioning files closer to each other.
>>
>> *Other advantages*
>> Dead input file stripping. You can remove input files that are not
>> referenced in a static link.
>>
>>
>> *Representation*
>>
>> I thought of representing
>>
>> a) input files
>> b) group commands
>> c) positional commands
>>
>> by creating atoms, since LLD already has a framework for representing
>> atoms that can be tested and played around with.
>>
>> The whole command line could be represented as
>>
>> a) a set of command line atoms with their assigned ordinals
>> b) a *follow on* edge from one input file to the next
>> c) an in-group reference for every input file that is in a group
>>
>> For example, given the following command line:
>>
>> LLD -flavor gnu a.o b.o c.o --start-group libc.a libpthread.a --as-needed
>> libc.so --no-as-needed --end-group d.o mylib.a libc.a
>>
>> This could be represented by atoms
>>
>> (fb = "followed by")
>>
>> a.o -fb-> b.o -fb-> c.o -fb-> (GROUP) -fb-> d.o -fb-> mylib.a -fb-> libc.a
>>                                  |
>>     libc.a -fb-> libpthread.a -fb-> as-needed -fb-> libc.so
>>     (ingroup)    (ingroup)                          (ingroup)
>>
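A minimal sketch of how the example command line above could be turned into nodes. This is an illustration in Python, not lld code: the `Node` class and `build_graph` function are hypothetical names, the `-flavor gnu` prefix is left out, and the "followed by" edges are implied by list order instead of being stored explicitly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical command line node (not an lld type)."""
    kind: str                  # "file", "group", or "positional"
    name: str
    members: List["Node"] = field(default_factory=list)  # used by groups only


def build_graph(args: List[str]) -> List[Node]:
    """Turn a GNU-style argument list into an ordered list of nodes."""
    top: List[Node] = []
    stack = [top]              # stack[-1] is the innermost open container
    for arg in args:
        if arg == "--start-group":
            group = Node("group", "GROUP")
            stack[-1].append(group)
            stack.append(group.members)
        elif arg == "--end-group":
            if len(stack) == 1:
                raise ValueError("--end-group without --start-group")
            stack.pop()
        elif arg.startswith("--"):
            # Any other option is kept in place as a positional node.
            stack[-1].append(Node("positional", arg))
        else:
            stack[-1].append(Node("file", arg))
    if len(stack) != 1:
        raise ValueError("missing --end-group")
    return top


ARGS = ("a.o b.o c.o --start-group libc.a libpthread.a --as-needed "
        "libc.so --no-as-needed --end-group d.o mylib.a libc.a").split()
graph = build_graph(ARGS)
top_level = [n.name for n in graph]
in_group = [n.name for n in graph[3].members]
```

Here `top_level` holds the main chain (with `GROUP` as a single node) and `in_group` holds the files and positional options that sit between `--start-group` and `--end-group`.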
>> *Advantages*
>>
>> a) Writer has a way to look at the command line options as atoms too.
>> b) Not sure if LTO could use this framework to call the compiler back with
>> the appropriate set of options.
>> c) Can use the ReaderWriterYAML framework for testing!
>>
>> Thanks
>>
>> Shankar Easwaran
>>
>> --
>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation
>>
>>
> This should be sent to the llvm-dev list.
Done.

> I think we will need a model like this for input files in order to
> correctly evaluate in the resolver, but I think this is the wrong way to go
> about it.
>
> We simply need a graph data structure that represents the input semantics
> and provides mutable access to it throughout the link. So your graph is
> correct, I just wouldn't model it with Atoms. As for the command line
> parser, the only part it has in this is generating the initial graph.
The only reason I modeled it with atoms was to rearrange them to 
increase locality of reference.
Different targets can optionally add command line passes to reorder them 
and optimize the command line to their needs.
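
A rough sketch of what such a reordering pass could look like, written in Python for brevity. Everything here is hypothetical: `PassManager`, `make_call_affinity_pass`, and the call-count table are illustrative names and data, not lld APIs. The pass uses a simple greedy heuristic and only handles a flat list of object files; a real pass would have to respect groups and positional options.

```python
from typing import Callable, Dict, List, Tuple

# A pass takes the current input order and returns a new order.
Pass = Callable[[List[str]], List[str]]


class PassManager:
    """Hypothetical manager that runs reordering passes in sequence."""

    def __init__(self) -> None:
        self._passes: List[Pass] = []

    def add(self, p: Pass) -> None:
        self._passes.append(p)

    def run(self, inputs: List[str]) -> List[str]:
        for p in self._passes:
            inputs = p(list(inputs))   # each pass works on its own copy
        return inputs


def make_call_affinity_pass(calls: Dict[Tuple[str, str], int]) -> Pass:
    """Greedy placement: after each file, place the remaining file that
    exchanges the most calls with it. Ties keep command line order."""

    def weight(a: str, b: str) -> int:
        return calls.get((a, b), 0) + calls.get((b, a), 0)

    def run(inputs: List[str]) -> List[str]:
        if not inputs:
            return inputs
        remaining = list(inputs)
        ordered = [remaining.pop(0)]
        while remaining:
            last = ordered[-1]
            # max() returns the first maximal element, so ties fall back
            # to the original order.
            best = max(remaining, key=lambda f: weight(last, f))
            remaining.remove(best)
            ordered.append(best)
        return ordered

    return run


# Assumed call counts between files (made-up numbers for illustration).
calls = {("a.o", "c.o"): 10, ("c.o", "d.o"): 3}
pm = PassManager()
pm.add(make_call_affinity_pass(calls))
reordered = pm.run(["a.o", "b.o", "c.o", "d.o"])
```

A target that does not want reordering would simply register no passes, in which case `run` returns the inputs unchanged.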
> The Darwin DAG would basically be a single GROUP node with all the inputs.
Ok.
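
To illustrate what evaluating a GROUP node means for the resolver (rescanning its archives until no new symbol is added), here is a minimal sketch in Python. The `resolve_group` function and the toy archives are assumptions made up for this example. They do not model real archive formats or the actual Darwin lookup rules.

```python
from typing import Dict, List, Set, Tuple

# Toy archive: member name -> (symbols it defines, symbols it references).
Archive = Dict[str, Tuple[Set[str], Set[str]]]


def resolve_group(undefined: Set[str], group: List[Archive]):
    """Rescan the archives in a group until a full scan loads nothing new.

    Returns (loaded members, still-undefined symbols, number of scans).
    """
    undefined = set(undefined)
    defined: Set[str] = set()
    loaded: List[str] = []
    scans = 0
    changed = True
    while changed:
        changed = False
        scans += 1
        for archive in group:
            for member, (defs, refs) in archive.items():
                if member in loaded or not (defs & undefined):
                    continue
                # The member defines something we need, so pull it in.
                loaded.append(member)
                defined |= defs
                undefined = (undefined | refs) - defined
                changed = True
    return loaded, undefined, scans


# Hypothetical archive contents, chosen so that one scan is not enough.
libc = {
    "malloc.o": ({"malloc"}, set()),
    "printf.o": ({"printf"}, {"pthread_mutex_lock"}),
}
libpthread = {
    "mutex.o": ({"pthread_mutex_lock"}, {"malloc"}),
}

loaded, still_undefined, scans = resolve_group({"printf"}, [libc, libpthread])
```

In this toy case `malloc.o` is only needed once `mutex.o` has been loaded, so the group is scanned again, plus one final scan that finds nothing new. That repeated scanning is the cost of groups mentioned earlier in the thread.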

Thanks

Shankar Easwaran

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation