[LLVMdev] [RFC] Removing static initializers for command line options

Mon Aug 18 14:56:48 PDT 2014

> On Aug 18, 2014, at 2:42 PM, Rafael Espíndola <rafael.espindola at gmail.com> wrote:
> 
> On 18 August 2014 14:49, Chris Bieneman <beanz at apple.com <mailto:beanz at apple.com>> wrote:
>> Today command line arguments in LLVM are global variables. An example
>> argument from Scalarizer.cpp is:
>> 
>> static cl::opt<bool> ScalarizeLoadStore
>>  ("scalarize-load-store", cl::Hidden, cl::init(false),
>>   cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>> 
>> 
>> This poses a problem for clients of LLVM that aren’t traditional compilers
>> (i.e. WebKit, and Mesa). My proposal is to take a phased approach at
>> addressing this issue.
>> 
>> The first phase is to move the ownership of command line options to a
>> singleton, OptionRegistry. The OptionRegistry can be made to work with the
>> existing global command line definitions so that the changes to migrate
>> options can be done in small batches. The primary purpose of this change is
>> to move the ownership of the command line options out of the global scope,
>> and to provide a vehicle for threading them through the compiler. At the
>> completion of this phase, all the command line arguments will be constructed
>> during LLVM initialization and registered under the OptionRegistry. This
>> will replace the 100’s of static initialized cl::opt objects with a single
>> static initialized OptionRegistry.
>> 
>> With this change options can be constructed during initialization. For the
>> example option above the pass initialization would get a line like:
>> 
>> cl::OptionRegistry::CreateOption<bool>("ScalarizeLoadStore",
>>  "scalarize-load-store", cl::Hidden, cl::init(false),
>>  cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>> 
>> 
>> Also the pass would add a boolean member to store the value of the option
>> which would be initialized in the pass’s constructor like this:
>> 
>> ScalarizeLoadStore =
>> cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");
>> 
> 
> For the first step it might be better to keep the option value as a
> global. That way we only switch to using something like
> 
> 
> static bool ScalarizeLoadStore;
> cl::OptionRegistry::CreateOption<bool>(&ScalarizeLoadStore,
> "ScalarizeLoadStore",
>   "scalarize-load-store", cl::Hidden, cl::init(false),
>   cl::desc("Allow the scalarizer pass to scalarize loads and store"));
> 
> and everything else remains as is.

I’d prefer to do the removal of global storage all at once. This can be done one pass at a time, but doing all at once means I don’t need to revisit each pass as many times. Whereas if I do this in two passes (where the option becomes owned, but the storage remains global) I’ll need to revisit each pass again to get rid of the global storage.

Keep in mind one of the ultimate goals that this is building toward is allowing a single process to compile multiple programs at the same time with different options.

> 
>> These two operations need to occur at separate times due to object
>> lifespans. At the time when command lines are parsed passes have been
>> initialized, but not constructed. That means making options live in passes
>> doesn’t really work, but since we want the data to be part of the pass we
>> need to initialize it during construction.
>> 
>> A large part of this phase will be finding appropriate places for all the
>> command line options to be initialized, and identifying all the places where
>> the option data will need to be threaded through the compiler. One of the
>> goals here is to get rid of all global state in the compiler to (eventually)
>> enable better multi-threading by clients like WebKit.
>> 
>> The second phase is to split the OptionRegistry into two pieces. The first
>> piece is the parsing logic, and the second piece is the Option data store.
>> The main goal of this phase is to make the OptionRegistry represent
>> everything needed to parse command line options and to define a new second
>> object, OptionStore, that is populated with values by parsing the command
>> line. The OptionRegistry will be responsible for initializing “blank” option
>> stores which can then be populated by either the command line parser, or API
>> calls.
>> 
>> The OptionRegistry should remain a singleton so that the parsing logic for
>> all options remains universally available. This is required to continue
>> supporting plugin loadable options.
>> 
>> The OptionStore should be created when a command line is parsed, or by an
>> API call in libraries, and can be passed through the pass manager and
>> targets to populate option data. The OptionStore should have a lifetime
>> independent of contexts, and pass managers because it can be re-used
>> indiscriminately.
>> 
>> The core principle in this design is that the objects involved in parsing
>> options only need to exist once, but you need a full list of all options in
>> order to parse a command line. You should be able to have multiple copies of
>> the actual stored option data. Having multiple copies of the data store is
>> one step toward enabling two instances of LLVM in the same process to use
>> optimization passes with different options.
>> 
>> I haven’t come up with a specific implementation proposal for this yet, but
>> I do have some rough ideas. The basic flow that I’m thinking of is for
>> command line parsing to create an object that maps option names to their
>> values without any of the parsing data involved. This would allow for either
>> parsing multiple command lines, or generally just constructing multiple
>> option data stores. **Here is where things get foggy because I haven’t yet
>> looked too deep.** Once you construct a data store it will get passed into
>> the pass manager (and everywhere else that needs it), and it will be used to
>> initialize all the option values.
>> 
>> There has been discussion about making the option store reside within the
>> context, but this doesn’t feel right because the biggest consumer of option
>> data is the passes, and you can use a single pass manager with multiple
>> contexts.
>> 
> 
> Some passes take options directly in the constructor. For example
> 
> Inliner::Inliner(char &ID, int Threshold, bool InsertLifetime)
> 
> Maybe we could just say that there are two different types of options.
> The ones we want to expose to users and the ones which we use for
> testing llvm itself. The options we want to expose should be just
> constructor arguments. With that distinction we should be able to just
> not use the options added by  cl::OptionRegistry::CreateOption unless
> cl::ParseCommandLineOptions is called. WebKit like clients would just
> not call cl::ParseCommandLineOptions. Would that work?

This is actually how some of our internal clients are already working. There are a few caveats with this approach:

(1) You can’t allow the pass manager to allocate your passes for you, because those passes only read from cl::opts
(2) Not all of our passes have constructors for overriding their cl::opts (the legacy Scalarizer is one)

I think it would in general be cleaner to provide a way for library clients to use cl::opts without being forced to parse a command line.

Thanks,
-Chris

> 
> Cheers,
> Rafael

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140818/4575e5d9/attachment.html>