[LLVMdev] [RFC] Removing static initializers for command line options

Sean Silva chisophugis at gmail.com
Tue Aug 19 21:52:31 PDT 2014


To be clear: I agree with Rafael that we need to tread very carefully about
how we expose this machinery in the C API, if we expose it at all. My
suggestion is completely orthogonal to this though; all I'm talking about
is how to avoid the static constructors and global state caused by the
cl::opt's in library code, which as I understand it is the motivation for
the OP.

-- Sean Silva


On Tue, Aug 19, 2014 at 9:43 PM, Sean Silva <chisophugis at gmail.com> wrote:

> One interesting issue with moving away from the current system of static
> initializers for cl::opt is that we will no longer have the automatic
> registration of all the options so that -help will print everything
> available and generally we will not be able to issue an error for an
> "unknown command line option" (without calling into any other code).
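
To make the coupling concrete, here is a minimal sketch of the self-registering global pattern. It is not LLVM's actual implementation: `Option`, `registry()` and `parseFlag()` are hypothetical stand-ins, and only the option name is taken from the Scalarizer example quoted further down.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for cl::opt<bool>; not LLVM's real class.
struct Option;

// Function-local static: the map is constructed on first use, so it exists
// before any global Option tries to register itself.
static std::map<std::string, Option *> &registry() {
  static std::map<std::string, Option *> R;
  return R;
}

struct Option {
  std::string Name;
  bool Value;
  Option(std::string N, bool Default) : Name(std::move(N)), Value(Default) {
    registry()[Name] = this; // registration is a side effect of construction
  }
};

// A "library" option: linking this translation unit is enough to register
// it, because the static initializer runs before main().
static Option ScalarizeLoadStore("scalarize-load-store", false);

// The parser can reject unknown flags only because every option has already
// registered itself by the time parsing starts.
bool parseFlag(const std::string &Arg) {
  auto It = registry().find(Arg);
  if (It == registry().end())
    return false; // "unknown command line option"
  It->second->Value = true;
  return true;
}

void demo() {
  assert(parseFlag("scalarize-load-store"));
  assert(ScalarizeLoadStore.Value);
  assert(!parseFlag("no-such-option"));
}
```

Remove the global object and the registration disappears with it, which is why -help and unknown-option errors cannot survive unchanged once the static initializers are gone.
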
>
> The auto-registration is fundamentally tied with the globalness and the
> static initializers; pondering this has led me down an interesting path
> that has made me understand better my suggestion in the other thread. As I
> see it, there are two very different sorts of uses of llvm::cl in LLVM:
>
> 1. For regular command line processing. E.g. if a tool accepts an output
> file, then we need something that will parse the argument from the command
> line.
>
> 2. As a way to easily set up a conduit from A to B, where A is the command
> line and B is some place "deep" inside the LLVM library code that will do
> something in response to the command line.
>
> (and, pending discussion, someday point A might include a proper
> programmatic interface (i.e. in a way other than hijacking the command line
> processing))
>
> llvm::cl does a decent job for #1 and that is what it was designed for
> AFAICT; these uses of llvm::cl live outside of library code and everything
> is pretty happy, despite them being global and having static initializers.
>
> The real problem is that llvm::cl is not well suited to #2, yet it is
> used for #2 anyway. We need a solution to #2 that does not use llvm::cl.
> Thus, I don't think that the solution you propose here is the right
> direction.
>
> The first step is to clearly differentiate between #1 and #2. I will say
> "command line options" for #1 and "configuration/tweak points" for #2.
> (maybe "library options" is better for #2; neither is perfect terminology)
>
> The strawman I suggested in
> http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a
> stab at #2. There is no way to dodge being stringly typed since command
> lines are stringly typed, so really it is just a question of how long a
> solution stays stringly typed.
>
> My thought process for staying stringly typed "the whole time" (possibly
> with some caching) comes from these two desires:
> - adding a configuration/tweak (c/t) point should require adding just one
> call into the c/t machinery (this is both for convenience and for
> DRY/SPOT), and
> - this change should be localized to the code being configured/tweaked.
>
> This is the thought process:
>
> Note that llvm::cl is stringly typed until it parses the options. llvm::cl
> gives the appearance of a typed interface because it uses static
> initialization as a backdoor to globally transport the knowledge of the
> expected type to the option parsing machinery (very early in the program
> lifetime). Without this backdoor, we need to stay stringly typed longer, at
> least until we reach the "localized" place where the single call into the
> c/t machinery is made; this single call is the only place that has the type
> information needed for the c/t value to become properly typed. But there is
> no way to know how long it will be until we reach that point (or even *if*
> we reach that point; consider passes that are not run on this invocation).
>
> Hence my suggestion of just putting a stringly typed key-value store (or
> whatever) in an easily accessible place (like LLVMContext), and just
> translating any unrecognized command line options (ones that are not for
> #1) into that stringly typed storage.
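
A rough sketch of what such a store could look like follows. Everything in it is an assumption made for illustration: `TweakStore`, `stashUnrecognized()` and the `example-threshold` key are hypothetical, and the store is a free-standing class rather than anything actually attached to LLVMContext.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical stringly typed key-value store; in the proposal it would live
// somewhere easily accessible, such as LLVMContext.
class TweakStore {
  std::map<std::string, std::string> Raw;

public:
  void set(const std::string &Key, const std::string &Value) {
    Raw[Key] = Value;
  }

  // The single call made at the point of use. Only this call site knows the
  // expected type, so the string is parsed here and nowhere earlier.
  template <typename T> T get(const std::string &Key, T Default) const {
    auto It = Raw.find(Key);
    if (It == Raw.end())
      return Default; // never set on the command line
    std::istringstream In(It->second);
    In >> std::boolalpha; // booleans are spelled "true" / "false"
    T Parsed{};
    if (!(In >> Parsed))
      return Default; // value did not parse as T
    return Parsed;
  }
};

// Translate an argument the tool itself did not recognize ("-key" or
// "-key=value") into the store, without knowing what type it will have.
void stashUnrecognized(TweakStore &Store, const std::string &Arg) {
  std::string::size_type Start = Arg.find_first_not_of('-');
  if (Start == std::string::npos)
    return; // nothing but dashes
  std::string::size_type Eq = Arg.find('=', Start);
  if (Eq == std::string::npos)
    Store.set(Arg.substr(Start), "true"); // bare flag
  else
    Store.set(Arg.substr(Start, Eq - Start), Arg.substr(Eq + 1));
}

void demo() {
  TweakStore Store;
  stashUnrecognized(Store, "-scalarize-load-store");
  stashUnrecognized(Store, "--example-threshold=150");
  assert(Store.get<bool>("scalarize-load-store", false));
  assert(Store.get<unsigned>("example-threshold", 0u) == 150u);
  assert(Store.get<int>("never-set", 7) == 7);
}
```

The cost shows up directly in the sketch: a misspelled key is stored and then silently ignored, since nothing knows the full set of valid names at parse time.
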
>
> I agree with Rafael that "constructor arguments to passes" are not c/t
> points. In the future, there might be some way to integrate the two (from
> the referenced post, you can probably tell that I kind of like the idea of
> doing so), but for now, I think the clear incremental step is to attack #2
> and solve it without llvm::cl. I have suggested a way to do this that I
> think makes sense.
>
> -- Sean Silva
>
> On Mon, Aug 18, 2014 at 11:49 AM, Chris Bieneman <beanz at apple.com> wrote:
>
>> Today command line arguments in LLVM are global variables. An example
>> argument from Scalarizer.cpp is:
>>
>> static cl::opt<bool> ScalarizeLoadStore
>>   ("scalarize-load-store", cl::Hidden, cl::init(false),
>>    cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>>
>>
>> This poses a problem for clients of LLVM that aren’t traditional
>> compilers (e.g. WebKit and Mesa). My proposal is to take a phased
>> approach to addressing this issue.
>>
>> The first phase is to move the ownership of command line options to a
>> singleton, OptionRegistry. The OptionRegistry can be made to work with the
>> existing global command line definitions so that the changes to migrate
>> options can be done in small batches. The primary purpose of this change is
>> to move the ownership of the command line options out of the global scope,
>> and to provide a vehicle for threading them through the compiler. At the
>> completion of this phase, all the command line arguments will be
>> constructed during LLVM initialization and registered under the
>> OptionRegistry. This will replace the hundreds of statically initialized
>> cl::opt objects with a single statically initialized OptionRegistry.
>>
>> With this change options can be constructed during initialization. For
>> the example option above the pass initialization would get a line like:
>>
>> cl::OptionRegistry::CreateOption<bool>("ScalarizeLoadStore",
>>   "scalarize-load-store", cl::Hidden, cl::init(false),
>>   cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>>
>>
>> Also the pass would add a boolean member to store the value of the option
>> which would be initialized in the pass’s constructor like this:
>>
>> ScalarizeLoadStore =
>> cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");
>>
>>
>> These two operations need to occur at separate times due to object
>> lifespans. When the command line is parsed, passes have been initialized
>> but not yet constructed. That means making options live in passes
>> doesn’t really work, but since we want the data to be part of the pass we
>> need to initialize it during construction.
>>
>> A large part of this phase will be finding appropriate places for all the
>> command line options to be initialized, and identifying all the places
>> where the option data will need to be threaded through the compiler. One of
>> the goals here is to get rid of all global state in the compiler to
>> (eventually) enable better multi-threading by clients like WebKit.
>>
>> The second phase is to split the OptionRegistry into two pieces. The
>> first piece is the parsing logic, and the second piece is the Option data
>> store. The main goal of this phase is to make the OptionRegistry represent
>> everything needed to parse command line options and to define a new second
>> object, OptionStore, that is populated with values by parsing the command
>> line. The OptionRegistry will be responsible for initializing “blank”
>> option stores which can then be populated by either the command line
>> parser, or API calls.
>>
>> The OptionRegistry should remain a singleton so that the parsing logic
>> for all options remains universally available. This is required to continue
>> supporting plugin loadable options.
>>
>> The OptionStore should be created when a command line is parsed, or by an
>> API call in libraries, and can be passed through the pass manager and
>> targets to populate option data. The OptionStore should have a lifetime
>> independent of contexts and pass managers, because it can be reused
>> across them indiscriminately.
>>
>> The core principle in this design is that the objects involved in parsing
>> options only need to exist once, but you need a full list of all options in
>> order to parse a command line. You should be able to have multiple copies
>> of the actual stored option data. Having multiple copies of the data store
>> is one step toward enabling two instances of LLVM in the same process to
>> use optimization passes with different options.
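
The split described above might be sketched as follows. The RFC names `OptionRegistry` and `OptionStore` but gives no concrete interface, so every member function here (`instance`, `createOption`, `makeStore`, `parseInto`, `set`, `get`) is an assumption, and the sketch handles only boolean options to stay short.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical value store: option names mapped to values, no parsing data.
class OptionStore {
  std::map<std::string, bool> Values;

public:
  void set(const std::string &Name, bool V) { Values[Name] = V; }
  bool get(const std::string &Name) const { return Values.at(Name); }
};

// Hypothetical registry: exists once and knows every option and its default.
class OptionRegistry {
  std::map<std::string, bool> Defaults;
  OptionRegistry() = default;

public:
  static OptionRegistry &instance() {
    static OptionRegistry R; // the one remaining static object
    return R;
  }

  // Called during pass initialization, before any command line is parsed.
  void createOption(const std::string &Name, bool Default) {
    Defaults[Name] = Default;
  }

  // Produce a "blank" store holding only the defaults.
  OptionStore makeStore() const {
    OptionStore S;
    for (const auto &KV : Defaults)
      S.set(KV.first, KV.second);
    return S;
  }

  // Parse one boolean flag into a particular store; false if unknown.
  bool parseInto(OptionStore &S, const std::string &Flag) const {
    if (Defaults.count(Flag) == 0)
      return false;
    S.set(Flag, true);
    return true;
  }
};

// A pass copies the value it needs out of the store when it is constructed.
struct ScalarizerPass {
  bool ScalarizeLoadStore;
  explicit ScalarizerPass(const OptionStore &S)
      : ScalarizeLoadStore(S.get("scalarize-load-store")) {}
};

void demo() {
  OptionRegistry &R = OptionRegistry::instance();
  R.createOption("scalarize-load-store", false);

  OptionStore A = R.makeStore(); // populated by "the command line"
  OptionStore B = R.makeStore(); // left at defaults, e.g. for another client
  assert(R.parseInto(A, "scalarize-load-store"));

  assert(ScalarizerPass(A).ScalarizeLoadStore);
  assert(!ScalarizerPass(B).ScalarizeLoadStore);
}
```

Two stores built from the same registry hold different values, which is the property needed for two LLVM clients in one process to run the same pass with different settings.
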
>>
>> I haven’t come up with a specific implementation proposal for this yet,
>> but I do have some rough ideas. The basic flow that I’m thinking of is for
>> command line parsing to create an object that maps option names to their
>> values without any of the parsing data involved. This would allow for
>> either parsing multiple command lines, or generally just constructing
>> multiple option data stores. **Here is where things get foggy because I
>> haven’t yet looked too deep.** Once you construct a data store it will get
>> passed into the pass manager (and everywhere else that needs it), and it
>> will be used to initialize all the option values.
>>
>> There has been discussion about making the option store reside within the
>> context, but this doesn’t feel right because the biggest consumer of option
>> data is the passes, and you can use a single pass manager with multiple
>> contexts.
>>
>> Thanks,
>> -Chris