[LLVMdev] [RFC] Removing static initializers for command line options

Sean Silva chisophugis at gmail.com
Tue Aug 19 21:43:15 PDT 2014


One interesting issue with moving away from the current system of static
initializers for cl::opt is that we lose the automatic registration of all
the options: -help will no longer print everything available, and in
general we will not be able to issue an error for an "unknown command line
option" (without calling into any other code).

The auto-registration is fundamentally tied to the globalness and the
static initializers; pondering this has led me down an interesting path
that has helped me better understand my suggestion in the other thread. As
I see it, there are two very different sorts of uses of llvm::cl in LLVM:

1. For regular command line processing. E.g. if a tool accepts an output
file, then we need something that will parse the argument from the command
line.

2. As a way to easily set up a conduit from A to B, where A is the command
line and B is some place "deep" inside the LLVM library code that will do
something in response to the command line.

(and, pending discussion, someday point A might include a proper
programmatic interface (i.e. in a way other than hijacking the command line
processing))

llvm::cl does a decent job for #1 and that is what it was designed for
AFAICT; these uses of llvm::cl live outside of library code and everything
is pretty happy, despite them being global and having static initializers.

llvm::cl is not well-suited to #2, yet it is used for #2, and that is the
real problem. We need a solution to #2 which does not use llvm::cl. Thus, I
don't think that the solution you propose here is the right direction.

The first step is to clearly differentiate between #1 and #2. I will say
"command line options" for #1 and "configuration/tweak points" for #2.
(maybe "library options" is better for #2; neither is perfect terminology)

The strawman I suggested in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075503.html was a
stab at #2. There is no way to dodge being stringly typed since command
lines are stringly typed, so really it is just a question of how long a
solution stays stringly typed.

My thought process for staying stringly typed "the whole time" (possibly
with some caching) comes from these two desires:
- adding a c/t (configuration/tweak) point should require adding just one
call into the c/t machinery (this is both for convenience and for
DRY/SPOT), and
- this change should be localized to the code being configured/tweaked.

The reasoning goes like this:

Note that llvm::cl is stringly typed until it parses the options. llvm::cl
gives the appearance of a typed interface because it uses static
initialization as a backdoor to globally transport the knowledge of the
expected type to the option parsing machinery (very early in the program
lifetime). Without this backdoor, we need to stay stringly typed longer, at
least until we reach the "localized" place where the single call into the
c/t machinery is made; this single call is the only place that has the type
information needed for the c/t value to become properly typed. But there is
no way to know how long it will be until we reach that point (or even *if*
we reach that point; consider passes that are not run on this invocation).
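
To make the backdoor concrete, here is a minimal sketch of the pattern (it
is not the actual llvm::cl implementation). The names OptionBase, IntOpt,
registry, parseOption and the option "demo-threshold" are all made up for
illustration. The point is only that the global's constructor hands its
name and type to the parser before main() runs:

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical miniature of the pattern; NOT the real llvm::cl classes.
struct OptionBase {
  virtual ~OptionBase() = default;
  // Each concrete option knows how to turn the raw string into its own type.
  virtual void parse(const std::string &Text) = 0;
};

// Function-local static so the map exists before any option registers itself.
static std::map<std::string, OptionBase *> &registry() {
  static std::map<std::string, OptionBase *> R;
  return R;
}

struct IntOpt : OptionBase {
  int Value;
  IntOpt(const std::string &Name, int Default) : Value(Default) {
    // For a global object this runs during static initialization,
    // i.e. before main(): this is the "backdoor".
    registry()[Name] = this;
  }
  void parse(const std::string &Text) override {
    Value = std::atoi(Text.c_str());
  }
};

// A global defined "deep" in library code (hypothetical option name).
static IntOpt DemoThreshold("demo-threshold", 150);

// The parser can reject unknown names and produce typed values immediately,
// because every option already registered itself.
bool parseOption(const std::string &Name, const std::string &Text) {
  auto It = registry().find(Name);
  if (It == registry().end())
    return false; // "unknown command line option"
  It->second->parse(Text);
  return true;
}
```

With this in place, parseOption("demo-threshold", "42") succeeds and
DemoThreshold.Value is typed from then on, while a name that no global
registered is rejected. Take away the global and both properties go with it.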

Hence my suggestion of just putting a stringly typed key-value store (or
whatever) in an easily accessible place (like LLVMContext), and just
translating any unrecognized command line options (ones that are not for
#1) into that stringly typed storage.
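
A rough sketch of what I mean, under loud assumptions: TweakStore,
routeArgs, isToolOption, scalarizerThreshold and the key
"scalarizer-threshold" are hypothetical names, the "-key=value" syntax is
just the simplest thing to parse, and the store is a free-standing object
here rather than something hanging off LLVMContext:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stringly typed key-value store for c/t points.
class TweakStore {
  std::map<std::string, std::string> Values;

public:
  void set(const std::string &Key, const std::string &Value) {
    Values[Key] = Value;
  }

  // The single call made at the point of use. Only here do we learn the
  // expected type and default, so only here does the string become typed.
  template <typename T> T get(const std::string &Key, T Default) const {
    auto It = Values.find(Key);
    if (It == Values.end())
      return Default;
    std::istringstream In(It->second);
    T Parsed{};
    if (!(In >> Parsed))
      return Default; // unparsable text falls back to the default
    return Parsed;
  }
};

// Options of sort #1 that the tool itself understands (assumed set).
static bool isToolOption(const std::string &Name) { return Name == "o"; }

// Anything of the form -key=value that the tool does not recognize is
// translated into the store; everything else is returned to the tool.
std::vector<std::string> routeArgs(const std::vector<std::string> &Args,
                                   TweakStore &Store) {
  std::vector<std::string> ToolArgs;
  for (const std::string &Arg : Args) {
    std::size_t Eq = Arg.find('=');
    if (Arg.size() < 2 || Arg[0] != '-' || Eq == std::string::npos) {
      ToolArgs.push_back(Arg);
      continue;
    }
    std::string Name = Arg.substr(1, Eq - 1);
    if (isToolOption(Name))
      ToolArgs.push_back(Arg);
    else
      Store.set(Name, Arg.substr(Eq + 1));
  }
  return ToolArgs;
}

// "Deep" library code: one localized call, no global, no static initializer.
unsigned scalarizerThreshold(const TweakStore &Store) {
  return Store.get<unsigned>("scalarizer-threshold", 8);
}
```

Note the cost mentioned at the top of this mail: a misspelled key just sits
in the store unread, since nothing at parse time knows which keys will ever
be asked for.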

I agree with Rafael that "constructor arguments to passes" are not c/t
points. In the future, there might be some way to integrate the two (from
the referenced post, you can probably tell that I kind of like the idea of
doing so), but for now, I think the clear incremental step is to attack #2
and solve it without llvm::cl. I have suggested a way to do this that I
think makes sense.

-- Sean Silva

On Mon, Aug 18, 2014 at 11:49 AM, Chris Bieneman <beanz at apple.com> wrote:

> Today command line arguments in LLVM are global variables. An example
> argument from Scalarizer.cpp is:
>
> static cl::opt<bool> ScalarizeLoadStore
>   ("scalarize-load-store", cl::Hidden, cl::init(false),
>    cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>
>
> This poses a problem for clients of LLVM that aren’t traditional compilers
> (e.g. WebKit and Mesa). My proposal is to take a phased approach to
> addressing this issue.
>
> The first phase is to move the ownership of command line options to a
> singleton, OptionRegistry. The OptionRegistry can be made to work with the
> existing global command line definitions so that the changes to migrate
> options can be done in small batches. The primary purpose of this change is
> to move the ownership of the command line options out of the global scope,
> and to provide a vehicle for threading them through the compiler. At the
> completion of this phase, all the command line arguments will be
> constructed during LLVM initialization and registered under the
> OptionRegistry. This will replace the hundreds of statically initialized
> cl::opt objects with a single statically initialized OptionRegistry.
>
> With this change options can be constructed during initialization. For the
> example option above the pass initialization would get a line like:
>
> cl::OptionRegistry::CreateOption<bool>("ScalarizeLoadStore",
>   "scalarize-load-store", cl::Hidden, cl::init(false),
>   cl::desc("Allow the scalarizer pass to scalarize loads and store"));
>
>
> Also the pass would add a boolean member to store the value of the option
> which would be initialized in the pass’s constructor like this:
>
> ScalarizeLoadStore =
>   cl::OptionRegistry::GetValue<bool>("ScalarizeLoadStore");
>
>
> These two operations need to occur at separate times due to object
> lifespans. At the time when command lines are parsed, passes have been
> initialized but not constructed. That means making options live in passes
> doesn’t really work, but since we want the data to be part of the pass we
> need to initialize it during construction.
>
> A large part of this phase will be finding appropriate places for all the
> command line options to be initialized, and identifying all the places
> where the option data will need to be threaded through the compiler. One of
> the goals here is to get rid of all global state in the compiler to
> (eventually) enable better multi-threading by clients like WebKit.
>
> The second phase is to split the OptionRegistry into two pieces. The first
> piece is the parsing logic, and the second piece is the Option data store.
> The main goal of this phase is to make the OptionRegistry represent
> everything needed to parse command line options and to define a new second
> object, OptionStore, that is populated with values by parsing the command
> line. The OptionRegistry will be responsible for initializing “blank”
> option stores which can then be populated by either the command line
> parser, or API calls.
>
> The OptionRegistry should remain a singleton so that the parsing logic for
> all options remains universally available. This is required to continue
> supporting plugin loadable options.
>
> The OptionStore should be created when a command line is parsed, or by an
> API call in libraries, and can be passed through the pass manager and
> targets to populate option data. The OptionStore should have a lifetime
> independent of contexts and pass managers because it can be re-used
> indiscriminately.
>
> The core principle in this design is that the objects involved in parsing
> options only need to exist once, but you need a full list of all options in
> order to parse a command line. You should be able to have multiple copies
> of the actual stored option data. Having multiple copies of the data store
> is one step toward enabling two instances of LLVM in the same process to
> use optimization passes with different options.
>
> I haven’t come up with a specific implementation proposal for this yet,
> but I do have some rough ideas. The basic flow that I’m thinking of is for
> command line parsing to create an object that maps option names to their
> values without any of the parsing data involved. This would allow for
> either parsing multiple command lines, or generally just constructing
> multiple option data stores. **Here is where things get foggy because I
> haven’t yet looked too deep.** Once you construct a data store it will get
> passed into the pass manager (and everywhere else that needs it), and it
> will be used to initialize all the option values.
>
> There has been discussion about making the option store reside within the
> context, but this doesn’t feel right because the biggest consumer of option
> data is the passes, and you can use a single pass manager with multiple
> contexts.
>
> Thanks,
> -Chris