[LLVMdev] [RFC] Removing static initializers for command line options

Tue Aug 26 10:41:55 PDT 2014

A bunch of IRC discussions etc have taken place, and I wanted to surface
some of the points from them here. Also, I wanted to give a more fully
thought-through set of my own perspectives.

First, I'd like to re-state the goals as I think those have gotten muddied.

1) Remove most of LLVM's static initializers (that is, dynamic
initialization code for static storage duration objects) by moving away
from global objects with non-trivial constructors of 'llvm::cl::opt' types
in the LLVM *libraries* (they remain fine in the tools of course).

2) Support separate LLVM users in a single address space having different
values for the debugging knobs currently managed through the
'llvm::cl::opt' command line flags mechanism.

3) (related to #2) Support some library API for setting the debugging knobs
currently managed through the 'llvm::cl::opt' command line flags mechanism.

Now, I want to change my position on one point because after talking to
Chris, Jim, and others I understand the problem significantly better. I am
now convinced that it is impossible to fix even #1 without making a change
for every single option in LLVM. Why? Because we have to *register* all of
the flags, and this registration inherently cannot be done from a global's
constructor without introducing the undesirable static initializers[1].
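To make the registration problem concrete, here is a minimal sketch contrasting the two styles. Every name in it (OptionInfo, registry, SelfRegisteringOption, initializeFooOptions, the flag names) is made up for illustration and is not an existing LLVM API; it only assumes that registration means appending to some global table.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical option description: plain data, no constructor.
struct OptionInfo {
  const char *Name;
  const char *Help;
};

// Hypothetical registry, built on first use rather than at load time.
inline std::vector<OptionInfo> &registry() {
  static std::vector<OptionInfo> R;
  return R;
}

// Style A: the constructor registers the option. Any global of this type needs
// dynamic initialization, i.e. a static initializer that runs before main().
struct SelfRegisteringOption {
  SelfRegisteringOption(const char *Name, const char *Help) {
    registry().push_back({Name, Help});
  }
};
static SelfRegisteringOption LegacyFlag("legacy-flag", "Registered at load time");

// Style B: the description is constant-initialized data; registration happens
// only when the library's explicit init function is called.
constexpr OptionInfo FooOptions[] = {
    {"enable-foo", "Made-up flag for illustration"},
};

inline void initializeFooOptions() {
  for (const OptionInfo &O : FooOptions)
    registry().push_back(O);
}
```

In style B nothing runs at load time, but somebody has to call initializeFooOptions(), which is why every option definition ends up being touched.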

I also think that in order to address #2 we would need to make some change
to essentially every option in LLVM as we would have to connect it to
non-global storage.

Given that, I think it does make sense to try to address #1 and #2 at the
same time -- trying to limit the change to just one or the other would very
likely result in a significantly different API for creating, registering,
and using these options, and I would rather not make all of the LLVM
developers re-learn how to do this twice if we can instead do it once.

However, doing #2 only really makes sense if we also do #3 to let people
*use* this new freedom. =] Owen convinced me that there really is a good
reason to support #3 as part of the LLVM libraries; however, I wanted to
raise my primary concern about this on the mailing list because it hasn't
really been discussed.

These "debugging knobs" or library level options which *aren't* exposed in
the library's APIs are very risky to expose in a side-channel API: they are
often completely unsupported or unsupportable options that were never
intended for "production" usage. Once we expose these options via some API
to library users, we run a very serious risk of getting locked into
supporting behaviors or debugging misbehavior that were never expected or
planned for...

However, the debugging use case is real and important. So I want to propose
that we guard many of the paths to setting these options with NDEBUG so
that in non-asserts builds, library users can't override these options.
This would mean the C++ API for setting this in LLVMContext (and the C API
if we ever expose it there) would only support setting these in asserts
builds. We could even sink the checking into the registry. The name should
probably have "debug" in it somewhere. It would then have some
InternalUnsupportedScaryBad API which is not actually in the header files
but is used in LLVM's tools to maintain their current behavior without
documenting a public API. It's not bulletproof, but it seemed that it would
be sufficient for the debugging use cases and is a strong statement of
"unsupported" IMO.
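As a rough sketch of what the NDEBUG guard could look like: DemoContext and setDebugOption below are hypothetical stand-ins, not members of LLVMContext, and the string-keyed map is only a placeholder for whatever store we end up with.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical context type used only for this sketch.
class DemoContext {
  std::map<std::string, std::string> DebugOptions;

public:
  // Returns true if the value was stored. In non-asserts builds (NDEBUG
  // defined) the override is ignored and false is returned.
  bool setDebugOption(const std::string &Name, const std::string &Value) {
#ifndef NDEBUG
    DebugOptions[Name] = Value;
    return true;
#else
    (void)Name;
    (void)Value;
    return false;
#endif
  }

  // Returns the stored string, or nullptr if the option was never set.
  const std::string *getDebugOption(const std::string &Name) const {
    auto It = DebugOptions.find(Name);
    return It == DebugOptions.end() ? nullptr : &It->second;
  }
};
```

Whether a rejected override should be silent, return an error, or abort is an open design question; the sketch just returns false.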

The only other observation I would like to make is that if we're going to
do this, we really should take the time to try to sort out a really *good*
API. Hopefully we'll be slightly less wrong this time, and maybe the next
transition can be more incremental.

Ok, on to the more low-level parts of this email. Here are some specific
thoughts on how to design the API, for further discussion:

- I really like the idea of using a 'void*' derived from some static tag as
a way to identify uniquely each option. I also think we can use some
minimal "reflection" techniques to significantly simplify this.

- The need for a global registry is... distasteful, but I agree it is inevitable.

- I'd like to avoid a separate "option store". My current suggestion is to
use the LLVMContext, and to add a context to the layers which don't have an
LLVMContext and where these kinds of options make sense.

- I would actually use a simple 'void*' -> string mapping in the "store" of
the context. (Maybe we can actually use StringMap?) I would still provide
the type to the registry and use that to validate options when parsing.
When getting the value out of the "store" in the context, we can parse it
into the data-type and assert fail on parse errors. I don't feel strongly
about this, but without "std::any" it seems simpler than rolling our own
with a void* or a union.

- I would provide a simple API on the LLVMContext to get an "option" string
for a given key. This API could be exposed by any context, and we can write
the option value extraction library to accept a generic T which supports
such a query method. That way there is *very* loose coupling between these
and it is easy to support any context object at any layer.

- The registry should essentially provide an interface to parse an argv
string list into such a map from 'void*' -> string. The LLVMContext (or any
other context) can then be populated by the result of this parse. Each
library can parse whatever argv it wants.

- I don't think we should keep piling on the interface design of the
existing cl::opt stuff. I would make a clean break to a new header /
library, and just build some code to map these into the cl::opt parsing
code from the registry (we already do stuff like this for pass names). This
will also help the two systems co-exist but stay separate during the
(inevitably long) transition period.

- As a specific example of what I would change from the cl::opt stuff: I
think we should make the required arguments just be required and
positional, and the optional arguments use a fluent-style interface (yea, I
know, but I think it works well here) rather than the weird variadic-ish
thing...
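To make the bullets above a bit more concrete, here is a minimal sketch of how the pieces might fit together: a key derived from the address of a static object, a 'void*' -> string store in a context, a generic getter that works with any context exposing one query method, a registry that parses arguments into the store, and positional required arguments with fluent optional ones. All of the names (Option, OptionKey, DemoContext, Registry, getOption, parseValue, "demo-threshold") are hypothetical, std::map stands in for whatever container we would really use, and the type validation in the registry is omitted for brevity.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>
#include <utility>
#include <vector>

// An option is identified by the address of its static description object.
using OptionKey = const void *;

// Hypothetical option description (not an LLVM class). Required arguments are
// positional; optional ones use fluent setters that return a modified copy.
template <typename T> class Option {
  const char *Name;
  T Default;
  const char *Help;
  constexpr Option(const char *N, T D, const char *H)
      : Name(N), Default(D), Help(H) {}

public:
  constexpr Option(const char *N, T D) : Name(N), Default(D), Help("") {}
  constexpr Option help(const char *H) const { return Option(Name, Default, H); }
  constexpr const char *name() const { return Name; }
  constexpr const char *helpText() const { return Help; }
  constexpr T defaultValue() const { return Default; }
  OptionKey key() const { return this; }
};

// Constant-initialized: no dynamic initializer runs for this global.
constexpr Option<unsigned> DemoThreshold =
    Option<unsigned>("demo-threshold", 100u).help("A made-up tuning knob");

// Hypothetical context: a plain key -> string store.
class DemoContext {
  std::map<OptionKey, std::string> Store;

public:
  void setOptionString(OptionKey K, std::string V) { Store[K] = std::move(V); }
  const std::string *getOptionString(OptionKey K) const {
    auto It = Store.find(K);
    return It == Store.end() ? nullptr : &It->second;
  }
};

// Parse a decimal unsigned value; returns false on malformed input.
inline bool parseValue(const std::string &S, unsigned &Out) {
  if (S.empty() || S.find_first_not_of("0123456789") != std::string::npos)
    return false;
  Out = static_cast<unsigned>(std::strtoul(S.c_str(), nullptr, 10));
  return true;
}

// Works with any context type that exposes getOptionString(OptionKey).
template <typename T, typename ContextT>
T getOption(const ContextT &Ctx, const Option<T> &Opt) {
  const std::string *S = Ctx.getOptionString(Opt.key());
  if (!S)
    return Opt.defaultValue();
  T Value{};
  bool Ok = parseValue(*S, Value);
  assert(Ok && "stored option string failed to parse");
  (void)Ok;
  return Value;
}

// Hypothetical registry: maps flag names to keys and parses "-name=value" args.
class Registry {
  std::map<std::string, OptionKey> ByName;

public:
  template <typename T> void add(const Option<T> &Opt) {
    ByName[Opt.name()] = Opt.key();
  }
  // Returns false on the first unknown or malformed argument.
  bool parse(const std::vector<std::string> &Args, DemoContext &Ctx) const {
    for (const std::string &Arg : Args) {
      std::string::size_type Eq = Arg.find('=');
      if (Arg.size() < 2 || Arg[0] != '-' || Eq == std::string::npos)
        return false;
      auto It = ByName.find(Arg.substr(1, Eq - 1));
      if (It == ByName.end())
        return false;
      Ctx.setOptionString(It->second, Arg.substr(Eq + 1));
    }
    return true;
  }
};
```

With this shape, two contexts in one address space can hold different values for the same option: parsing "-demo-threshold=7" into one DemoContext leaves getOption on a second, untouched context returning the default. The explicit Registry::add call plays the role of the registration step that a global constructor can no longer perform.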

Once there is a patch that seems like a reasonable start at this, I'd
actually like to get the new library added and just one pass (the one
you're using seems great) converted. I'd then like to try a few experiments
to see if we can improve some of the nitty-gritty details of the API before
we roll it out widely.

Seem reasonable? I'm actually happy to help with some of the API
experiments and the rollout here.
-Chandler

[1]: I'd actually like to understand why on some platforms the explicit
global initialization routines are so much better than global
constructors... But that's probably a topic for an off-list or IRC
conversation. I don't want to derail this thread with that.