[LLVMdev] [RFC] Internal command line options should not be statically initialized.

Thu Sep 19 13:00:41 PDT 2013

On Sep 19, 2013, at 10:34 AM, Chandler Carruth <chandlerc at google.com> wrote:

> On Wed, Sep 18, 2013 at 9:06 PM, Andrew Trick <atrick at apple.com> wrote:
> 
> On Sep 18, 2013, at 8:58 AM, Chris Lattner <clattner at apple.com> wrote:
> 
>> On Sep 17, 2013, at 10:10 AM, Andrew Trick <atrick at apple.com> wrote:
>>> LLVM's internal command line library needs to evolve. We have an immediate need to build LLVM as a library free of static initializers, but before brute-force fixing this problem, I'd like to outline the incremental steps that will lead to a desirable long term solution. We want infrastructure in place to provide an evolutionary path.
>> 
>> Thank you for tackling this, we should have fixed this years ago.
>> 
>> Please do a pass over the cl::opts we have, and remove ones that are long dead or unused.   Do we still need -join-liveintervals? :-)
>> 
>> 
>> On Sep 17, 2013, at 12:03 PM, Daniel Dunbar <daniel at zuster.org> wrote:
>>> On Tue, Sep 17, 2013 at 11:29 AM, Reid Kleckner <rnk at google.com> wrote:
>>> Wait, I have a terrible idea.  Why don't we roll our own .init_array style appending section?  I think we can make this work for all toolchains we support.
>>> 
>>> Andy and I talked about this, but I don't think it's worth it. My opinion is:
>>> 1. For tool options (the top-level llc, opt, llvm-as etc. opts) it doesn't matter.
>>> 2. For experimental options (options that we would be happy if they were compiled out of a production compiler/JIT client/whatever), it doesn't matter.
>>> 3. For backend options that need to always be available, lots of them probably already need to get promoted to real API.
>>> 4. For the remaining options (ones that don't need to become API, but also aren't purely experimental), many of them can probably easily be initialized by some existing initialization hook (pass initialization, target initialization).
>>> 5. There aren't enough options left not in those categories to motivate some kind of clever solution.
>> 
>> I think that this is a great summary of the problem.  Having cl::opt's compiled *out* of non-assert build by default makes a lot of sense to me, and having tool options use toolopt<> (or something) also makes perfect sense.
>> 
>> If you're going to go and tackle pass-specific options, I think that we should consider changing the syntax and overall design of the command line options.   We already have some manual name mangling/namespacification of options (e.g. -tail-dup-limit=). Perhaps we should formalize this somehow?
> 
> Obviously, based on the 18 responses I've gotten, the tone of my first email was misleading.
> 
> I don’t want to stifle discussion, but to be clear, the only thing I propose to tackle immediately is the removal of static initializers from libraries. There are several isolated issues that Filip has found good workarounds for. cl::opt is the one pervasive problem that can't be weeded out one case at a time.
> 
> The purpose of posting an RFC and opening up discussion was to find out from people who have already thought about this, how the ideal cl::opt framework should work. I won't be making that happen, rather I'll make sure that the changes we make don't get in the way of future progress.
> 
> I would certainly love to see LLVM internal options be reorganized and help however I can, but I'll be very sad if that holds up removing static initializers.
> 
> =/ I think we should actually implement the right long-term design rather than something short term.

I’d like to send out another proposal for redesigning internal options—I have a much clearer idea of what I want now—but that discussion will revolve around user interface and API design and dwarf the three problems you mention below. Hence the specific subject line of this thread.

> Anyways, I feel like there are (at least) three possible problems you want to solve here, and I'd like to understand which (maybe all) you're actually trying to solve, and which ones seem most important:
> 
> 1) threads still alive during program termination reading from flags that are being destroyed along with all globals
> 
> 2) initialization ordering issues between flags in different translation units
> 
> 3) the existence of (non-zero-initializing) static initializers at all
> 
> 
> For me, #1 and #2 are things I care a lot about and would be happy to see solved. But #3 doesn't seem necessary or even desirable. We have a lot of registration patterns in LLVM that make working with it very simple and easy. It's not clear why we would want to preclude this, or re-invent the mechanisms that already exist to automatically trigger static initialization with the arbitrary fan-out of 'initializeFoo' global functions. So if #3 is really an important goal, I'm curious about the why. =] This is especially relevant as it impacts all of the work I'm starting to do on the pass management and registration system.

There’s no question that any new API must provide a solution to #1 and #2.

Interesting that you don’t agree with #3. I assumed it was conventional wisdom by now that C++ static initializers are pure evil.

This is the real issue: #3 is a strict requirement for some libraries (which Filip just explained). Globals don’t need to be zero initialized; they can be initialized to any constant data, as long as they don't require static initialization code to run when the library is dynamically linked. If we don’t provide a way to build LLVM libraries without static initializers, then those libraries cannot link against any LLVM libraries, period.
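To make the distinction concrete, here is a minimal sketch of a constant-initialized global next to one that would need load-time code. The names OptionDesc, TailDupLimit, defaultFor, and tailDupLimitHelp, and the default value 2, are hypothetical and chosen only for illustration; this is not how cl::opt is implemented.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical option record (an assumption for illustration, not cl::opt).
struct OptionDesc {
  const char *Name;
  int DefaultValue;
};

// Constant initialization: the object is built from constant expressions, so
// no code has to run at load time. constexpr makes the compiler reject any
// initializer that would need to run code.
constexpr OptionDesc TailDupLimit = {"tail-dup-limit", 2};
static_assert(TailDupLimit.DefaultValue == 2, "evaluated at compile time");

// By contrast, a namespace-scope object with a non-trivial constructor, e.g.
//   static std::string Help = "Maximum instructions to tail-duplicate";
// needs a static initializer that runs when the library is loaded.

// Deferred alternative: the string is constructed on the first call, so the
// client controls when (and whether) it pays for the construction.
const std::string &tailDupLimitHelp() {
  static const std::string Help = "Maximum instructions to tail-duplicate";
  return Help;
}

// Look up a default by name using only the constant-initialized data.
// Returns -1 for names this sketch does not know about.
int defaultFor(const char *Name) {
  assert(Name && "option name must not be null");
  return std::strcmp(Name, TailDupLimit.Name) == 0 ? TailDupLimit.DefaultValue
                                                   : -1;
}
```

A client that never calls tailDupLimitHelp() never constructs the string, while TailDupLimit is usable immediately without any initialization code having run.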

At least one of the reasons for this requirement is that the overhead of running the static initializers is non-negligible. Deferring initialization allows the tool to control when it pays for the overhead. It may turn out to be unnecessary, or concurrency may hide it at some point. But why this requirement exists is somewhat beside the point given that:
- it prevents LLVM adoption
- fixing the problem will only benefit other tools that don’t have such a strict requirement

The problem is really ridiculous when you think about it. We currently prevent LLVM from being linked against non-command-line-based tools in order to support a feature that is obviously useless for those tools.

In the very short term (this week), I’d like to provide an answer. I think it will be inevitable that we need a temporary build flag -DLLVM_NO_STATICINIT; the only question is how much we rely on that flag. Some people would prefer to go straight to making the non-asserts build work this way instead. I can see both sides and don’t have a strong opinion either way.
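As a rough sketch of what such a flag could gate, assuming a hypothetical registry and an explicit initializeOptions() hook (neither is an existing LLVM API, and only the macro name LLVM_NO_STATICINIT comes from this thread):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry. It is a function-local static, so it is constructed
// on first use rather than by a static initializer.
std::vector<std::string> &optionRegistry() {
  static std::vector<std::string> Registry;
  return Registry;
}

void registerOption(const char *Name) {
  assert(Name && "option name must not be null");
  optionRegistry().push_back(Name);
}

#ifndef LLVM_NO_STATICINIT
// Default build: the option registers itself from a static initializer,
// which runs at load time before main().
struct AutoRegister {
  explicit AutoRegister(const char *Name) { registerOption(Name); }
};
static AutoRegister RegisterTailDupLimit("tail-dup-limit");

// Nothing left to do; kept so clients can call it in either configuration.
void initializeOptions() {}
#else
// -DLLVM_NO_STATICINIT build: no load-time code. The client calls this hook
// when it chooses to; the local static makes repeated calls register once.
void initializeOptions() {
  static const bool Registered = (registerOption("tail-dup-limit"), true);
  (void)Registered;
}
#endif
```

In both configurations, a client that calls initializeOptions() ends up with the option registered exactly once; the only difference is whether that work happened at load time or at a point the client picked.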

-Andy

> As a somewhat separable point, I completely agree that every flag which any frontend actually needs to control for correct functionality should be moved from flags to an actual, proper interface as global flags just don't work for a library. Essentially, they should be "debugging" tools or "developer" tools, not actual interfaces. This isn't true today, and the most egregious cases are the emission of debug information. All of that is controlled through global flags, which causes lots of problems today for our library users of LLVM.
> 
> However, I don't think the flags should only be present in !NDEBUG builds. I think it's reasonable for developers to debug problems with released binaries by causing these flags to be toggled using '-mllvm' or related tools in the frontends to manually parse flags, or by 'opt' automatically handing the flag parsing down to this layer.
