[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

Mon Nov 17 10:17:41 PST 2014

On Fri, Nov 14, 2014 at 3:24 PM, Pete Cooper <peter_cooper at apple.com> wrote:

>
> On Nov 14, 2014, at 2:57 PM, Chris Bieneman <cbieneman at apple.com> wrote:
>
> There are parts of this proposal that I really like, and there are others
> that I think are actually in opposition to the work we’re trying to do with
> cl::opt.
>
> On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com>
> wrote:
>
> +chrisb
>
> On 2014-Nov-13, at 16:33, Akira Hatanaka <ahatanak at gmail.com> wrote:
>
> I'm working on fixing PR21471, which is about embedding codegen command
> line options into the bitcode as function or module-level attributes so
> that they don't get ignored when doing LTO.
>
> http://llvm.org/bugs/show_bug.cgi?id=21471
>
> I have an initial patch (attached to this email) which enables clang/llvm
> to recognize one command line option, write it to the IR, and read it out
> in a backend pass. I'm looking to get feedback from the community on
> whether I'm headed in the right direction or whether there are alternate
> ideas before I go all the way on fixing the PR. Specifically, I'd like to
> know the answers to the following questions:
>
> 1. How do we make sure we continue to be able to use the command line
> options we've been using for llc and other tools?
>
>
> In discussions about the new cl::opt API I believe the general idea was
> that most of the options expressed using cl::opt are actually only relevant
> as debug options, so I think one big part of this work is really going to
> be identifying a subset of the current options which are actually relevant
> to expose in the IR.
>
> Ideally this will be a small set as it could get expensive to represent
> otherwise.  I’ll get to why later.
>
> So with LTO we already have issues where modules have different metadata.
> I’m not sure, but we might also have issues with weak functions and
> different attribute sets.
>
> We need to work out whether the option a function was compiled with is
> interesting enough to be stored on that function, even if it’s the default.
> For example, let’s say I tag a function with ‘loop-unroll-threshold=100’. I
> would expect that to override the one given on the command line, but
> perhaps others would want the command line to always win.
>
>
There is a text file attached to PR21471 which lists the command line
options that are generated by clang and are necessary for code generation
as of r217366. Many of them are relevant as losing them during LTO can
result in incorrect code generation.

> Then there’s the issue of whether a default is interesting or not.  For
> example, the default loop unroll threshold is 150.  We probably want to tag
> all functions with that threshold, as how do we know that the default will
> stay the same in a later LLVM?  Or you could save the fact that something
> is a default.  So for example, store ‘unroll-threshold=default(150)’ as
> then you can either:
> - Always choose 150 for this function, because that’s what it was tagged
> with
> - Always choose the default, so if we change ToT to default 200, you
> choose 200.
>
> Now if you have to store all ‘interesting’ options, the set of things you
> store could start to get quite large quite quickly.
>
>
I didn't think about storing the default values, but if the set of
"interesting options" is small, we can still store them in the bitcode.
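
To make the ‘unroll-threshold=default(150)’ idea above concrete, here is a minimal sketch of how such a value might be decoded and resolved under the two policies Pete lists. It is standalone C++ and does not use any LLVM API; `TaggedOption`, `parseTaggedOption`, `DefaultPolicy` and `resolveThreshold` are hypothetical names invented for illustration, and the `default(N)` spelling is just the one used in the example above.

```cpp
#include <string>

// Hypothetical decoded form of an attribute value: "150" or "default(150)".
struct TaggedOption {
  unsigned Value;   // the number recorded in the IR
  bool WasDefault;  // true if it was recorded as default(N)
};

// Parse "N" or "default(N)" into Out. Returns false on any other input.
bool parseTaggedOption(const std::string &Text, TaggedOption &Out) {
  const std::string Prefix = "default(";
  bool WasDefault = false;
  std::string Digits = Text;
  if (Text.size() > Prefix.size() + 1 &&
      Text.compare(0, Prefix.size(), Prefix) == 0 && Text.back() == ')') {
    WasDefault = true;
    Digits = Text.substr(Prefix.size(), Text.size() - Prefix.size() - 1);
  }
  // Reject empty input and cap the length so the value cannot overflow.
  if (Digits.empty() || Digits.size() > 9)
    return false;
  unsigned Value = 0;
  for (char C : Digits) {
    if (C < '0' || C > '9')
      return false;
    Value = Value * 10 + static_cast<unsigned>(C - '0');
  }
  Out = TaggedOption{Value, WasDefault};
  return true;
}

// The two policies from the list above.
enum class DefaultPolicy {
  KeepRecorded, // always use the value the function was tagged with
  UseCurrent    // if the tag was a default, use today's default instead
};

unsigned resolveThreshold(const TaggedOption &Opt, unsigned CurrentDefault,
                          DefaultPolicy Policy) {
  if (Opt.WasDefault && Policy == DefaultPolicy::UseCurrent)
    return CurrentDefault;
  return Opt.Value;
}
```

With this shape, a function tagged ‘default(150)’ and linked by a later LLVM whose default is 200 resolves to 150 under `KeepRecorded` and to 200 under `UseCurrent`, while an explicit ‘100’ resolves to 100 under either policy.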

> 2. How to handle cases where two functions in a module have different sets
> of command line options?
>
> I would store them in the attributes set.  I don’t think there’s anywhere
> else you can do this right now.  Attributes or metadata.  I would say
> attributes because then you can make them “option”=“value” (i.e.,
> StringAttribute) and you don’t need to worry about anyone knowing about the
> names or not.
>
>
> Today I don’t believe we have this ability.
>
> 3. Where should the command line options or module/function attributes be
> stored once they are read out from the IR?
>
>
> My suggestion would be the OptionStore that I proposed here:
> http://reviews.llvm.org/D6207
>
> I don’t think the OptionStore will work either, unless you put an
> OptionStore on the Function which I don’t think will be feasible.
>
> Instead I think you need either the function pass manager (perhaps
> LPPassManager, BB PM, etc too) to parse the options from the function once
> before it runs all the passes.  Or, for each pass with an option you care
> about, move the storage of that option itself to the pass.  This is similar
> to what Chris is doing with static initializers.  So I can imagine pass
> initialization looking something like
>
> class LoopUnrollPass {
>   unsigned Threshold = ...
>   doInit… {
>     // An explicit command line value overrides the built-in default.
>     if (cl opt has value)
>       Threshold = ‘cl opt value’
>     // A per-function attribute overrides both.
>     if (function.getAttribute(‘unroll-threshold’))
>       Threshold = function.getAttributeAsInteger(‘unroll-threshold’)
>   }
> }
>
> For performance reasons, I would actually add a new type of Attribute for
> a string key and integer value as then you don’t actually need to do any
> parsing in the new function.getAttributeAsInteger function I introduced
> here.
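
The precedence in the pseudocode above, together with the string-key/integer-value attribute idea, can be sketched in standalone C++ as follows. This is only an illustration under stated assumptions: `FunctionStub` and `CommandLineValue` are hypothetical stand-ins for llvm::Function and a cl::opt<unsigned>, not real LLVM types, and `resolveUnrollThreshold` is an invented helper.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for llvm::Function. It only models attributes that
// map a string key to an integer value, so no string parsing is needed.
struct FunctionStub {
  std::map<std::string, unsigned> IntAttrs;
};

// Hypothetical stand-in for a cl::opt<unsigned>: a value plus a flag saying
// whether the user actually passed the option on the command line.
struct CommandLineValue {
  bool WasSet;
  unsigned Value;
};

// Resolution order used in the pseudocode above:
//   built-in default < command line value < per-function attribute.
unsigned resolveUnrollThreshold(const FunctionStub &F,
                                const CommandLineValue &Opt,
                                unsigned BuiltinDefault) {
  unsigned Threshold = BuiltinDefault;
  if (Opt.WasSet)
    Threshold = Opt.Value;
  auto It = F.IntAttrs.find("unroll-threshold");
  if (It != F.IntAttrs.end())
    Threshold = It->second;
  return Threshold;
}
```

Two functions in the same module can then carry different values simply by having different entries in their attribute maps. If the command line should always win instead, the two `if` blocks would be swapped; which order is right is exactly the open question raised earlier in the thread.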
>
> Thanks,
> Pete
>
>
>
> The short description of the approach I took in my patch is that command
> line options that are important to codegen are collected by
> cl::ParseCommandLineOptions, written to the bitcode as function or module
> attributes, and read out directly by the optimization passes that need
> them. cl::opt options are replaced with CodeGenOpt options (subclass of
> cl::opt), which are needed only to parse the command line and provide the
> default value when the corresponding options are not in the bitcode.
>
>
> I like this approach, since it means the frontend doesn't have to
> understand options in order to pass them on to the backend.
>
> The static variables should be straightforward to migrate to an LLVMContext
> once ParseCommandLineOptions stores things there instead of in globals.
>
>
> I also think that the OptionStore in conjunction with the OptionRegistry
> (rather than any of the cl APIs) should have all the parsing code. In fact,
> you shouldn’t have to call ParseCommandLineOptions, we could make encoding
> and decoding the stored options associated with a module part of loading
> and storing the module.
>
>
> diff --git a/lib/CodeGen/CodeGenOption.cpp b/lib/CodeGen/CodeGenOption.cpp
> new file mode 100644
> index 0000000..2d74c2f
> --- /dev/null
> +++ b/lib/CodeGen/CodeGenOption.cpp
> @@ -0,0 +1,59 @@
> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.           --*- C++ -*-===//
> +//
> +//                     The LLVM Compiler Infrastructure
> +//
> +// This file is distributed under the University of Illinois Open Source
> +// License. See LICENSE.TXT for details.
> +//
> +//===----------------------------------------------------------------------===//
> +//
> +//===----------------------------------------------------------------------===//
> +
> +#include "llvm/CodeGen/CodeGenOption.h"
> +#include "llvm/IR/Attributes.h"
> +#include "llvm/IR/Module.h"
> +
> +using namespace llvm;
> +
> +static std::map<std::string, bool> FunctionBoolAttrs;
> +static std::map<std::string, bool> ModuleBoolAttrs;
> +
>
>
> @Chris, should these be using ManagedStatic?
>
>
> I’d much rather they just weren’t static at all. Using globals to store
> state that inherently isn’t global just feels wrong.
>
> -Chris
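
One way to read the "not static at all" suggestion is to move the two maps from the patch into an object owned by whatever context is doing the compilation. The following is a minimal sketch under that assumption; `CodeGenAttrStore` and `ContextStub` are hypothetical names, `ContextStub` merely stands in for something like an LLVMContext, and the attribute name in the usage note is made up.

```cpp
#include <map>
#include <string>

// Hypothetical per-context replacement for the file-level static maps in the
// patch. Each instance holds its own state, so two contexts never share it.
class CodeGenAttrStore {
  std::map<std::string, bool> FunctionBoolAttrs;
  std::map<std::string, bool> ModuleBoolAttrs;

public:
  void setFunctionAttr(const std::string &Name, bool Value) {
    FunctionBoolAttrs[Name] = Value;
  }
  void setModuleAttr(const std::string &Name, bool Value) {
    ModuleBoolAttrs[Name] = Value;
  }

  // Return the recorded value, or Fallback if Name was never recorded.
  bool getFunctionAttr(const std::string &Name, bool Fallback) const {
    auto It = FunctionBoolAttrs.find(Name);
    return It == FunctionBoolAttrs.end() ? Fallback : It->second;
  }
  bool getModuleAttr(const std::string &Name, bool Fallback) const {
    auto It = ModuleBoolAttrs.find(Name);
    return It == ModuleBoolAttrs.end() ? Fallback : It->second;
  }
};

// Hypothetical stand-in for a compilation context that owns the store.
struct ContextStub {
  CodeGenAttrStore CodeGenAttrs;
};
```

With this layout, setting "example-flag" on one `ContextStub` leaves a second `ContextStub` untouched, which is the property the global maps cannot provide when several modules are compiled in one process.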
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev