[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

Wed Dec 3 11:39:23 PST 2014

On Tue, Dec 2, 2014 at 4:38 PM, Eric Christopher <echristo at gmail.com> wrote:

>
>
> On Tue Dec 02 2014 at 4:31:43 PM Akira Hatanaka <ahatanak at gmail.com>
> wrote:
>
>> On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <echristo at gmail.com>
>> wrote:
>>
>>> On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <bob.wilson at apple.com>
>>> wrote:
>>>
>>>> Thanks for your feedback, Eric.
>>>>
>>>> I still think we may be talking past each other a little bit, but
>>>> rather than
>>>>
>>>
>>> Might be, sorry if so :(
>>>
>>>
>>>> delving further into the details right now, I’ve suggested to Akira
>>>> that he look into how we should handle other kinds of options. I’m hoping
>>>> that as we look at more of them, we will gain some insight into the
>>>> approach that we want to take here. This patch really only deals with the
>>>> easy cases, and whether we use Akira’s approach or not, I’m not especially
>>>> worried about this kind of option.
>>>>
>>>> Let’s come back to this in a little while after we’ve looked at what to
>>>> do for some other options…..
>>>>
>>>>
>>> I apologize if this seems a bit rambly, but I'm trying to get a lot of
>>> thoughts on this down:
>>>
>>> FWIW I'm looking at some of this at the moment which is why I spoke to a
>>> few of the different options and needing to control TargetMachine creation.
>>> Just to continue with some of the discussion (at least to point some things
>>> the right direction and get feedback) let's think about the -target-abi
>>> option for ARM. It sets the ABI for the backend on the target machine. It
>>> probably shouldn't change on a per function basis (wait, except for the
>>> calling convention which can be changed via an attribute, fun!), so we want
>>> it to control global data layout and valid types which means we need it on
>>> the TargetMachine. The question is how to pass this information into the
>>> backend for such an option - right now, if we wanted, we could punt to
>>> using the command line options that currently exist and the existing
>>> ParseCommandLineOptions call from clang.
>>>
>>>
>> When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine
>> passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have
>> anything else we have to pass to the backend in this case?
>>
>
> Not yet, but the option can be a backend option rather than a subtarget
> feature - and I'm probably going to do that in the short term as it's no
> worse than the current status, see the next line here below :)
>
> -eric
>
>
>>
>>
>>> That said, it's a pretty awful hack to use them. For some of my
>>> particular needs I've got some code around unifying the DataLayout creation
>>> between the front end and back end, but it's a partial solution for the
>>> general problem and I'll probably run into more issues in this arena.
>>>
>>> So, trying to come up with ways to communicate all of this via the API
>>> so that we can set ABIs for targets that need it (ARM, MIPS, others), as
>>> well as probably other options (anything passed via -backend-option is a
>>> good candidate).
>>>
>>> One thought is expanding and/or deriving target specific versions of
>>> TargetOptions and populating that on creation from the front end. We'd want
>>> to be careful with the target specific bits and constructing defaults,
>>> probably something off of the bits in Targets.cpp would be appropriate. A
>>> lot of this type of code leads us down the TargetSupport library or
>>> something of the sort - classes to help describe or create backend
>>> constructs.
>>>
>>> Anyhow, this is my current thinking on how to do API building of
>>> TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds
>>> complicated in all of the same ways without any particular gain).
>>>
>>>

If we wanted to, in EmitAssemblyHelper::CreateTargetMachine, we could pass
module-level CodeGenOpts.BackendOptions (backend options that don't need to
change on a per-function basis) to Target::createTargetMachine. We can
either pass it directly to the function or add a new field in
llvm::TargetOptions that holds the options. If we can do that, we don't
have to call ParseCommandLineOptions from
EmitAssemblyHelper::CreateTargetMachine.

> -eric
>>>
>>>
>>>> On Dec 1, 2014, at 10:53 AM, Eric Christopher <echristo at gmail.com>
>>>> wrote:
>>>>
>>>> Hi Bob,
>>>>
>>>> I've been giving this some thought, comments inline:
>>>>
>>>> On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <bob.wilson at apple.com>
>>>> wrote:
>>>>
>>>>> On Nov 19, 2014, at 4:52 PM, Eric Christopher <echristo at gmail.com>
>>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <ahatanak at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <echristo at gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>> So my general concern here is that lots of command line options that
>>>>>>> don't need to be or shouldn't be IR level constructs become oddly named
>>>>>>> string options. It's bad enough that we're doing that with the target cpu
>>>>>>> and target feature string let alone the rest of it.
>>>>>>>
>>>>>>> If we're talking about things that end up changing the subtarget
>>>>>>> used we definitely need to make this part of the lookup key - at which
>>>>>>> point we're going to want a unified interface to all of them. I haven't
>>>>>>> applied your patch to see what the IR changes look like here.
>>>>>>>
>>>>>>>
>>>>>> With the patches applied, the attributes clang embeds in the IR look
>>>>>> like this:
>>>>>>
>>>>>> $ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9
>>>>>> -menable-no-nans
>>>>>>
>>>>>> attributes #0 = { nounwind readonly ssp "NoNaNsFPMath"
>>>>>> "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false"
>>>>>> "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"
>>>>>> "no-infs-fp-math"="false" "stack-protector-buffer-size"="8"
>>>>>> "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs"
>>>>>> "unsafe-fp-math"="false" "use-soft-float"="true" }
>>>>>>
>>>>>> I'm guessing you want the front-end to convert command line options
>>>>>> like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as
>>>>>> odd?
>>>>>>
>>>>>
>>>>> Right, that was part of the idea, the rest is that we define things
>>>>> that actually make sense as front end command line options and use those to
>>>>> communicate via IR constructs or api calls I.e. reserved registers don't
>>>>> need to be a specific back end option, but rather can be passed along when
>>>>> creating a target machine, etc.
>>>>>
>>>>> -eric
>>>>>
>>>>>
>>>>> No, reserved registers (I assume you’re referring to “arm-reserve-r9”
>>>>> in Akira’s example) don’t necessarily need to be backend options. But,
>>>>> those kinds of options do need to be recorded in the IR and the target
>>>>> machine needs to reflect the relevant IR settings. Otherwise, we’ll still
>>>>> be left with a broken LTO model. E.G., I build one LTO object with
>>>>> “arm-reserve-r9” and another without it, but the LTO compilation still
>>>>> needs to respect those options for the code in each object. The details of
>>>>> how a particular option should be handled will vary depending on the
>>>>> option, and we need to discuss each one separately.
>>>>>
>>>>>
>>>> ABI breaking options should probably be considered part of
>>>> TargetMachine/Subtarget creation and handled accordingly (if it's subtarget
>>>> creation it can be on the function as a feature string attribute, if it
>>>> affects global code generation it needs to be at TargetMachine creation
>>>> time).
>>>>
>>>>
>>>>> It’s pretty clear, though, that there is a class of backend options
>>>>> that translate nicely to function attributes. This RFC is about the
>>>>> mechanism that we use to handle those (without specifying which specific
>>>>> options will be handled that way).
>>>>>
>>>>>
>>>> Fair enough. For function attributes I can see a desire to have (as
>>>> Chris posted in an email from someone at synopsys) a generic target
>>>> specific dictionary mapping on functions for backend specific uses. I just
>>>> haven't seen anything in the thread that would be applicable for this
>>>> rather than an actual function attribute. That said, I would like some sort
>>>> of delineation for target function attributes as well. Thoughts on that?
>>>> Perhaps it could look like the string dictionary we've mentioned, maybe not.
>>>>
>>>>
>>>>>
>>>>>
>>>>>> From looking at the patch it seems that the command line option for
>>>>>>> NaNs should be a general attribute on the function rather than a
>>>>>>> TargetMachine option anyhow, so it seems that should be migrated down.
>>>>>>> Whether or not it is exposed as an llvm attribute or a string based key
>>>>>>> value pair is an interesting discussion. Also, would we want to, instead,
>>>>>>> expose the backend options as clang IR level options? Seems to make sense
>>>>>>> from that perspective and then they can add the appropriate attributes to
>>>>>>> the functions themselves. (-mlongcalls for the long calls option for
>>>>>>> example).
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>> -eric
>>>>>>>
>>>>>>
>>>>> Again, I think we can evaluate specific options on a case-by-case
>>>>> basis. For some of them, it might make sense to expose them as clang
>>>>> options. Like you, I had previously assumed we would teach the front-end
>>>>> about all the backend options. I was a little surprised by the approach
>>>>> Akira suggested here, but after giving it some thought, I think it’s a
>>>>> great idea. There are a LOT of target-specific backend options. Some of
>>>>> them do not need to be IR, but many do. Teaching clang about all of them
>>>>> would be a lot of work, and I don’t see any clear benefit from that. This
>>>>> approach lets clang remain relatively independent of the backend options.
>>>>> We still need to think carefully about which backend options use this
>>>>> mechanism, of course.
>>>>>
>>>>>
>>>> I'm unconvinced here...
>>>>
>>>>
>>>>> Are there other reasons why you think we should add all those options
>>>>> to clang? I just don’t see much to be gained by that.
>>>>>
>>>>>
>>>> .... I think that anything you'd like to expose should probably be a
>>>> front end option. If nothing else I don't see how you plan on annotating
>>>> the IR. Basically you're constructing a parallel option hierarchy and that
>>>> seems... redundant. I agree that it's nice for tools to be able to override
>>>> things in the backend - for testing if nothing else. We do this a lot by
>>>> checking options and values in the code. A way of streamlining this would
>>>> be nice - but I don't think that's the goal of these patches no?
>>>>
>>>>
>>>>> Do you have objections to the use of string-based key/value pairs? I
>>>>> believe the intention all along, going back to Bill’s work on this a few
>>>>> years ago, was that we would add new attributes for target-independent
>>>>> things and use strings for the target-specific options. That seems like a
>>>>> reasonable starting point to me.
>>>>>
>>>>>
>>>> Only some as I mentioned above - I haven't seen anything that needs to
>>>> be target specific in this way, but I'm open to the idea :)
>>>>
>>>> As a note, this is all for function level things. We need a solution
>>>> for non-function/module level attributes and I think that's going to look
>>>> like API calls into the backend for TargetMachine creation - there may be a
>>>> list of things that would turn into module level attributes, but that is a
>>>> direction fraught with danger and I'd like to avoid that as much as
>>>> possible.
>>>>
>>>> (As a note, the particular patch that Akira has is unacceptable because
>>>> of the changes to the subtarget lookup, but that's a separate issue :)
>>>>
>>>> -eric
>>>>
>>>>
>>>>>
>>>>>>> On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <ahatanak at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Updated patch is attached. Note this is just a work-in-progress
>>>>>>>> patch and I plan to address the feedback comments later if this patch is in
>>>>>>>> the right direction.
>>>>>>>>
>>>>>>>> This is how the command line options are parsed and used by the
>>>>>>>> backend passes:
>>>>>>>>
>>>>>>>> 1. Tools such as clang and llc call cl::ParseCommandLineOptions.
>>>>>>>> Any of the options that should be written to the bitcode as function or
>>>>>>>> module attributes (these options are declared as CodeGenOpt, which is a
>>>>>>>> subclass of cl::opt) are saved to Function or ModuleIntAttrs in
>>>>>>>> CodeGenOptions.cpp.
>>>>>>>> 2. setFunctionAttribute is called to move the function and module
>>>>>>>> options saved in step 1 to the bitcode. Also, options stored in
>>>>>>>> TargetOptions are moved to the bitcode too. Pre-existing attributes in the
>>>>>>>> bitcode are overwritten by the options in the command line.
>>>>>>>> 3. The backend passes and subtarget classes call
>>>>>>>> getFunctionAttribute to read the attributes stored to the bitcode in step
>>>>>>>> 2. Function::hasFnAttribute can be called directly (for example,
>>>>>>>> NoNaNsFPMath in the patch), if the default value is known to be "false".
>>>>>>>>
>>>>>>>> I also made the following changes in the patch:
>>>>>>>>
>>>>>>>> 1. In the constructor of MachineFunction, call the version of
>>>>>>>> getSubtargetImpl that takes a Function parameter, so that it gets the
>>>>>>>> Subtarget specific to the Function being compiled.
>>>>>>>> 2. Change ARMTargetMachine::SubtargetMap to be a
>>>>>>>> DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary
>>>>>>>> change to ease debugging and should be reverted to a StringMap or changed
>>>>>>>> to another type of map later.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <echristo at gmail.
>>>>>>>> com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <bob.wilson at apple.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> > On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <
>>>>>>>>>> dexonsmith at apple.com> wrote:
>>>>>>>>>> >
>>>>>>>>>> > +chrisb
>>>>>>>>>> >
>>>>>>>>>> >> On 2014-Nov-13, at 16:33, Akira Hatanaka <ahatanak at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> I'm working on fixing PR21471, which is about embedding
>>>>>>>>>> codegen command line options into the bitcode as function or module-level
>>>>>>>>>> attributes so that they don't get ignored when doing LTO.
>>>>>>>>>> >>
>>>>>>>>>> >> http://llvm.org/bugs/show_bug.cgi?id=21471
>>>>>>>>>> >>
>>>>>>>>>> >> I have an initial patch (attached to this email) which enables
>>>>>>>>>> clang/llvm to recognize one command line option, write it to the IR, and
>>>>>>>>>> read it out in a backend pass. I'm looking to get feedback from the
>>>>>>>>>> community on whether I'm headed in the right direction or whether there are
>>>>>>>>>> alternate ideas before I go all the way on fixing the PR. Specifically, I'd
>>>>>>>>>> like to know the answers to the following questions:
>>>>>>>>>> >>
>>>>>>>>>> >> 1. How do we make sure we continue to be able to use the
>>>>>>>>>> command line options we've been using for llc and other tools?
>>>>>>>>>> >> 2. How to handle cases where two functions in a module have
>>>>>>>>>> different sets of command line options?
>>>>>>>>>> >> 3. Where should the command line options or module/function
>>>>>>>>>> attributes be stored once they are read out from the IR?
>>>>>>>>>> >>
>>>>>>>>>> >> The short description of the approach I took in my patch is
>>>>>>>>>> that command line options that are important to codegen are collected by
>>>>>>>>>> cl::ParseCommandLineOptions, written to the bitcode as function or module
>>>>>>>>>> attributes, and read out directly by the optimization passes that need
>>>>>>>>>> them. cl::opt options are replaced with CodeGenOpt options (subclass of
>>>>>>>>>> cl::opt), which are needed only to parse the command line and provide the
>>>>>>>>>> default value when the corresponding options are not in the bitcode.
>>>>>>>>>> >
>>>>>>>>>> > I like this approach, since it means the frontend doesn't have
>>>>>>>>>> to understand
>>>>>>>>>> > options in order to pass them on to the backend.
>>>>>>>>>>
>>>>>>>>>> I agree. It’s not the approach that was I was expecting, but
>>>>>>>>>> after reading the patches, I like it.
>>>>>>>>>>
>>>>>>>>>> If we can get consensus on the approach, I suggest that we stage
>>>>>>>>>> the work to first adopt Akira’s patches and then we can incrementally
>>>>>>>>>> convert various options over to use it. There may be complications for some
>>>>>>>>>> options, so I’d like to see separate patches for each one (possibly with a
>>>>>>>>>> few similar options combined in one patch).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>> I'm not sold quite yet :)
>>>>>>>>>
>>>>>>>>> Akira: Can you show an example of how you expect this to work in
>>>>>>>>> the IR and how the backend is going to process?
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>> -eric
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> >
>>>>>>>>>> > The static variables should be straightforward to migrate to an
>>>>>>>>>> LLVMContext
>>>>>>>>>> > once ParseCommandLineOptions stores things there instead of in
>>>>>>>>>> globals.
>>>>>>>>>> >
>>>>>>>>>> >> diff --git a/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>>> b/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>>> >> new file mode 100644
>>>>>>>>>> >> index 0000000..2d74c2f
>>>>>>>>>> >> --- /dev/null
>>>>>>>>>> >> +++ b/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>>> >> @@ -0,0 +1,59 @@
>>>>>>>>>> >> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.
>>>>>>>>>>  --*- C++ -*-===//
>>>>>>>>>> >> +//
>>>>>>>>>> >> +//                     The LLVM Compiler Infrastructure
>>>>>>>>>> >> +//
>>>>>>>>>> >> +// This file is distributed under the University of Illinois
>>>>>>>>>> Open Source
>>>>>>>>>> >> +// License. See LICENSE.TXT for details.
>>>>>>>>>> >> +//
>>>>>>>>>> >> +//===------------------------------------------------------
>>>>>>>>>> ----------------===//
>>>>>>>>>> >> +//
>>>>>>>>>> >> +//===------------------------------------------------------
>>>>>>>>>> ----------------===//
>>>>>>>>>> >> +
>>>>>>>>>> >> +#include "llvm/CodeGen/CodeGenOption.h"
>>>>>>>>>> >> +#include "llvm/IR/Attributes.h"
>>>>>>>>>> >> +#include "llvm/IR/Module.h"
>>>>>>>>>> >> +
>>>>>>>>>> >> +using namespace llvm;
>>>>>>>>>> >> +
>>>>>>>>>> >> +static std::map<std::string, bool> FunctionBoolAttrs;
>>>>>>>>>> >> +static std::map<std::string, bool> ModuleBoolAttrs;
>>>>>>>>>> >> +
>>>>>>>>>> >
>>>>>>>>>> > @Chris, should these be using ManagedStatic?
>>>>>>>>>> >
>>>>>>>>>> > _______________________________________________
>>>>>>>>>> > LLVM Developers mailing list
>>>>>>>>>> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>
>>>>>>>>
>>>>>
>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141203/bfbcecf5/attachment.html>