[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

Fri Dec 5 14:40:42 PST 2014

On Wed Dec 03 2014 at 11:39:23 AM Akira Hatanaka <ahatanak at gmail.com> wrote:

> On Tue, Dec 2, 2014 at 4:38 PM, Eric Christopher <echristo at gmail.com>
> wrote:
>
>>
>>
>> On Tue Dec 02 2014 at 4:31:43 PM Akira Hatanaka <ahatanak at gmail.com>
>> wrote:
>>
>>> On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <echristo at gmail.com>
>>> wrote:
>>>
>>>> On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <bob.wilson at apple.com>
>>>> wrote:
>>>>
>>>>> Thanks for your feedback, Eric.
>>>>>
>>>>> I still think we may be talking past each other a little bit, but
>>>>> rather than
>>>>>
>>>>
>>>> Might be, sorry if so :(
>>>>
>>>>
>>>>> delving further into the details right now, I’ve suggested to Akira
>>>>> that he look into how we should handle other kinds of options. I’m hoping
>>>>> that as we look at more of them, we will gain some insight into the
>>>>> approach that we want to take here. This patch really only deals with the
>>>>> easy cases, and whether we use Akira’s approach or not, I’m not especially
>>>>> worried about this kind of option.
>>>>>
>>>>> Let’s come back to this in a little while after we’ve looked at what
>>>>> to do for some other options…..
>>>>>
>>>>>
>>>> I apologize if this seems a bit rambly, but I'm trying to get a lot of
>>>> thoughts on this down:
>>>>
>>>> FWIW I'm looking at some of this at the moment which is why I spoke to
>>>> a few of the different options and needing to control TargetMachine
>>>> creation. Just to continue with some of the discussion (at least to point
>>>> some things the right direction and get feedback) let's think about the
>>>> -target-abi option for ARM. It sets the ABI for the backend on the target
>>>> machine. It probably shouldn't change on a per function basis (wait, except
>>>> for the calling convention which can be changed via an attribute, fun!), so
>>>> we want it to control global data layout and valid types which means we
>>>> need it on the TargetMachine. The question is how to pass this information
>>>> into the backend for such an option - right now, if we wanted, we could
>>>> punt to using the command line options that currently exist and the
>>>> existing ParseCommandLineOptions call from clang.
>>>>
>>>>
>>> When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine
>>> passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have
>>> anything else we have to pass to the backend in this case?
>>>
>>
>> Not yet, but the option can be a backend option rather than a subtarget
>> feature - and I'm probably going to do that in the short term as it's no
>> worse than the current status, see the next line here below :)
>>
>> -eric
>>
>>
>>>
>>>
>>>> That said, it's a pretty awful hack to use them. For some of my
>>>> particular needs I've got some code around unifying the DataLayout creation
>>>> between the front end and back end, but it's a partial solution for the
>>>> general problem and I'll probably run into more issues in this arena.
>>>>
>>>> So, trying to come up with ways to communicate all of this via the API
>>>> so that we can set ABIs for targets that need it (ARM, MIPS, others), as
>>>> well as probably other options (anything passed via -backend-option is a
>>>> good candidate).
>>>>
>>>> One thought is expanding and/or deriving target specific versions of
>>>> TargetOptions and populating that on creation from the front end. We'd want
>>>> to be careful with the target specific bits and constructing defaults,
>>>> probably something off of the bits in Targets.cpp would be appropriate. A
>>>> lot of this type of code leads us down the TargetSupport library or
>>>> something of the sort - classes to help describe or create backend
>>>> constructs.
>>>>
>>>> Anyhow, this is my current thinking on how to do API building of
>>>> TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds
>>>> complicated in all of the same ways without any particular gain).
>>>>
>>>>
>
> If we wanted to, in EmitAssemblyHelper::CreateTargetMachine, we could pass
> module-level CodeGenOpts.BackendOptions (backend options that don't need to
> change on a per-function basis) to Target::createTargetMachine. We can
> either pass it directly to the function or add a new field in
> llvm::TargetOptions that holds the options. If we can do that, we don't
> have to call ParseCommandLineOptions from
> EmitAssemblyHelper::CreateTargetMachine.
>
>
Yeah, I've been debating something like that. I'm not a huge fan of strings
as a method of passing options around, but I'm starting to get worn down
for sure. As far as ParseCommandLineOptions, the only deal is either we'd
need to invent a new option format or we'd basically be calling it in the
constructor for TargetMachine. I think I'd rather the new option format
than that though. :\

For my current problem I am going to end up just passing ABI as a string in
TargetOptions, it's honestly the cleanest way to do it at the moment and it
can be shared by multiple targets.

-eric

>
>
>> -eric
>>>>
>>>>
>>>>> On Dec 1, 2014, at 10:53 AM, Eric Christopher <echristo at gmail.com>
>>>>> wrote:
>>>>>
>>>>> Hi Bob,
>>>>>
>>>>> I've been giving this some thought, comments inline:
>>>>>
>>>>> On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <bob.wilson at apple.com>
>>>>> wrote:
>>>>>
>>>>>> On Nov 19, 2014, at 4:52 PM, Eric Christopher <echristo at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <ahatanak at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <echristo at gmail.
>>>>>>> com> wrote:
>>>>>>>
>>>>>>>> So my general concern here is that lots of command line options
>>>>>>>> that don't need to be or shouldn't be IR level constructs become oddly
>>>>>>>> named string options. It's bad enough that we're doing that with the target
>>>>>>>> cpu and target feature string let alone the rest of it.
>>>>>>>>
>>>>>>>> If we're talking about things that end up changing the subtarget
>>>>>>>> used we definitely need to make this part of the lookup key - at which
>>>>>>>> point we're going to want a unified interface to all of them. I haven't
>>>>>>>> applied your patch to see what the IR changes look like here.
>>>>>>>>
>>>>>>>>
>>>>>>> With the patches applied, the attributes clang embeds in the IR look
>>>>>>> like this:
>>>>>>>
>>>>>>> $ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9
>>>>>>> -menable-no-nans
>>>>>>>
>>>>>>> attributes #0 = { nounwind readonly ssp "NoNaNsFPMath"
>>>>>>> "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false"
>>>>>>> "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"
>>>>>>> "no-infs-fp-math"="false" "stack-protector-buffer-size"="8"
>>>>>>> "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs"
>>>>>>> "unsafe-fp-math"="false" "use-soft-float"="true" }
>>>>>>>
>>>>>>> I'm guessing you want the front-end to convert command line options
>>>>>>> like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as
>>>>>>> odd?
>>>>>>>
>>>>>>
>>>>>> Right, that was part of the idea, the rest is that we define things
>>>>>> that actually make sense as front end command line options and use those to
>>>>>> communicate via IR constructs or api calls I.e. reserved registers don't
>>>>>> need to be a specific back end option, but rather can be passed along when
>>>>>> creating a target machine, etc.
>>>>>>
>>>>>> -eric
>>>>>>
>>>>>>
>>>>>> No, reserved registers (I assume you’re referring to “arm-reserve-r9”
>>>>>> in Akira’s example) don’t necessarily need to be backend options. But,
>>>>>> those kinds of options do need to be recorded in the IR and the target
>>>>>> machine needs to reflect the relevant IR settings. Otherwise, we’ll still
>>>>>> be left with a broken LTO model. E.G., I build one LTO object with
>>>>>> “arm-reserve-r9” and another without it, but the LTO compilation still
>>>>>> needs to respect those options for the code in each object. The details of
>>>>>> how a particular option should be handled will vary depending on the
>>>>>> option, and we need to discuss each one separately.
>>>>>>
>>>>>>
>>>>> ABI breaking options should probably be considered part of
>>>>> TargetMachine/Subtarget creation and handled accordingly (if it's subtarget
>>>>> creation it can be on the function as a feature string attribute, if it
>>>>> affects global code generation it needs to be at TargetMachine creation
>>>>> time).
>>>>>
>>>>>
>>>>>> It’s pretty clear, though, that there is a class of backend options
>>>>>> that translate nicely to function attributes. This RFC is about the
>>>>>> mechanism that we use to handle those (without specifying which specific
>>>>>> options will be handled that way).
>>>>>>
>>>>>>
>>>>> Fair enough. For function attributes I can see a desire to have (as
>>>>> Chris posted in an email from someone at synopsys) a generic target
>>>>> specific dictionary mapping on functions for backend specific uses. I just
>>>>> haven't seen anything in the thread that would be applicable for this
>>>>> rather than an actual function attribute. That said, I would like some sort
>>>>> of delineation for target function attributes as well. Thoughts on that?
>>>>> Perhaps it could look like the string dictionary we've mentioned, maybe not.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> From looking at the patch it seems that the command line option for
>>>>>>>> NaNs should be a general attribute on the function rather than a
>>>>>>>> TargetMachine option anyhow, so it seems that should be migrated down.
>>>>>>>> Whether or not it is exposed as an llvm attribute or a string based key
>>>>>>>> value pair is an interesting discussion. Also, would we want to, instead,
>>>>>>>> expose the backend options as clang IR level options? Seems to make sense
>>>>>>>> from that perspective and then they can add the appropriate attributes to
>>>>>>>> the functions themselves. (-mlongcalls for the long calls option for
>>>>>>>> example).
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>>
>>>>>>>> -eric
>>>>>>>>
>>>>>>>
>>>>>> Again, I think we can evaluate specific options on a case-by-case
>>>>>> basis. For some of them, it might make sense to expose them as clang
>>>>>> options. Like you, I had previously assumed we would teach the front-end
>>>>>> about all the backend options. I was a little surprised by the approach
>>>>>> Akira suggested here, but after giving it some thought, I think it’s a
>>>>>> great idea. There are a LOT of target-specific backend options. Some of
>>>>>> them do not need to be IR, but many do. Teaching clang about all of them
>>>>>> would be a lot of work, and I don’t see any clear benefit from that. This
>>>>>> approach lets clang remain relatively independent of the backend options.
>>>>>> We still need to think carefully about which backend options use this
>>>>>> mechanism, of course.
>>>>>>
>>>>>>
>>>>> I'm unconvinced here...
>>>>>
>>>>>
>>>>>> Are there other reasons why you think we should add all those options
>>>>>> to clang? I just don’t see much to be gained by that.
>>>>>>
>>>>>>
>>>>> .... I think that anything you'd like to expose should probably be a
>>>>> front end option. If nothing else I don't see how you plan on annotating
>>>>> the IR. Basically you're constructing a parallel option hierarchy and that
>>>>> seems... redundant. I agree that it's nice for tools to be able to override
>>>>> things in the backend - for testing if nothing else. We do this a lot by
>>>>> checking options and values in the code. A way of streamlining this would
>>>>> be nice - but I don't think that's the goal of these patches no?
>>>>>
>>>>>
>>>>>> Do you have objections to the use of string-based key/value pairs? I
>>>>>> believe the intention all along, going back to Bill’s work on this a few
>>>>>> years ago, was that we would add new attributes for target-independent
>>>>>> things and use strings for the target-specific options. That seems like a
>>>>>> reasonable starting point to me.
>>>>>>
>>>>>>
>>>>> Only some as I mentioned above - I haven't seen anything that needs to
>>>>> be target specific in this way, but I'm open to the idea :)
>>>>>
>>>>> As a note, this is all for function level things. We need a solution
>>>>> for non-function/module level attributes and I think that's going to look
>>>>> like API calls into the backend for TargetMachine creation - there may be a
>>>>> list of things that would turn into module level attributes, but that is a
>>>>> direction fraught with danger and I'd like to avoid that as much as
>>>>> possible.
>>>>>
>>>>> (As a note, the particular patch that Akira has is unacceptable
>>>>> because of the changes to the subtarget lookup, but that's a separate issue
>>>>> :)
>>>>>
>>>>> -eric
>>>>>
>>>>>
>>>>>>
>>>>>>>> On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <
>>>>>>>> ahatanak at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Updated patch is attached. Note this is just a work-in-progress
>>>>>>>>> patch and I plan to address the feedback comments later if this patch is in
>>>>>>>>> the right direction.
>>>>>>>>>
>>>>>>>>> This is how the command line options are parsed and used by the
>>>>>>>>> backend passes:
>>>>>>>>>
>>>>>>>>> 1. Tools such as clang and llc call cl::ParseCommandLineOptions.
>>>>>>>>> Any of the options that should be written to the bitcode as function or
>>>>>>>>> module attributes (these options are declared as CodeGenOpt, which is a
>>>>>>>>> subclass of cl::opt) are saved to Function or ModuleIntAttrs in
>>>>>>>>> CodeGenOptions.cpp.
>>>>>>>>> 2. setFunctionAttribute is called to move the function and module
>>>>>>>>> options saved in step 1 to the bitcode. Also, options stored in
>>>>>>>>> TargetOptions are moved to the bitcode too. Pre-existing attributes in the
>>>>>>>>> bitcode are overwritten by the options in the command line.
>>>>>>>>> 3. The backend passes and subtarget classes call
>>>>>>>>> getFunctionAttribute to read the attributes stored to the bitcode in step
>>>>>>>>> 2. Function::hasFnAttribute can be called directly (for example,
>>>>>>>>> NoNaNsFPMath in the patch), if the default value is known to be "false".
>>>>>>>>>
>>>>>>>>> I also made the following changes in the patch:
>>>>>>>>>
>>>>>>>>> 1. In the constructor of MachineFunction, call the version of
>>>>>>>>> getSubtargetImpl that takes a Function parameter, so that it gets the
>>>>>>>>> Subtarget specific to the Function being compiled.
>>>>>>>>> 2. Change ARMTargetMachine::SubtargetMap to be a
>>>>>>>>> DenseMap<Function*, unique_ptr<ARMSubtarget>>. This is just a temporary
>>>>>>>>> change to ease debugging and should be reverted to a StringMap or changed
>>>>>>>>> to another type of map later.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <
>>>>>>>>> echristo at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <bob.wilson at apple.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> > On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <
>>>>>>>>>>> dexonsmith at apple.com> wrote:
>>>>>>>>>>> >
>>>>>>>>>>> > +chrisb
>>>>>>>>>>> >
>>>>>>>>>>> >> On 2014-Nov-13, at 16:33, Akira Hatanaka <ahatanak at gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> >>
>>>>>>>>>>> >> I'm working on fixing PR21471, which is about embedding
>>>>>>>>>>> codegen command line options into the bitcode as function or module-level
>>>>>>>>>>> attributes so that they don't get ignored when doing LTO.
>>>>>>>>>>> >>
>>>>>>>>>>> >> http://llvm.org/bugs/show_bug.cgi?id=21471
>>>>>>>>>>> >>
>>>>>>>>>>> >> I have an initial patch (attached to this email) which
>>>>>>>>>>> enables clang/llvm to recognize one command line option, write it to the
>>>>>>>>>>> IR, and read it out in a backend pass. I'm looking to get feedback from the
>>>>>>>>>>> community on whether I'm headed in the right direction or whether there are
>>>>>>>>>>> alternate ideas before I go all the way on fixing the PR. Specifically, I'd
>>>>>>>>>>> like to know the answers to the following questions:
>>>>>>>>>>> >>
>>>>>>>>>>> >> 1. How do we make sure we continue to be able to use the
>>>>>>>>>>> command line options we've been using for llc and other tools?
>>>>>>>>>>> >> 2. How to handle cases where two functions in a module have
>>>>>>>>>>> different sets of command line options?
>>>>>>>>>>> >> 3. Where should the command line options or module/function
>>>>>>>>>>> attributes be stored once they are read out from the IR?
>>>>>>>>>>> >>
>>>>>>>>>>> >> The short description of the approach I took in my patch is
>>>>>>>>>>> that command line options that are important to codegen are collected by
>>>>>>>>>>> cl::ParseCommandLineOptions, written to the bitcode as function or module
>>>>>>>>>>> attributes, and read out directly by the optimization passes that need
>>>>>>>>>>> them. cl::opt options are replaced with CodeGenOpt options (subclass of
>>>>>>>>>>> cl::opt), which are needed only to parse the command line and provide the
>>>>>>>>>>> default value when the corresponding options are not in the bitcode.
>>>>>>>>>>> >
>>>>>>>>>>> > I like this approach, since it means the frontend doesn't have
>>>>>>>>>>> to understand
>>>>>>>>>>> > options in order to pass them on to the backend.
>>>>>>>>>>>
>>>>>>>>>>> I agree. It’s not the approach that was I was expecting, but
>>>>>>>>>>> after reading the patches, I like it.
>>>>>>>>>>>
>>>>>>>>>>> If we can get consensus on the approach, I suggest that we stage
>>>>>>>>>>> the work to first adopt Akira’s patches and then we can incrementally
>>>>>>>>>>> convert various options over to use it. There may be complications for some
>>>>>>>>>>> options, so I’d like to see separate patches for each one (possibly with a
>>>>>>>>>>> few similar options combined in one patch).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> I'm not sold quite yet :)
>>>>>>>>>>
>>>>>>>>>> Akira: Can you show an example of how you expect this to work in
>>>>>>>>>> the IR and how the backend is going to process?
>>>>>>>>>>
>>>>>>>>>> Thanks.
>>>>>>>>>>
>>>>>>>>>> -eric
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> >
>>>>>>>>>>> > The static variables should be straightforward to migrate to
>>>>>>>>>>> an LLVMContext
>>>>>>>>>>> > once ParseCommandLineOptions stores things there instead of in
>>>>>>>>>>> globals.
>>>>>>>>>>> >
>>>>>>>>>>> >> diff --git a/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>>>> b/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>>>> >> new file mode 100644
>>>>>>>>>>> >> index 0000000..2d74c2f
>>>>>>>>>>> >> --- /dev/null
>>>>>>>>>>> >> +++ b/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>>>> >> @@ -0,0 +1,59 @@
>>>>>>>>>>> >> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.
>>>>>>>>>>>    --*- C++ -*-===//
>>>>>>>>>>> >> +//
>>>>>>>>>>> >> +//                     The LLVM Compiler Infrastructure
>>>>>>>>>>> >> +//
>>>>>>>>>>> >> +// This file is distributed under the University of Illinois
>>>>>>>>>>> Open Source
>>>>>>>>>>> >> +// License. See LICENSE.TXT for details.
>>>>>>>>>>> >> +//
>>>>>>>>>>> >> +//===------------------------------------------------------
>>>>>>>>>>> ----------------===//
>>>>>>>>>>> >> +//
>>>>>>>>>>> >> +//===------------------------------------------------------
>>>>>>>>>>> ----------------===//
>>>>>>>>>>> >> +
>>>>>>>>>>> >> +#include "llvm/CodeGen/CodeGenOption.h"
>>>>>>>>>>> >> +#include "llvm/IR/Attributes.h"
>>>>>>>>>>> >> +#include "llvm/IR/Module.h"
>>>>>>>>>>> >> +
>>>>>>>>>>> >> +using namespace llvm;
>>>>>>>>>>> >> +
>>>>>>>>>>> >> +static std::map<std::string, bool> FunctionBoolAttrs;
>>>>>>>>>>> >> +static std::map<std::string, bool> ModuleBoolAttrs;
>>>>>>>>>>> >> +
>>>>>>>>>>> >
>>>>>>>>>>> > @Chris, should these be using ManagedStatic?
>>>>>>>>>>> >
>>>>>>>>>>> > _______________________________________________
>>>>>>>>>>> > LLVM Developers mailing list
>>>>>>>>>>> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> _______________________________________________
>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> _______________________________________________
>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141205/754bf915/attachment.html>