[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

Eric Christopher echristo at gmail.com
Tue Dec 2 16:38:19 PST 2014


On Tue Dec 02 2014 at 4:31:43 PM Akira Hatanaka <ahatanak at gmail.com> wrote:

> On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <echristo at gmail.com>
> wrote:
>
>> On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <bob.wilson at apple.com> wrote:
>>
>>> Thanks for your feedback, Eric.
>>>
>>> I still think we may be talking past each other a little bit, but rather
>>> than
>>>
>>
>> Might be, sorry if so :(
>>
>>
>>> delving further into the details right now, I’ve suggested to Akira that
>>> he look into how we should handle other kinds of options. I’m hoping that
>>> as we look at more of them, we will gain some insight into the approach
>>> that we want to take here. This patch really only deals with the easy
>>> cases, and whether we use Akira’s approach or not, I’m not especially
>>> worried about this kind of option.
>>>
>>> Let’s come back to this in a little while after we’ve looked at what to
>>> do for some other options…..
>>>
>>>
>> I apologize if this seems a bit rambly, but I'm trying to get a lot of
>> thoughts on this down:
>>
>> FWIW I'm looking at some of this at the moment which is why I spoke to a
>> few of the different options and needing to control TargetMachine creation.
>> Just to continue with some of the discussion (at least to point some things
>> the right direction and get feedback) let's think about the -target-abi
>> option for ARM. It sets the ABI for the backend on the target machine. It
>> probably shouldn't change on a per function basis (wait, except for the
>> calling convention which can be changed via an attribute, fun!), so we want
>> it to control global data layout and valid types which means we need it on
>> the TargetMachine. The question is how to pass this information into the
>> backend for such an option - right now, if we wanted, we could punt to
>> using the command line options that currently exist and the existing
>> ParseCommandLineOptions call from clang.
>>
>>
> When option -target-abi=apcs-gnu for ARM is used for clang, EmitAssemblyHelper::CreateTargetMachine
> passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have
> anything else we have to pass to the backend in this case?
>

Not yet, but the option can be a backend option rather than a subtarget
feature - and I'm probably going to do that in the short term as it's no
worse than the current status, see the next line here below :)

-eric


>
>
>> That said, it's a pretty awful hack to use them. For some of my
>> particular needs I've got some code around unifying the DataLayout creation
>> between the front end and back end, but it's a partial solution for the
>> general problem and I'll probably run into more issues in this arena.
>>
>> So, trying to come up with ways to communicate all of this via the API so
>> that we can set ABIs for targets that need it (ARM, MIPS, others), as well
>> as probably other options (anything passed via -backend-option is a good
>> candidate).
>>
>> One thought is expanding and/or deriving target specific versions of
>> TargetOptions and populating that on creation from the front end. We'd want
>> to be careful with the target specific bits and constructing defaults,
>> probably something off of the bits in Targets.cpp would be appropriate. A
>> lot of this type of code leads us down the TargetSupport library or
>> something of the sort - classes to help describe or create backend
>> constructs.
>>
>> Anyhow, this is my current thinking on how to do API building of
>> TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds
>> complicated in all of the same ways without any particular gain).
>>
>> -eric
>>
>>
>>> On Dec 1, 2014, at 10:53 AM, Eric Christopher <echristo at gmail.com>
>>> wrote:
>>>
>>> Hi Bob,
>>>
>>> I've been giving this some thought, comments inline:
>>>
>>> On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <bob.wilson at apple.com>
>>> wrote:
>>>
>>>> On Nov 19, 2014, at 4:52 PM, Eric Christopher <echristo at gmail.com>
>>>> wrote:
>>>>
>>>>
>>>>
>>>> On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <ahatanak at gmail.com>
>>>> wrote:
>>>>
>>>>> On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <echristo at gmail.com>
>>>>>  wrote:
>>>>>
>>>>>> So my general concern here is that lots of command line options that
>>>>>> don't need to be or shouldn't be IR level constructs become oddly named
>>>>>> string options. It's bad enough that we're doing that with the target cpu
>>>>>> and target feature string let alone the rest of it.
>>>>>>
>>>>>> If we're talking about things that end up changing the subtarget used
>>>>>> we definitely need to make this part of the lookup key - at which point
>>>>>> we're going to want a unified interface to all of them. I haven't applied
>>>>>> your patch to see what the IR changes look like here.
>>>>>>
>>>>>>
>>>>> With the patches applied, the attributes clang embeds in the IR look
>>>>> like this:
>>>>>
>>>>> $ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9
>>>>> -menable-no-nans
>>>>>
>>>>> attributes #0 = { nounwind readonly ssp "NoNaNsFPMath"
>>>>> "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false"
>>>>> "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"
>>>>> "no-infs-fp-math"="false" "stack-protector-buffer-size"="8"
>>>>> "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs"
>>>>> "unsafe-fp-math"="false" "use-soft-float"="true" }
>>>>>
>>>>> I'm guessing you want the front-end to convert command line options
>>>>> like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as
>>>>> odd?
>>>>>
>>>>
>>>> Right, that was part of the idea, the rest is that we define things
>>>> that actually make sense as front end command line options and use those to
>>>> communicate via IR constructs or api calls I.e. reserved registers don't
>>>> need to be a specific back end option, but rather can be passed along when
>>>> creating a target machine, etc.
>>>>
>>>> -eric
>>>>
>>>>
>>>> No, reserved registers (I assume you’re referring to “arm-reserve-r9”
>>>> in Akira’s example) don’t necessarily need to be backend options. But,
>>>> those kinds of options do need to be recorded in the IR and the target
>>>> machine needs to reflect the relevant IR settings. Otherwise, we’ll still
>>>> be left with a broken LTO model. E.G., I build one LTO object with
>>>> “arm-reserve-r9” and another without it, but the LTO compilation still
>>>> needs to respect those options for the code in each object. The details of
>>>> how a particular option should be handled will vary depending on the
>>>> option, and we need to discuss each one separately.
>>>>
>>>>
>>> ABI breaking options should probably be considered part of
>>> TargetMachine/Subtarget creation and handled accordingly (if it's subtarget
>>> creation it can be on the function as a feature string attribute, if it
>>> affects global code generation it needs to be at TargetMachine creation
>>> time).
>>>
>>>
>>>> It’s pretty clear, though, that there is a class of backend options
>>>> that translate nicely to function attributes. This RFC is about the
>>>> mechanism that we use to handle those (without specifying which specific
>>>> options will be handled that way).
>>>>
>>>>
>>> Fair enough. For function attributes I can see a desire to have (as
>>> Chris posted in an email from someone at synopsys) a generic target
>>> specific dictionary mapping on functions for backend specific uses. I just
>>> haven't seen anything in the thread that would be applicable for this
>>> rather than an actual function attribute. That said, I would like some sort
>>> of delineation for target function attributes as well. Thoughts on that?
>>> Perhaps it could look like the string dictionary we've mentioned, maybe not.
>>>
>>>
>>>>
>>>>
>>>>> From looking at the patch it seems that the command line option for
>>>>>> NaNs should be a general attribute on the function rather than a
>>>>>> TargetMachine option anyhow, so it seems that should be migrated down.
>>>>>> Whether or not it is exposed as an llvm attribute or a string based key
>>>>>> value pair is an interesting discussion. Also, would we want to, instead,
>>>>>> expose the backend options as clang IR level options? Seems to make sense
>>>>>> from that perspective and then they can add the appropriate attributes to
>>>>>> the functions themselves. (-mlongcalls for the long calls option for
>>>>>> example).
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> -eric
>>>>>>
>>>>>
>>>> Again, I think we can evaluate specific options on a case-by-case
>>>> basis. For some of them, it might make sense to expose them as clang
>>>> options. Like you, I had previously assumed we would teach the front-end
>>>> about all the backend options. I was a little surprised by the approach
>>>> Akira suggested here, but after giving it some thought, I think it’s a
>>>> great idea. There are a LOT of target-specific backend options. Some of
>>>> them do not need to be IR, but many do. Teaching clang about all of them
>>>> would be a lot of work, and I don’t see any clear benefit from that. This
>>>> approach lets clang remain relatively independent of the backend options.
>>>> We still need to think carefully about which backend options use this
>>>> mechanism, of course.
>>>>
>>>>
>>> I'm unconvinced here...
>>>
>>>
>>>> Are there other reasons why you think we should add all those options
>>>> to clang? I just don’t see much to be gained by that.
>>>>
>>>>
>>> .... I think that anything you'd like to expose should probably be a
>>> front end option. If nothing else I don't see how you plan on annotating
>>> the IR. Basically you're constructing a parallel option hierarchy and that
>>> seems... redundant. I agree that it's nice for tools to be able to override
>>> things in the backend - for testing if nothing else. We do this a lot by
>>> checking options and values in the code. A way of streamlining this would
>>> be nice - but I don't think that's the goal of these patches no?
>>>
>>>
>>>> Do you have objections to the use of string-based key/value pairs? I
>>>> believe the intention all along, going back to Bill’s work on this a few
>>>> years ago, was that we would add new attributes for target-independent
>>>> things and use strings for the target-specific options. That seems like a
>>>> reasonable starting point to me.
>>>>
>>>>
>>> Only some as I mentioned above - I haven't seen anything that needs to
>>> be target specific in this way, but I'm open to the idea :)
>>>
>>> As a note, this is all for function level things. We need a solution for
>>> non-function/module level attributes and I think that's going to look like
>>> API calls into the backend for TargetMachine creation - there may be a list
>>> of things that would turn into module level attributes, but that is a
>>> direction fraught with danger and I'd like to avoid that as much as
>>> possible.
>>>
>>> (As a note, the particular patch that Akira has is unacceptable because
>>> of the changes to the subtarget lookup, but that's a separate issue :)
>>>
>>> -eric
>>>
>>>
>>>>
>>>>>> On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <ahatanak at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Updated patch is attached. Note this is just a work-in-progress
>>>>>>> patch and I plan to address the feedback comments later if this patch is in
>>>>>>> the right direction.
>>>>>>>
>>>>>>> This is how the command line options are parsed and used by the
>>>>>>> backend passes:
>>>>>>>
>>>>>>> 1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any
>>>>>>> of the options that should be written to the bitcode as function or module
>>>>>>> attributes (these options are declared as CodeGenOpt, which is a subclass
>>>>>>> of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
>>>>>>> 2. setFunctionAttribute is called to move the function and module
>>>>>>> options saved in step 1 to the bitcode. Also, options stored in
>>>>>>> TargetOptions are moved to the bitcode too. Pre-existing attributes in the
>>>>>>> bitcode are overwritten by the options in the command line.
>>>>>>> 3. The backend passes and subtarget classes call
>>>>>>> getFunctionAttribute to read the attributes stored to the bitcode in step
>>>>>>> 2. Function::hasFnAttribute can be called directly (for example,
>>>>>>> NoNaNsFPMath in the patch), if the default value is known to be "false".
>>>>>>>
>>>>>>> I also made the following changes in the patch:
>>>>>>>
>>>>>>> 1. In the constructor of MachineFunction, call the version of
>>>>>>> getSubtargetImpl that takes a Function parameter, so that it gets the
>>>>>>> Subtarget specific to the Function being compiled.
>>>>>>> 2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*,
>>>>>>> unique_ptr<ARMSubtarget>>. This is just a temporary change to ease
>>>>>>> debugging and should be reverted to a StringMap or changed to another type
>>>>>>> of map later.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <echristo at gmail.
>>>>>>> com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <bob.wilson at apple.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> > On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <
>>>>>>>>> dexonsmith at apple.com> wrote:
>>>>>>>>> >
>>>>>>>>> > +chrisb
>>>>>>>>> >
>>>>>>>>> >> On 2014-Nov-13, at 16:33, Akira Hatanaka <ahatanak at gmail.com>
>>>>>>>>> wrote:
>>>>>>>>> >>
>>>>>>>>> >> I'm working on fixing PR21471, which is about embedding codegen
>>>>>>>>> command line options into the bitcode as function or module-level
>>>>>>>>> attributes so that they don't get ignored when doing LTO.
>>>>>>>>> >>
>>>>>>>>> >> http://llvm.org/bugs/show_bug.cgi?id=21471
>>>>>>>>> >>
>>>>>>>>> >> I have an initial patch (attached to this email) which enables
>>>>>>>>> clang/llvm to recognize one command line option, write it to the IR, and
>>>>>>>>> read it out in a backend pass. I'm looking to get feedback from the
>>>>>>>>> community on whether I'm headed in the right direction or whether there are
>>>>>>>>> alternate ideas before I go all the way on fixing the PR. Specifically, I'd
>>>>>>>>> like to know the answers to the following questions:
>>>>>>>>> >>
>>>>>>>>> >> 1. How do we make sure we continue to be able to use the
>>>>>>>>> command line options we've been using for llc and other tools?
>>>>>>>>> >> 2. How to handle cases where two functions in a module have
>>>>>>>>> different sets of command line options?
>>>>>>>>> >> 3. Where should the command line options or module/function
>>>>>>>>> attributes be stored once they are read out from the IR?
>>>>>>>>> >>
>>>>>>>>> >> The short description of the approach I took in my patch is
>>>>>>>>> that command line options that are important to codegen are collected by
>>>>>>>>> cl::ParseCommandLineOptions, written to the bitcode as function or module
>>>>>>>>> attributes, and read out directly by the optimization passes that need
>>>>>>>>> them. cl::opt options are replaced with CodeGenOpt options (subclass of
>>>>>>>>> cl::opt), which are needed only to parse the command line and provide the
>>>>>>>>> default value when the corresponding options are not in the bitcode.
>>>>>>>>> >
>>>>>>>>> > I like this approach, since it means the frontend doesn't have
>>>>>>>>> to understand
>>>>>>>>> > options in order to pass them on to the backend.
>>>>>>>>>
>>>>>>>>> I agree. It’s not the approach that was I was expecting, but after
>>>>>>>>> reading the patches, I like it.
>>>>>>>>>
>>>>>>>>> If we can get consensus on the approach, I suggest that we stage
>>>>>>>>> the work to first adopt Akira’s patches and then we can incrementally
>>>>>>>>> convert various options over to use it. There may be complications for some
>>>>>>>>> options, so I’d like to see separate patches for each one (possibly with a
>>>>>>>>> few similar options combined in one patch).
>>>>>>>>>
>>>>>>>>>
>>>>>>>> I'm not sold quite yet :)
>>>>>>>>
>>>>>>>> Akira: Can you show an example of how you expect this to work in
>>>>>>>> the IR and how the backend is going to process?
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> -eric
>>>>>>>>
>>>>>>>>
>>>>>>>>> >
>>>>>>>>> > The static variables should be straightforward to migrate to an
>>>>>>>>> LLVMContext
>>>>>>>>> > once ParseCommandLineOptions stores things there instead of in
>>>>>>>>> globals.
>>>>>>>>> >
>>>>>>>>> >> diff --git a/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>> b/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>> >> new file mode 100644
>>>>>>>>> >> index 0000000..2d74c2f
>>>>>>>>> >> --- /dev/null
>>>>>>>>> >> +++ b/lib/CodeGen/CodeGenOption.cpp
>>>>>>>>> >> @@ -0,0 +1,59 @@
>>>>>>>>> >> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.
>>>>>>>>>  --*- C++ -*-===//
>>>>>>>>> >> +//
>>>>>>>>> >> +//                     The LLVM Compiler Infrastructure
>>>>>>>>> >> +//
>>>>>>>>> >> +// This file is distributed under the University of Illinois
>>>>>>>>> Open Source
>>>>>>>>> >> +// License. See LICENSE.TXT for details.
>>>>>>>>> >> +//
>>>>>>>>> >> +//===------------------------------------------------------
>>>>>>>>> ----------------===//
>>>>>>>>> >> +//
>>>>>>>>> >> +//===------------------------------------------------------
>>>>>>>>> ----------------===//
>>>>>>>>> >> +
>>>>>>>>> >> +#include "llvm/CodeGen/CodeGenOption.h"
>>>>>>>>> >> +#include "llvm/IR/Attributes.h"
>>>>>>>>> >> +#include "llvm/IR/Module.h"
>>>>>>>>> >> +
>>>>>>>>> >> +using namespace llvm;
>>>>>>>>> >> +
>>>>>>>>> >> +static std::map<std::string, bool> FunctionBoolAttrs;
>>>>>>>>> >> +static std::map<std::string, bool> ModuleBoolAttrs;
>>>>>>>>> >> +
>>>>>>>>> >
>>>>>>>>> > @Chris, should these be using ManagedStatic?
>>>>>>>>> >
>>>>>>>>> > _______________________________________________
>>>>>>>>> > LLVM Developers mailing list
>>>>>>>>> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> LLVM Developers mailing list
>>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>
>>>>>>>
>>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141203/d5bf02da/attachment.html>


More information about the llvm-dev mailing list