[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

Akira Hatanaka ahatanak at gmail.com
Tue Dec 2 16:31:43 PST 2014


On Tue, Dec 2, 2014 at 3:21 PM, Eric Christopher <echristo at gmail.com> wrote:

> On Mon Dec 01 2014 at 4:22:15 PM Bob Wilson <bob.wilson at apple.com> wrote:
>
>> Thanks for your feedback, Eric.
>>
>> I still think we may be talking past each other a little bit, but rather
>> than
>>
>
> Might be, sorry if so :(
>
>
>> delving further into the details right now, I’ve suggested to Akira that
>> he look into how we should handle other kinds of options. I’m hoping that
>> as we look at more of them, we will gain some insight into the approach
>> that we want to take here. This patch really only deals with the easy
>> cases, and whether we use Akira’s approach or not, I’m not especially
>> worried about this kind of option.
>>
>> Let’s come back to this in a little while after we’ve looked at what to
>> do for some other options…..
>>
>>
> I apologize if this seems a bit rambly, but I'm trying to get a lot of
> thoughts on this down:
>
> FWIW I'm looking at some of this at the moment which is why I spoke to a
> few of the different options and needing to control TargetMachine creation.
> Just to continue with some of the discussion (at least to point some things
> the right direction and get feedback) let's think about the -target-abi
> option for ARM. It sets the ABI for the backend on the target machine. It
> probably shouldn't change on a per function basis (wait, except for the
> calling convention which can be changed via an attribute, fun!), so we want
> it to control global data layout and valid types which means we need it on
> the TargetMachine. The question is how to pass this information into the
> backend for such an option - right now, if we wanted, we could punt to
> using the command line options that currently exist and the existing
> ParseCommandLineOptions call from clang.
>
>
When option -target-abi=apcs-gnu for ARM is used for clang,
EmitAssemblyHelper::CreateTargetMachine
passes FeatruresStr="+apcs" to Target::createTargetMachine. Do we have
anything else we have to pass to the backend in this case?


> That said, it's a pretty awful hack to use them. For some of my particular
> needs I've got some code around unifying the DataLayout creation between
> the front end and back end, but it's a partial solution for the general
> problem and I'll probably run into more issues in this arena.
>
> So, trying to come up with ways to communicate all of this via the API so
> that we can set ABIs for targets that need it (ARM, MIPS, others), as well
> as probably other options (anything passed via -backend-option is a good
> candidate).
>
> One thought is expanding and/or deriving target specific versions of
> TargetOptions and populating that on creation from the front end. We'd want
> to be careful with the target specific bits and constructing defaults,
> probably something off of the bits in Targets.cpp would be appropriate. A
> lot of this type of code leads us down the TargetSupport library or
> something of the sort - classes to help describe or create backend
> constructs.
>
> Anyhow, this is my current thinking on how to do API building of
> TargetMachine etc. (Alternately a TargetMachineBuilder? But that sounds
> complicated in all of the same ways without any particular gain).
>
> -eric
>
>
>> On Dec 1, 2014, at 10:53 AM, Eric Christopher <echristo at gmail.com> wrote:
>>
>> Hi Bob,
>>
>> I've been giving this some thought, comments inline:
>>
>> On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <bob.wilson at apple.com> wrote:
>>
>>> On Nov 19, 2014, at 4:52 PM, Eric Christopher <echristo at gmail.com>
>>> wrote:
>>>
>>>
>>>
>>> On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <ahatanak at gmail.com>
>>> wrote:
>>>
>>>> On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <echristo at gmail.com>
>>>> wrote:
>>>>
>>>>> So my general concern here is that lots of command line options that
>>>>> don't need to be or shouldn't be IR level constructs become oddly named
>>>>> string options. It's bad enough that we're doing that with the target cpu
>>>>> and target feature string let alone the rest of it.
>>>>>
>>>>> If we're talking about things that end up changing the subtarget used
>>>>> we definitely need to make this part of the lookup key - at which point
>>>>> we're going to want a unified interface to all of them. I haven't applied
>>>>> your patch to see what the IR changes look like here.
>>>>>
>>>>>
>>>> With the patches applied, the attributes clang embeds in the IR look
>>>> like this:
>>>>
>>>> $ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9
>>>> -menable-no-nans
>>>>
>>>> attributes #0 = { nounwind readonly ssp "NoNaNsFPMath"
>>>> "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false"
>>>> "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"
>>>> "no-infs-fp-math"="false" "stack-protector-buffer-size"="8"
>>>> "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs"
>>>> "unsafe-fp-math"="false" "use-soft-float"="true" }
>>>>
>>>> I'm guessing you want the front-end to convert command line options
>>>> like "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as
>>>> odd?
>>>>
>>>
>>> Right, that was part of the idea, the rest is that we define things that
>>> actually make sense as front end command line options and use those to
>>> communicate via IR constructs or api calls I.e. reserved registers don't
>>> need to be a specific back end option, but rather can be passed along when
>>> creating a target machine, etc.
>>>
>>> -eric
>>>
>>>
>>> No, reserved registers (I assume you’re referring to “arm-reserve-r9” in
>>> Akira’s example) don’t necessarily need to be backend options. But, those
>>> kinds of options do need to be recorded in the IR and the target machine
>>> needs to reflect the relevant IR settings. Otherwise, we’ll still be left
>>> with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9”
>>> and another without it, but the LTO compilation still needs to respect
>>> those options for the code in each object. The details of how a particular
>>> option should be handled will vary depending on the option, and we need to
>>> discuss each one separately.
>>>
>>>
>> ABI breaking options should probably be considered part of
>> TargetMachine/Subtarget creation and handled accordingly (if it's subtarget
>> creation it can be on the function as a feature string attribute, if it
>> affects global code generation it needs to be at TargetMachine creation
>> time).
>>
>>
>>> It’s pretty clear, though, that there is a class of backend options that
>>> translate nicely to function attributes. This RFC is about the mechanism
>>> that we use to handle those (without specifying which specific options will
>>> be handled that way).
>>>
>>>
>> Fair enough. For function attributes I can see a desire to have (as Chris
>> posted in an email from someone at synopsys) a generic target specific
>> dictionary mapping on functions for backend specific uses. I just haven't
>> seen anything in the thread that would be applicable for this rather than
>> an actual function attribute. That said, I would like some sort of
>> delineation for target function attributes as well. Thoughts on that?
>> Perhaps it could look like the string dictionary we've mentioned, maybe not.
>>
>>
>>>
>>>
>>>> From looking at the patch it seems that the command line option for
>>>>> NaNs should be a general attribute on the function rather than a
>>>>> TargetMachine option anyhow, so it seems that should be migrated down.
>>>>> Whether or not it is exposed as an llvm attribute or a string based key
>>>>> value pair is an interesting discussion. Also, would we want to, instead,
>>>>> expose the backend options as clang IR level options? Seems to make sense
>>>>> from that perspective and then they can add the appropriate attributes to
>>>>> the functions themselves. (-mlongcalls for the long calls option for
>>>>> example).
>>>>>
>>>>> Thanks!
>>>>>
>>>>> -eric
>>>>>
>>>>
>>> Again, I think we can evaluate specific options on a case-by-case basis.
>>> For some of them, it might make sense to expose them as clang options. Like
>>> you, I had previously assumed we would teach the front-end about all the
>>> backend options. I was a little surprised by the approach Akira suggested
>>> here, but after giving it some thought, I think it’s a great idea. There
>>> are a LOT of target-specific backend options. Some of them do not need to
>>> be IR, but many do. Teaching clang about all of them would be a lot of
>>> work, and I don’t see any clear benefit from that. This approach lets clang
>>> remain relatively independent of the backend options. We still need to
>>> think carefully about which backend options use this mechanism, of course.
>>>
>>>
>> I'm unconvinced here...
>>
>>
>>> Are there other reasons why you think we should add all those options to
>>> clang? I just don’t see much to be gained by that.
>>>
>>>
>> .... I think that anything you'd like to expose should probably be a
>> front end option. If nothing else I don't see how you plan on annotating
>> the IR. Basically you're constructing a parallel option hierarchy and that
>> seems... redundant. I agree that it's nice for tools to be able to override
>> things in the backend - for testing if nothing else. We do this a lot by
>> checking options and values in the code. A way of streamlining this would
>> be nice - but I don't think that's the goal of these patches no?
>>
>>
>>> Do you have objections to the use of string-based key/value pairs? I
>>> believe the intention all along, going back to Bill’s work on this a few
>>> years ago, was that we would add new attributes for target-independent
>>> things and use strings for the target-specific options. That seems like a
>>> reasonable starting point to me.
>>>
>>>
>> Only some as I mentioned above - I haven't seen anything that needs to be
>> target specific in this way, but I'm open to the idea :)
>>
>> As a note, this is all for function level things. We need a solution for
>> non-function/module level attributes and I think that's going to look like
>> API calls into the backend for TargetMachine creation - there may be a list
>> of things that would turn into module level attributes, but that is a
>> direction fraught with danger and I'd like to avoid that as much as
>> possible.
>>
>> (As a note, the particular patch that Akira has is unacceptable because
>> of the changes to the subtarget lookup, but that's a separate issue :)
>>
>> -eric
>>
>>
>>>
>>>>> On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <ahatanak at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Updated patch is attached. Note this is just a work-in-progress patch
>>>>>> and I plan to address the feedback comments later if this patch is in the
>>>>>> right direction.
>>>>>>
>>>>>> This is how the command line options are parsed and used by the
>>>>>> backend passes:
>>>>>>
>>>>>> 1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any
>>>>>> of the options that should be written to the bitcode as function or module
>>>>>> attributes (these options are declared as CodeGenOpt, which is a subclass
>>>>>> of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
>>>>>> 2. setFunctionAttribute is called to move the function and module
>>>>>> options saved in step 1 to the bitcode. Also, options stored in
>>>>>> TargetOptions are moved to the bitcode too. Pre-existing attributes in the
>>>>>> bitcode are overwritten by the options in the command line.
>>>>>> 3. The backend passes and subtarget classes call getFunctionAttribute
>>>>>> to read the attributes stored to the bitcode in step 2.
>>>>>> Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath
>>>>>> in the patch), if the default value is known to be "false".
>>>>>>
>>>>>> I also made the following changes in the patch:
>>>>>>
>>>>>> 1. In the constructor of MachineFunction, call the version of
>>>>>> getSubtargetImpl that takes a Function parameter, so that it gets the
>>>>>> Subtarget specific to the Function being compiled.
>>>>>> 2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*,
>>>>>> unique_ptr<ARMSubtarget>>. This is just a temporary change to ease
>>>>>> debugging and should be reverted to a StringMap or changed to another type
>>>>>> of map later.
>>>>>>
>>>>>>
>>>>>> On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <echristo at gmail.
>>>>>> com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <bob.wilson at apple.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> > On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <
>>>>>>>> dexonsmith at apple.com> wrote:
>>>>>>>> >
>>>>>>>> > +chrisb
>>>>>>>> >
>>>>>>>> >> On 2014-Nov-13, at 16:33, Akira Hatanaka <ahatanak at gmail.com>
>>>>>>>> wrote:
>>>>>>>> >>
>>>>>>>> >> I'm working on fixing PR21471, which is about embedding codegen
>>>>>>>> command line options into the bitcode as function or module-level
>>>>>>>> attributes so that they don't get ignored when doing LTO.
>>>>>>>> >>
>>>>>>>> >> http://llvm.org/bugs/show_bug.cgi?id=21471
>>>>>>>> >>
>>>>>>>> >> I have an initial patch (attached to this email) which enables
>>>>>>>> clang/llvm to recognize one command line option, write it to the IR, and
>>>>>>>> read it out in a backend pass. I'm looking to get feedback from the
>>>>>>>> community on whether I'm headed in the right direction or whether there are
>>>>>>>> alternate ideas before I go all the way on fixing the PR. Specifically, I'd
>>>>>>>> like to know the answers to the following questions:
>>>>>>>> >>
>>>>>>>> >> 1. How do we make sure we continue to be able to use the command
>>>>>>>> line options we've been using for llc and other tools?
>>>>>>>> >> 2. How to handle cases where two functions in a module have
>>>>>>>> different sets of command line options?
>>>>>>>> >> 3. Where should the command line options or module/function
>>>>>>>> attributes be stored once they are read out from the IR?
>>>>>>>> >>
>>>>>>>> >> The short description of the approach I took in my patch is that
>>>>>>>> command line options that are important to codegen are collected by
>>>>>>>> cl::ParseCommandLineOptions, written to the bitcode as function or module
>>>>>>>> attributes, and read out directly by the optimization passes that need
>>>>>>>> them. cl::opt options are replaced with CodeGenOpt options (subclass of
>>>>>>>> cl::opt), which are needed only to parse the command line and provide the
>>>>>>>> default value when the corresponding options are not in the bitcode.
>>>>>>>> >
>>>>>>>> > I like this approach, since it means the frontend doesn't have to
>>>>>>>> understand
>>>>>>>> > options in order to pass them on to the backend.
>>>>>>>>
>>>>>>>> I agree. It’s not the approach that was I was expecting, but after
>>>>>>>> reading the patches, I like it.
>>>>>>>>
>>>>>>>> If we can get consensus on the approach, I suggest that we stage
>>>>>>>> the work to first adopt Akira’s patches and then we can incrementally
>>>>>>>> convert various options over to use it. There may be complications for some
>>>>>>>> options, so I’d like to see separate patches for each one (possibly with a
>>>>>>>> few similar options combined in one patch).
>>>>>>>>
>>>>>>>>
>>>>>>> I'm not sold quite yet :)
>>>>>>>
>>>>>>> Akira: Can you show an example of how you expect this to work in the
>>>>>>> IR and how the backend is going to process?
>>>>>>>
>>>>>>> Thanks.
>>>>>>>
>>>>>>> -eric
>>>>>>>
>>>>>>>
>>>>>>>> >
>>>>>>>> > The static variables should be straightforward to migrate to an
>>>>>>>> LLVMContext
>>>>>>>> > once ParseCommandLineOptions stores things there instead of in
>>>>>>>> globals.
>>>>>>>> >
>>>>>>>> >> diff --git a/lib/CodeGen/CodeGenOption.cpp
>>>>>>>> b/lib/CodeGen/CodeGenOption.cpp
>>>>>>>> >> new file mode 100644
>>>>>>>> >> index 0000000..2d74c2f
>>>>>>>> >> --- /dev/null
>>>>>>>> >> +++ b/lib/CodeGen/CodeGenOption.cpp
>>>>>>>> >> @@ -0,0 +1,59 @@
>>>>>>>> >> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.
>>>>>>>>  --*- C++ -*-===//
>>>>>>>> >> +//
>>>>>>>> >> +//                     The LLVM Compiler Infrastructure
>>>>>>>> >> +//
>>>>>>>> >> +// This file is distributed under the University of Illinois
>>>>>>>> Open Source
>>>>>>>> >> +// License. See LICENSE.TXT for details.
>>>>>>>> >> +//
>>>>>>>> >> +//===------------------------------------------------------
>>>>>>>> ----------------===//
>>>>>>>> >> +//
>>>>>>>> >> +//===------------------------------------------------------
>>>>>>>> ----------------===//
>>>>>>>> >> +
>>>>>>>> >> +#include "llvm/CodeGen/CodeGenOption.h"
>>>>>>>> >> +#include "llvm/IR/Attributes.h"
>>>>>>>> >> +#include "llvm/IR/Module.h"
>>>>>>>> >> +
>>>>>>>> >> +using namespace llvm;
>>>>>>>> >> +
>>>>>>>> >> +static std::map<std::string, bool> FunctionBoolAttrs;
>>>>>>>> >> +static std::map<std::string, bool> ModuleBoolAttrs;
>>>>>>>> >> +
>>>>>>>> >
>>>>>>>> > @Chris, should these be using ManagedStatic?
>>>>>>>> >
>>>>>>>> > _______________________________________________
>>>>>>>> > LLVM Developers mailing list
>>>>>>>> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> LLVM Developers mailing list
>>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> LLVM Developers mailing list
>>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>
>>>>>>
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141202/f31d1118/attachment.html>


More information about the llvm-dev mailing list