[LLVMdev] [RFC] Embedding command line options in bitcode (PR21471)

Mon Dec 1 10:53:54 PST 2014

Hi Bob,

I've been giving this some thought, comments inline:

On Wed Nov 19 2014 at 6:24:18 PM Bob Wilson <bob.wilson at apple.com> wrote:

> On Nov 19, 2014, at 4:52 PM, Eric Christopher <echristo at gmail.com> wrote:
>
>
>
> On Wed Nov 19 2014 at 4:39:42 PM Akira Hatanaka <ahatanak at gmail.com>
> wrote:
>
>> On Wed, Nov 19, 2014 at 3:28 PM, Eric Christopher <echristo at gmail.com>
>> wrote:
>>
>>> So my general concern here is that lots of command line options that
>>> don't need to be or shouldn't be IR level constructs become oddly named
>>> string options. It's bad enough that we're doing that with the target cpu
>>> and target feature string let alone the rest of it.
>>>
>>> If we're talking about things that end up changing the subtarget used we
>>> definitely need to make this part of the lookup key - at which point we're
>>> going to want a unified interface to all of them. I haven't applied your
>>> patch to see what the IR changes look like here.
>>>
>>>
>> With the patches applied, the attributes clang embeds in the IR look like
>> this:
>>
>> $ clang -cc1 ... -mllvm -arm-long-calls -mllvm -arm-reserve-r9
>> -menable-no-nans
>>
>> attributes #0 = { nounwind readonly ssp "NoNaNsFPMath"
>> "arm-long-calls"="1" "arm-reserve-r9"="1" "less-precise-fpmad"="false"
>> "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"
>> "no-infs-fp-math"="false" "stack-protector-buffer-size"="8"
>> "target-cpu"="arm7tdmi" "target-features"="-neon,-crypto,+apcs"
>> "unsafe-fp-math"="false" "use-soft-float"="true" }
>>
>> I'm guessing you want the front-end to convert command line options like
>> "arm-long-calls" or "arm-reserve-r9" to something that doesn't look as odd?
>>
>
> Right, that was part of the idea, the rest is that we define things that
> actually make sense as front end command line options and use those to
> communicate via IR constructs or api calls I.e. reserved registers don't
> need to be a specific back end option, but rather can be passed along when
> creating a target machine, etc.
>
> -eric
>
>
> No, reserved registers (I assume you’re referring to “arm-reserve-r9” in
> Akira’s example) don’t necessarily need to be backend options. But, those
> kinds of options do need to be recorded in the IR and the target machine
> needs to reflect the relevant IR settings. Otherwise, we’ll still be left
> with a broken LTO model. E.G., I build one LTO object with “arm-reserve-r9”
> and another without it, but the LTO compilation still needs to respect
> those options for the code in each object. The details of how a particular
> option should be handled will vary depending on the option, and we need to
> discuss each one separately.
>
>
ABI breaking options should probably be considered part of
TargetMachine/Subtarget creation and handled accordingly (if it's subtarget
creation it can be on the function as a feature string attribute, if it
affects global code generation it needs to be at TargetMachine creation
time).

> It’s pretty clear, though, that there is a class of backend options that
> translate nicely to function attributes. This RFC is about the mechanism
> that we use to handle those (without specifying which specific options will
> be handled that way).
>
>
Fair enough. For function attributes I can see a desire to have (as Chris
posted in an email from someone at synopsys) a generic target specific
dictionary mapping on functions for backend specific uses. I just haven't
seen anything in the thread that would be applicable for this rather than
an actual function attribute. That said, I would like some sort of
delineation for target function attributes as well. Thoughts on that?
Perhaps it could look like the string dictionary we've mentioned, maybe not.

>
>
>> From looking at the patch it seems that the command line option for NaNs
>>> should be a general attribute on the function rather than a TargetMachine
>>> option anyhow, so it seems that should be migrated down. Whether or not it
>>> is exposed as an llvm attribute or a string based key value pair is an
>>> interesting discussion. Also, would we want to, instead, expose the backend
>>> options as clang IR level options? Seems to make sense from that
>>> perspective and then they can add the appropriate attributes to the
>>> functions themselves. (-mlongcalls for the long calls option for example).
>>>
>>> Thanks!
>>>
>>> -eric
>>>
>>
> Again, I think we can evaluate specific options on a case-by-case basis.
> For some of them, it might make sense to expose them as clang options. Like
> you, I had previously assumed we would teach the front-end about all the
> backend options. I was a little surprised by the approach Akira suggested
> here, but after giving it some thought, I think it’s a great idea. There
> are a LOT of target-specific backend options. Some of them do not need to
> be IR, but many do. Teaching clang about all of them would be a lot of
> work, and I don’t see any clear benefit from that. This approach lets clang
> remain relatively independent of the backend options. We still need to
> think carefully about which backend options use this mechanism, of course.
>
>
I'm unconvinced here...

> Are there other reasons why you think we should add all those options to
> clang? I just don’t see much to be gained by that.
>
>
.... I think that anything you'd like to expose should probably be a front
end option. If nothing else I don't see how you plan on annotating the IR.
Basically you're constructing a parallel option hierarchy and that seems...
redundant. I agree that it's nice for tools to be able to override things
in the backend - for testing if nothing else. We do this a lot by checking
options and values in the code. A way of streamlining this would be nice -
but I don't think that's the goal of these patches no?

> Do you have objections to the use of string-based key/value pairs? I
> believe the intention all along, going back to Bill’s work on this a few
> years ago, was that we would add new attributes for target-independent
> things and use strings for the target-specific options. That seems like a
> reasonable starting point to me.
>
>
Only some as I mentioned above - I haven't seen anything that needs to be
target specific in this way, but I'm open to the idea :)

As a note, this is all for function level things. We need a solution for
non-function/module level attributes and I think that's going to look like
API calls into the backend for TargetMachine creation - there may be a list
of things that would turn into module level attributes, but that is a
direction fraught with danger and I'd like to avoid that as much as
possible.

(As a note, the particular patch that Akira has is unacceptable because of
the changes to the subtarget lookup, but that's a separate issue :)

-eric

>
>>> On Tue Nov 18 2014 at 12:26:49 PM Akira Hatanaka <ahatanak at gmail.com>
>>> wrote:
>>>
>>>> Updated patch is attached. Note this is just a work-in-progress patch
>>>> and I plan to address the feedback comments later if this patch is in the
>>>> right direction.
>>>>
>>>> This is how the command line options are parsed and used by the backend
>>>> passes:
>>>>
>>>> 1. Tools such as clang and llc call cl::ParseCommandLineOptions. Any of
>>>> the options that should be written to the bitcode as function or module
>>>> attributes (these options are declared as CodeGenOpt, which is a subclass
>>>> of cl::opt) are saved to Function or ModuleIntAttrs in CodeGenOptions.cpp.
>>>> 2. setFunctionAttribute is called to move the function and module
>>>> options saved in step 1 to the bitcode. Also, options stored in
>>>> TargetOptions are moved to the bitcode too. Pre-existing attributes in the
>>>> bitcode are overwritten by the options in the command line.
>>>> 3. The backend passes and subtarget classes call getFunctionAttribute
>>>> to read the attributes stored to the bitcode in step 2.
>>>> Function::hasFnAttribute can be called directly (for example, NoNaNsFPMath
>>>> in the patch), if the default value is known to be "false".
>>>>
>>>> I also made the following changes in the patch:
>>>>
>>>> 1. In the constructor of MachineFunction, call the version of
>>>> getSubtargetImpl that takes a Function parameter, so that it gets the
>>>> Subtarget specific to the Function being compiled.
>>>> 2. Change ARMTargetMachine::SubtargetMap to be a DenseMap<Function*,
>>>> unique_ptr<ARMSubtarget>>. This is just a temporary change to ease
>>>> debugging and should be reverted to a StringMap or changed to another type
>>>> of map later.
>>>>
>>>>
>>>> On Mon, Nov 17, 2014 at 11:40 AM, Eric Christopher <echristo at gmail.com>
>>>>  wrote:
>>>>
>>>>>
>>>>>
>>>>> On Mon Nov 17 2014 at 9:56:40 AM Bob Wilson <bob.wilson at apple.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> > On Nov 14, 2014, at 2:44 PM, Duncan P. N. Exon Smith <
>>>>>> dexonsmith at apple.com> wrote:
>>>>>> >
>>>>>> > +chrisb
>>>>>> >
>>>>>> >> On 2014-Nov-13, at 16:33, Akira Hatanaka <ahatanak at gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> I'm working on fixing PR21471, which is about embedding codegen
>>>>>> command line options into the bitcode as function or module-level
>>>>>> attributes so that they don't get ignored when doing LTO.
>>>>>> >>
>>>>>> >> http://llvm.org/bugs/show_bug.cgi?id=21471
>>>>>> >>
>>>>>> >> I have an initial patch (attached to this email) which enables
>>>>>> clang/llvm to recognize one command line option, write it to the IR, and
>>>>>> read it out in a backend pass. I'm looking to get feedback from the
>>>>>> community on whether I'm headed in the right direction or whether there are
>>>>>> alternate ideas before I go all the way on fixing the PR. Specifically, I'd
>>>>>> like to know the answers to the following questions:
>>>>>> >>
>>>>>> >> 1. How do we make sure we continue to be able to use the command
>>>>>> line options we've been using for llc and other tools?
>>>>>> >> 2. How to handle cases where two functions in a module have
>>>>>> different sets of command line options?
>>>>>> >> 3. Where should the command line options or module/function
>>>>>> attributes be stored once they are read out from the IR?
>>>>>> >>
>>>>>> >> The short description of the approach I took in my patch is that
>>>>>> command line options that are important to codegen are collected by
>>>>>> cl::ParseCommandLineOptions, written to the bitcode as function or module
>>>>>> attributes, and read out directly by the optimization passes that need
>>>>>> them. cl::opt options are replaced with CodeGenOpt options (subclass of
>>>>>> cl::opt), which are needed only to parse the command line and provide the
>>>>>> default value when the corresponding options are not in the bitcode.
>>>>>> >
>>>>>> > I like this approach, since it means the frontend doesn't have to
>>>>>> understand
>>>>>> > options in order to pass them on to the backend.
>>>>>>
>>>>>> I agree. It’s not the approach that was I was expecting, but after
>>>>>> reading the patches, I like it.
>>>>>>
>>>>>> If we can get consensus on the approach, I suggest that we stage the
>>>>>> work to first adopt Akira’s patches and then we can incrementally convert
>>>>>> various options over to use it. There may be complications for some
>>>>>> options, so I’d like to see separate patches for each one (possibly with a
>>>>>> few similar options combined in one patch).
>>>>>>
>>>>>>
>>>>> I'm not sold quite yet :)
>>>>>
>>>>> Akira: Can you show an example of how you expect this to work in the
>>>>> IR and how the backend is going to process?
>>>>>
>>>>> Thanks.
>>>>>
>>>>> -eric
>>>>>
>>>>>
>>>>>> >
>>>>>> > The static variables should be straightforward to migrate to an
>>>>>> LLVMContext
>>>>>> > once ParseCommandLineOptions stores things there instead of in
>>>>>> globals.
>>>>>> >
>>>>>> >> diff --git a/lib/CodeGen/CodeGenOption.cpp
>>>>>> b/lib/CodeGen/CodeGenOption.cpp
>>>>>> >> new file mode 100644
>>>>>> >> index 0000000..2d74c2f
>>>>>> >> --- /dev/null
>>>>>> >> +++ b/lib/CodeGen/CodeGenOption.cpp
>>>>>> >> @@ -0,0 +1,59 @@
>>>>>> >> +//===- CodeGen/CodeGenOptions.cpp - Code-gen option.
>>>>>>  --*- C++ -*-===//
>>>>>> >> +//
>>>>>> >> +//                     The LLVM Compiler Infrastructure
>>>>>> >> +//
>>>>>> >> +// This file is distributed under the University of Illinois Open
>>>>>> Source
>>>>>> >> +// License. See LICENSE.TXT for details.
>>>>>> >> +//
>>>>>> >> +//===------------------------------------------------------
>>>>>> ----------------===//
>>>>>> >> +//
>>>>>> >> +//===------------------------------------------------------
>>>>>> ----------------===//
>>>>>> >> +
>>>>>> >> +#include "llvm/CodeGen/CodeGenOption.h"
>>>>>> >> +#include "llvm/IR/Attributes.h"
>>>>>> >> +#include "llvm/IR/Module.h"
>>>>>> >> +
>>>>>> >> +using namespace llvm;
>>>>>> >> +
>>>>>> >> +static std::map<std::string, bool> FunctionBoolAttrs;
>>>>>> >> +static std::map<std::string, bool> ModuleBoolAttrs;
>>>>>> >> +
>>>>>> >
>>>>>> > @Chris, should these be using ManagedStatic?
>>>>>> >
>>>>>> > _______________________________________________
>>>>>> > LLVM Developers mailing list
>>>>>> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>>>
>>>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141201/58e6b1cb/attachment.html>