[LLVMdev] [RFC] Passing Options to Different Parts of the Compiler Using Attributes
Bill Wendling
wendling at apple.com
Mon Nov 26 13:20:45 PST 2012
On Nov 20, 2012, at 11:03 AM, Meador Inge <meadori at codesourcery.com> wrote:
> On Nov 13, 2012, at 12:20 AM, Bill Wendling wrote:
>
>> IR Changes
>> ----------
>>
>> The attributes will be specified within the IR. This allows us to generate code
>> that the user wants. This also has the advantage that it will no longer be
>> necessary to specify all of the command line options when compiling the bit code
>> (via 'llc' or 'clang'). E.g., '-mcpu=cortex-a8' will be an attribute and won't
>> be required on llc's command line. However, explicit flags (like `-mcpu') on the
>> llc command line will override flags specified in the module.
>>
>> The core of this proposal is the idea of an "attribute group". As the name
>> implies, it's a group of attributes that are then referenced by objects within
>> the IR. An attribute group is a module-level object. The BNF of the syntax is:
>>
>> attribute_group := attrgroup <attrgroup_id> = { <attribute_list> }
>> attrgroup_id := #<number>
>> attribute_list := <attribute> (, <attribute>)*
>> attribute := <name> (= <value>)?
>>
>> To use an attribute group, an object references the attribute group's ID:
>>
>> attribute_group_ref := attrgroup(<attrgroup_id>)
>>
>> This is an example of an attribute group for a function that should always be
>> inlined, has stack alignment of 4, and doesn't unwind:
>>
>> attrgroup #1 = { alwaysinline, nounwind, alignstack=4 }
>>
>> void @foo() attrgroup(#1) { ret void }
>>
>> An object may refer to more than one attribute group. In that situation, the
>> attributes are merged.
>>
>> Attribute groups are important for keeping `.ll' files readable, because a lot
>> of functions will use the same attributes. In the degenerate case of a `.ll'
>> file that corresponds to a single `.c' file, the single `attrgroup' will capture
>> the command line flags used to build that file.
>
> A few comments on the new syntax:
>
> 1. I think most folks will understand what 'attrgroup' means, but it is a little cryptic.
> How about just 'attributes'? The following reads easier to my eyes:
>
> attributes #1 = { alwaysinline, nounwind, alignstack=4 }
> void @foo() attributes(#1) { ret void }
>
I don't have a very strong opinion on this.
> 2. Are group references allowed in all attribute contexts (parameter, return value, function)?
> I think the answer should be yes.
It would seem a natural expansion of the attribute groups concept. But I want to make these changes incrementally. So at the beginning this won't happen.
> Also, it might be worth considering using the same attribute
> list syntax in the current context and the new attribute group definition (i.e. comma-separated
> vs. space-separated). This way we have a consistent syntax for groups of attributes and the
> main addition this proposal adds is to give a name to those attributes for later reference.
>
I also prefer comma separated lists of things. But this could cause some confusion if we expand the concept to parameter attributes. But see below for a potential alternative syntax for the attribute groups.
> 3. Can attribute groups and single attributes be inter-mixed?
> For example:
>
> void @foo attrgroup(#1) alwaysinline attrgroup(#2) nounwind
>
This will be necessary for backwards compatibility. However, running it through this sequence:

  $ llvm-as < foo.ll | llvm-dis

would produce:

  attrgroup #1 = { ... }
  attrgroup #2 = { ... }
  attrgroup #3 = { alwaysinline, nounwind }
  void @foo() attrgroup(#1) attrgroup(#2) attrgroup(#3)

This is because of how the attributes will be represented internally to LLVM. Let me know if you have strong objections to this.
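To make the round trip concrete, here is a rough, self-contained sketch of the idea. It is only a model: `GroupTable`, `FunctionAttrs`, `canonicalize` and `merged` are hypothetical names for illustration, not LLVM classes, and picking "highest ID plus one" for the new group is an assumption chosen to match the `#3` in the example.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model, not LLVM's data structures.
using AttrSet = std::set<std::string>;
using GroupTable = std::map<unsigned, AttrSet>; // group ID -> attributes

struct FunctionAttrs {
  std::vector<unsigned> GroupRefs; // e.g. attrgroup(#1) attrgroup(#2)
  AttrSet Loose;                   // e.g. alwaysinline nounwind
};

// Move any loose attributes into a freshly numbered group and reference it.
void canonicalize(GroupTable &Groups, FunctionAttrs &F) {
  if (F.Loose.empty())
    return;
  // Assumption: the new ID is one past the highest ID currently in use.
  unsigned NextID = Groups.empty() ? 0 : Groups.rbegin()->first + 1;
  assert(Groups.count(NextID) == 0 && "new group ID must be unused");
  Groups[NextID] = F.Loose;
  F.GroupRefs.push_back(NextID);
  F.Loose.clear();
}

// The effective attributes: the union of every referenced group plus any
// loose attributes still attached to the function.
AttrSet merged(const GroupTable &Groups, const FunctionAttrs &F) {
  AttrSet Result = F.Loose;
  for (unsigned ID : F.GroupRefs) {
    auto It = Groups.find(ID);
    if (It != Groups.end())
      Result.insert(It->second.begin(), It->second.end());
  }
  return Result;
}
```

The point of the sketch is that the merged set is the same before and after `canonicalize`; only the spelling in the `.ll' file changes.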
> 4. Do we really want the attribute references limited to a number? Code will be more readable
> if you can use actual names that indicate the intent. For example:
>
> attrgroup #compile_options = { … }
> void @foo attrgroup(#compile_options)
>
The problem with this is that it limits the attribute groups to a specific set: compile options, non-compile options, etc. There could be many different attribute groups involved, especially during LTO. I realize that the names will be uniqued, but that just adds a number to the existing name. I also want to avoid partitioning the attributes into arbitrary groups, i.e., groups with specific names which imply their usage or type.
> 5. Can attributes be nested? For example:
>
> attrgroup #1 = { foo, bar }
> attrgroup #2 = { #1, baz }
>
> Might be nice.
>
I'm not a big fan of this idea. This could open it up to circular attribute groups:

  attrgroup #1 = { foo, #2 }
  attrgroup #2 = { #1, bar }

which I'm opposed to on moral grounds. ;-) A less compelling (but IMHO valid) argument is that it makes the internal representation of attributes that much more complex.
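To give a feel for the extra bookkeeping, this is roughly the check a reader would need if nesting were allowed. It is a sketch under assumptions: nesting is not part of the proposal, and `NestGraph` and `hasCircularGroups` are made-up names, not anything in LLVM.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical: each group ID maps to the IDs of the groups it nests.
using NestGraph = std::map<unsigned, std::vector<unsigned>>;

enum class Mark { InProgress, Done };

// Depth-first walk; reaching a group that is still in progress means we
// followed a chain of references back to where we started.
static bool visit(const NestGraph &G, unsigned ID,
                  std::map<unsigned, Mark> &State) {
  auto Seen = State.find(ID);
  if (Seen != State.end())
    return Seen->second == Mark::InProgress;
  State[ID] = Mark::InProgress;
  auto Node = G.find(ID);
  if (Node != G.end())
    for (unsigned Child : Node->second)
      if (visit(G, Child, State))
        return true;
  State[ID] = Mark::Done;
  return false;
}

// Returns true if any group (directly or indirectly) includes itself.
bool hasCircularGroups(const NestGraph &G) {
  std::map<unsigned, Mark> State;
  for (const auto &Entry : G)
    if (visit(G, Entry.first, State))
      return true;
  return false;
}
```

With flat groups none of this is needed, which is part of why I'd rather keep them flat.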
> 6. Do we really need to specify the attrgroup keyword twice? (Once in the group definition and once in the use)
> ISTM, that the hash-mark is enough to announce a group reference in the use. For example:
>
> void @foo #1 alwaysinline #2 nounwind
>
Looking at my example above, my syntax can get a bit wordy. How about this alternative representation?

  define void @foo() attrgroup(#1, #2, #3) { ret void }

I don't have a strong opinion though. You're correct that the hash-number combo unambiguously defines an attribute group's use. If others are amenable to this, I can drop the keyword here.
> In other words, I think something like the following might be nicer:
>
> attribute_group := attributes <attrgroup_id> = { <attribute_list> }
> attrgroup_id := #<id>
> attribute_list := <attribute> ( <attribute>)*
> attribute := <name> (= <value>)?
> | <attrgroup_id>
>
> …
>
> function_def := <attribute_list> <result_type> @<id> ([argument_list]) <attribute_list>
>
So something like this (no references inside of the 'attributes' statement allowed, cf. above)?

  attributes #1 = { noinline, alignstack=4 }
  attributes #2 = { "no-sse" }
  define void @foo() #1 #2 { ret void }

This seems reasonable to me.
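For illustration only, a toy parser for one such 'attributes' line might look like the following. It is a sketch, not the LLVM assembly parser: `AttrGroup` and `parseAttrGroup` are hypothetical names, a quoted string is simply stored verbatim as the attribute name, and commas inside quoted strings are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical container for one parsed group; not an LLVM type.
struct AttrGroup {
  unsigned ID = 0;
  std::map<std::string, std::string> Attrs; // name -> value ("" if none)
};

static std::string trim(const std::string &S) {
  const char *WS = " \t";
  std::size_t B = S.find_first_not_of(WS);
  if (B == std::string::npos)
    return "";
  return S.substr(B, S.find_last_not_of(WS) - B + 1);
}

// Parses e.g.:  attributes #1 = { noinline, alignstack=4 }
// Returns false on malformed input and leaves Out untouched.
bool parseAttrGroup(const std::string &Line, AttrGroup &Out) {
  const std::string Keyword = "attributes";
  std::string S = trim(Line);
  std::size_t Hash = S.find('#'), Open = S.find('{'), Close = S.rfind('}');
  if (S.compare(0, Keyword.size(), Keyword) != 0 ||
      Hash == std::string::npos || Open == std::string::npos ||
      Close == std::string::npos || Hash < Keyword.size() || Hash > Open ||
      Open > Close)
    return false;
  // Only whitespace may sit between the keyword and '#', or after '}'.
  if (!trim(S.substr(Keyword.size(), Hash - Keyword.size())).empty() ||
      !trim(S.substr(Close + 1)).empty())
    return false;

  // Between '#' and '{' we expect "<digits> =".
  std::string Head = trim(S.substr(Hash + 1, Open - Hash - 1));
  if (Head.empty() || Head.back() != '=')
    return false;
  std::string IDText = trim(Head.substr(0, Head.size() - 1));
  if (IDText.empty() || IDText.size() > 9) // keep std::stoul in range
    return false;
  for (char C : IDText)
    if (!std::isdigit(static_cast<unsigned char>(C)))
      return false;

  AttrGroup G;
  G.ID = static_cast<unsigned>(std::stoul(IDText));
  std::istringstream Body(S.substr(Open + 1, Close - Open - 1));
  std::string Item;
  while (std::getline(Body, Item, ',')) {
    Item = trim(Item);
    if (Item.empty())
      return false;
    // A quoted string is kept whole, so "cpu=cortex-a8" is not split at '='.
    std::size_t Eq = Item[0] == '"' ? std::string::npos : Item.find('=');
    std::string Name = trim(Item.substr(0, Eq));
    if (Name.empty())
      return false;
    G.Attrs[Name] = Eq == std::string::npos ? "" : trim(Item.substr(Eq + 1));
  }
  if (G.Attrs.empty()) // the BNF requires at least one attribute
    return false;
  Out = G;
  return true;
}
```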
>> Target-Dependent Attributes in IR
>> ---------------------------------
>>
>> The front-end is responsible for knowing which target-dependent options are
>> interesting to the target. Target-dependent attributes are specified as strings,
>> which are understood by the target's back-end. E.g.:
>>
>> attrgroup #0 = { "long-calls", "cpu=cortex-a8", "thumb" }
>>
>> define void @func() attrgroup(#0) { ret void }
>>
>> The ARM back-end is the only target that knows about these options and what to
>> do with them.
>>
>> Some of the `cl::opt' options in the backend could move into attribute groups.
>> This will clean up the compiler.
>>
>
> Isn't calling these "target-dependent" a little artificial? Surely there are many uses
> for string attributes one of which is for target-specific data. I think organizing the
> proposal to add these new arbitrary string attributes and using the target-specific bits
> as examples will be clearer.
>
It's a bit artificial. I basically want to make a small distinction here where anything not target-specific will be defined inside of LangRef.html. So anything that could be used by all targets should be defined there.
>> Updating IR
>> -----------
>>
>> The current attributes that are specified on functions will be moved into an
>> attribute group. The LLVM assembly reader will still honor those but when the
>> assembly file is emitted, those attributes will be output as an attribute group
>> by the assembly writer. As usual, LLVM 3.3 will be able to read and auto-upgrade
>> previous bitcode and `.ll' files.
>>
>> Querying
>> --------
>>
>> The attributes are attached to the function. It's therefore trivial to access
>> the attributes within the middle- and the back-ends. Here's an example of how
>> attributes are queried:
>>
>> Attributes &A = F.getAttributes();
>>
>> // Target-independent attribute query.
>> A.hasAttribute(Attributes::NoInline);
>>
>> // Target-dependent attribute query.
>> A.hasAttribute("no-sse");
>>
>> // Retrieving value of a target-independent attribute.
>> int Alignment = A.getIntValue(Attributes::Alignment);
>>
>> // Retrieving value of a target-dependent attribute.
>> StringRef CPU = A.getStringValue("cpu");
>
> Maybe some set attribute examples too?
>
That would be done through the current AttrBuilder class:

  AttrBuilder B;

  // Add a target-independent attribute.
  B.addAttribute(Attributes::NoInline);

  // Add a target-dependent attribute.
  B.addAttribute("no-sse");

  // Create the attribute object.
  Attributes A = Attributes::get(Context, B);
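If it helps to see the set and query sides together, here is a small stand-alone mock. Everything in the `sketch` namespace is a hypothetical stand-in that only mimics the shape of the calls in this thread; it is not LLVM code, and the LLVMContext uniquing step is left out.

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch { // hypothetical stand-ins, not LLVM classes

enum class Kind { NoInline, NoUnwind, Alignment };

// Collects attributes before the immutable object is created.
struct AttrBuilder {
  std::map<Kind, int> Enum;               // target-independent, with value
  std::map<std::string, std::string> Str; // target-dependent strings

  AttrBuilder &addAttribute(Kind K, int Value = 0) {
    Enum[K] = Value;
    return *this;
  }
  AttrBuilder &addAttribute(const std::string &Name,
                            const std::string &Value = "") {
    Str[Name] = Value;
    return *this;
  }
};

// Read-only view built from a builder.
class Attributes {
  std::map<Kind, int> Enum;
  std::map<std::string, std::string> Str;

public:
  explicit Attributes(const AttrBuilder &B) : Enum(B.Enum), Str(B.Str) {}

  bool hasAttribute(Kind K) const { return Enum.count(K) != 0; }
  bool hasAttribute(const std::string &Name) const {
    return Str.count(Name) != 0;
  }
  // Both getters return an empty/zero value when the attribute is absent.
  int getIntValue(Kind K) const {
    auto It = Enum.find(K);
    return It == Enum.end() ? 0 : It->second;
  }
  std::string getStringValue(const std::string &Name) const {
    auto It = Str.find(Name);
    return It == Str.end() ? std::string() : It->second;
  }
};

} // namespace sketch
```

A caller would chain `addAttribute` calls on a `sketch::AttrBuilder`, construct a `sketch::Attributes` from it, and then query with `hasAttribute`, `getIntValue` or `getStringValue` in the same way as the querying example quoted above.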
> Overall, I think this is a nice addition!
>
Thanks!
-bw