[LLVMdev] [RFC] Passing Options to Different Parts of the Compiler Using Attributes

Mon Nov 26 13:20:45 PST 2012

On Nov 20, 2012, at 11:03 AM, Meador Inge <meadori at codesourcery.com> wrote:

> On Nov 13, 2012, at 12:20 AM, Bill Wendling wrote:
> 
>> IR Changes
>> ----------
>> 
>> The attributes will be specified within the IR. This allows us to generate code
>> that the user wants. This also has the advantage that it will no longer be
>> necessary to specify all of the command line options when compiling the bit code
>> (via 'llc' or 'clang'). E.g., '-mcpu=cortex-a8' will be an attribute and won't
>> be required on llc's command line. However, explicit flags (like `-mcpu') on the
>> llc command line will override flags specified in the module.
>> 
>> The core of this proposal is the idea of an "attribute group". As the name
>> implies, it's a group of attributes that are then referenced by objects within
>> the IR. An attribute group is a module-level object. The BNF of the syntax is:
>> 
>> attribute_group := attrgroup <attrgroup_id> = { <attribute_list> }
>> attrgroup_id    := #<number>
>> attribute_list  := <attribute> (, <attribute>)*
>> attribute       := <name> (= <value>)?
>> 
>> To use an attribute group, an object references the attribute group's ID:
>> 
>> attribute_group_ref := attrgroup(<attrgroup_id>)
>> 
>> This is an example of an attribute group for a function that should always be
>> inlined, has stack alignment of 4, and doesn't unwind:
>> 
>> attrgroup #1 = { alwaysinline, nounwind, alignstack=4 }
>> 
>> void @foo() attrgroup(#1) { ret void }
>> 
>> An object may refer to more than one attribute group. In that situation, the
>> attributes are merged.
>> 
>> Attribute groups are important for keeping `.ll' files readable, because a lot
>> of functions will use the same attributes. In the degenerate case of a `.ll'
>> file that corresponds to a single `.c' file, the single `attrgroup' will capture
>> the command line flags used to build that file.
> 
> A few comments on the new syntax:
> 
>   1. I think most folks will understand what 'attrgroup' means, but it is a little cryptic. 
>      How about just 'attributes'?  The following reads easier to my eyes:
> 
>         attributes #1 = { alwaysinline, nounwind, alignstack=4 }
>         void @foo() attributes(#1) { ret void }
> 
I don't have a very strong opinion on this.

>   2. Are group references allowed in all attribute contexts (parameter, return value, function)?
>      I think the answer should be yes.

It would seem a natural expansion of the attribute groups concept. But I want to make these changes incrementally. So at the beginning this won't happen.

> Also, it might be worth considering using the same attribute
>      list syntax in the current context and the new attribute group definition (i.e. comma-separated
>      vs. space-separated).  This way we have a consistent syntax for groups of attributes and the
>      main addition this proposal adds is to give a name to those attributes for later reference.
> 
I also prefer comma separated lists of things. But this could cause some confusion if we expand the concept to parameter attributes. But see below for a potential alternative syntax for the attribute groups.

>   3. Can attribute groups and single attributes be inter-mixed?
>      For example:
> 
>         void @foo attrgroup(#1) alwaysinline attrgroup(#2) nounwind
> 
This will be necessary for backwards compatibility. However, running it through this sequence:

	$ llvm-as < foo.ll | llvm-dis

would produce:

	attrgroup #1 = { ... }
	attrgroup #2 = { ... }
	attrgroup #3 = { alwaysinline, nounwind }

	void @foo() attrgroup(#1) attrgroup(#2) attrgroup(#3)

This is because of how the attributes will be represented internally to LLVM. Let me know if you have strong objections to this.
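
As a rough illustration of that round trip, here is a toy model in C++. It is not LLVM's actual internal representation; `ToyFunction`, `GroupTable`, and `canonicalize` are made-up names, and numbering the new group as "highest existing ID plus one" is an assumption that happens to match the #3 in the example above.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model only: a group is a set of attribute names keyed by its ID.
using AttrSet = std::set<std::string>;
using GroupTable = std::map<unsigned, AttrSet>;

struct ToyFunction {
  std::vector<unsigned> GroupRefs; // e.g. attrgroup(#1) attrgroup(#2)
  AttrSet LooseAttrs;              // e.g. alwaysinline nounwind
};

// Move any loose attributes into a newly numbered group and reference it.
// Returns true if a new group was created, false if there was nothing to move.
bool canonicalize(GroupTable &Groups, ToyFunction &F) {
  if (F.LooseAttrs.empty())
    return false;
  // Assumption: the next ID is one past the highest ID in the table.
  unsigned NextID = Groups.empty() ? 1 : Groups.rbegin()->first + 1;
  assert(Groups.count(NextID) == 0 && "new group ID must be unused");
  Groups[NextID] = F.LooseAttrs;
  F.GroupRefs.push_back(NextID);
  F.LooseAttrs.clear();
  return true;
}
```

With groups #1 and #2 already in the table and `alwaysinline` and `nounwind` written directly on the function, `canonicalize` leaves the function referencing #1, #2, and a new #3 that holds the two loose attributes.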

>   4. Do we really want the attribute references limited to a number?  Code will be more readable
>      if you can use actual names that indicate the intent.  For example:
> 
>         attrgroup #compile_options = { … }
>         void @foo attrgroup(#compile_options)
> 
The problem with this is it limits the number of attribute groups to a specific set -- compile options, non-compile options, etc. There could be many different attribute groups involved, especially during LTO. I realize that the names will be uniqued. But that just adds a number to the existing name. I also want to avoid partitioning of the attributes into arbitrary groups -- i.e., groups with specific names which imply their usage or type.

>   5. Can attributes be nested?  For example:
> 
>         attrgroup #1 = { foo, bar }
>         attrgroup #2 = { #1, baz }
> 
>      Might be nice.
> 
I'm not a big fan of this idea. This could open it up to circular attribute groups:

	attrgroup #1 = { foo, #2 }
	attrgroup #2 = { #1, bar }

which I'm opposed to on moral groups. ;-) A less compelling (but IMHO valid) argument is that it makes the internal representation of attributes that much more complex.
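
To give a feel for the extra checking that nesting would require, here is a minimal sketch of a cycle check over nested group references. It is a standalone toy, not part of the proposal: `NestedRefs`, `Mark`, and `hasCycle` are hypothetical names, and each group is reduced to the list of group IDs it mentions.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy model: group ID -> IDs of the groups it references.
using NestedRefs = std::map<unsigned, std::vector<unsigned>>;

enum class Mark { Unvisited, InProgress, Done };

// Depth-first search; reaching a group that is still InProgress means we
// followed references back to a group on the current path, i.e. a cycle.
static bool visit(unsigned ID, const NestedRefs &Refs,
                  std::map<unsigned, Mark> &State) {
  Mark &M = State[ID]; // value-initialized to Mark::Unvisited on first use
  if (M == Mark::InProgress)
    return true;
  if (M == Mark::Done)
    return false;
  M = Mark::InProgress;
  auto It = Refs.find(ID);
  if (It != Refs.end())
    for (unsigned Next : It->second)
      if (visit(Next, Refs, State))
        return true;
  M = Mark::Done;
  return false;
}

// Returns true if any chain of group references loops back on itself.
bool hasCycle(const NestedRefs &Refs) {
  std::map<unsigned, Mark> State;
  for (const auto &Entry : Refs)
    if (visit(Entry.first, Refs, State))
      return true;
  return false;
}
```

The circular pair above (#1 referencing #2 and #2 referencing #1) is reported as a cycle, while a one-way nesting such as #2 referencing #1 is not.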

>   6. Do we really need to specify the attrgroup keyword twice? (Once in the group definition and once in the use)
>      ISTM, that the hash-mark is enough to announce a group reference in the use.  For example:
> 
>         void @foo #1 alwaysinline #2 nounwind
> 
Looking at my example above, my syntax can get a bit wordy. How about this alternative representation?

	define void @foo() attrgroup(#1, #2, #3) { ret void }

I don't have a strong opinion though. You're correct that the hash-number combo unambiguously defines an attribute group's use. If others are amenable to this, I can drop the keyword here.

> In other words, I think something like the following might be nicer:
> 
> attribute_group := attributes <attrgroup_id> = { <attribute_list> }
> attrgroup_id    := #<id>
> attribute_list  := <attribute> ( <attribute>)*
> attribute       := <name> (= <value>)?
>                 | <attrgroup_id>
> 
> …
> 
> function_def    := <attribute_list> <result_type> @<id> ([argument_list]) <attribute_list>
> 
So something like this (no references inside of the 'attributes' statement allowed, cf. above)?

	attributes #1 = { noinline, alignstack=4 }
	attributes #2 = { "no-sse" }

	define void @foo() #1 #2 { ret void }

This seems reasonable to me.
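
For a function that references several groups, as `@foo` does here, the merge could look roughly like the sketch below. This is a toy model, not the planned implementation: `AttrMap`, `ValuedGroupTable`, and `mergeGroups` are hypothetical names, and letting a later reference win when two groups give the same attribute different values is purely an assumption, since the proposal does not say how conflicts are resolved.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model: attribute name -> value ("" for attributes without a value).
using AttrMap = std::map<std::string, std::string>;
using ValuedGroupTable = std::map<unsigned, AttrMap>;

// Merge the groups a function references, in the order they are written.
AttrMap mergeGroups(const ValuedGroupTable &Groups,
                    const std::vector<unsigned> &Refs) {
  AttrMap Merged;
  for (unsigned ID : Refs) {
    // at() throws std::out_of_range if the function names an unknown group.
    const AttrMap &Group = Groups.at(ID);
    for (const auto &KV : Group)
      Merged[KV.first] = KV.second; // assumption: later reference wins
  }
  return Merged;
}
```

Merging #1 and #2 from the example gives a single set containing `noinline`, `alignstack` with value 4, and `"no-sse"`.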

>> Target-Dependent Attributes in IR
>> ---------------------------------
>> 
>> The front-end is responsible for knowing which target-dependent options are 
>> interesting to the target. Target-dependent attributes are specified as strings,
>> which are understood by the target's back-end. E.g.:
>> 
>> attrgroup #0 = { "long-calls", "cpu=cortex-a8", "thumb" }
>> 
>> define void @func() attrgroup(#0) { ret void }
>> 
>> The ARM back-end is the only target that knows about these options and what to
>> do with them.
>> 
>> Some of the `cl::opt' options in the backend could move into attribute groups.
>> This will clean up the compiler.
>> 
> 
> Isn't calling these "target-dependent" a little artificial?  Surely there are many uses
> for string attributes one of which is for target-specific data.  I think organizing the
> proposal to add these new arbitrary string attributes and using the target-specific bits
> as examples will be clearer.
> 
It's a bit artificial. I basically want to draw a small distinction here: anything that is not target-specific, i.e. anything that could be used by all targets, will be defined in LangRef.html.
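
For the string attributes themselves, a back-end would have to pull a name and an optional value out of strings like "cpu=cortex-a8" or "thumb". A minimal sketch of that step follows, assuming the value starts after the first '='. `splitAttr` is a hypothetical helper, not an LLVM function, and the proposal does not fix the exact string format.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Split "name=value" at the first '='. A bare "name" yields an empty value.
std::pair<std::string, std::string> splitAttr(const std::string &S) {
  std::string::size_type Eq = S.find('=');
  if (Eq == std::string::npos)
    return {S, ""};
  return {S.substr(0, Eq), S.substr(Eq + 1)};
}
```

Under this assumption, "cpu=cortex-a8" splits into the name "cpu" and the value "cortex-a8", while "long-calls" and "thumb" are names with an empty value.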

>> Updating IR
>> -----------
>> 
>> The current attributes that are specified on functions will be moved into an
>> attribute group. The LLVM assembly reader will still honor those but when the
>> assembly file is emitted, those attributes will be output as an attribute group
>> by the assembly writer. As usual, LLVM 3.3 will be able to read and auto-upgrade
>> previous bitcode and `.ll' files.
>> 
>> Querying
>> --------
>> 
>> The attributes are attached to the function. It's therefore trivial to access
>> the attributes within the middle- and the back-ends. Here's an example of how
>> attributes are queried:
>> 
>> Attributes &A = F.getAttributes();
>> 
>> // Target-independent attribute query.
>> A.hasAttribute(Attributes::NoInline);
>> 
>> // Target-dependent attribute query.
>> A.hasAttribute("no-sse");
>> 
>> // Retrieving value of a target-independent attribute.
>> int Alignment = A.getIntValue(Attributes::Alignment);
>> 
>> // Retrieving value of a target-dependent attribute.
>> StringRef CPU = A.getStringValue("cpu");
> 
> Maybe some set attribute examples too?
> 
That would be done through the current AttrBuilder class:

	AttrBuilder B;

	// Add a target-independent attribute.
	B.addAttribute(Attributes::NoInline);

	// Add a target-dependent attribute.
	B.addAttribute("no-sse");

	// Create the attribute object.
	Attributes A = Attributes::get(Context, B);

> Overall, I think this is a nice addition!
> 
Thanks!

-bw