[LLVMdev] [RFC] Passing Options to Different Parts of the Compiler Using Attributes

Tue Nov 20 11:03:44 PST 2012

On Nov 13, 2012, at 12:20 AM, Bill Wendling wrote:

> IR Changes
> ----------
> 
> The attributes will be specified within the IR. This allows us to generate code
> that the user wants. This also has the advantage that it will no longer be
> necessary to specify all of the command line options when compiling the bit code
> (via 'llc' or 'clang'). E.g., '-mcpu=cortex-a8' will be an attribute and won't
> be required on llc's command line. However, explicit flags (like `-mcpu') on the
> llc command line will override flags specified in the module.
> 
> The core of this proposal is the idea of an "attribute group". As the name
> implies, it's a group of attributes that are then referenced by objects within
> the IR. An attribute group is a module-level object. The BNF of the syntax is:
> 
> attribute_group := attrgroup <attrgroup_id> = { <attribute_list> }
> attrgroup_id    := #<number>
> attribute_list  := <attribute> (, <attribute>)*
> attribute       := <name> (= <value>)?
> 
> To use an attribute group, an object references the attribute group's ID:
> 
> attribute_group_ref := attrgroup(<attrgroup_id>)
> 
> This is an example of an attribute group for a function that should always be
> inlined, has stack alignment of 4, and doesn't unwind:
> 
> attrgroup #1 = { alwaysinline, nounwind, alignstack=4 }
> 
> void @foo() attrgroup(#1) { ret void }
> 
> An object may refer to more than one attribute group. In that situation, the
> attributes are merged.
> 
> Attribute groups are important for keeping `.ll' files readable, because a lot
> of functions will use the same attributes. In the degenerate case of a `.ll'
> file that corresponds to a single `.c' file, the single `attrgroup' will capture
> the command line flags used to build that file.

A few comments on the new syntax:

   1. I think most folks will understand what 'attrgroup' means, but it is a little cryptic. 
      How about just 'attributes'?  The following reads easier to my eyes:

         attributes #1 = { alwaysinline, nounwind, alignstack=4 }
         void @foo() attributes(#1) { ret void }

   2. Are group references allowed in all attribute contexts (parameter, return value, function)?
      I think the answer should be yes.  Also, it might be worth considering using the same attribute
      list syntax in the existing attribute contexts and in the new attribute group definition (i.e. pick
      one of comma-separated vs. space-separated).  This way we have a consistent syntax for lists of
      attributes, and the main thing this proposal adds is a name for those attributes for later reference.

   3. Can attribute groups and single attributes be intermixed?
      For example:

         void @foo attrgroup(#1) alwaysinline attrgroup(#2) nounwind

   4. Do we really want the attribute references limited to a number?  Code will be more readable
      if you can use actual names that indicate the intent.  For example:

         attrgroup #compile_options = { … }
         void @foo attrgroup(#compile_options)

   5. Can attribute groups be nested?  For example:

         attrgroup #1 = { foo, bar }
         attrgroup #2 = { #1, baz }

      Might be nice.

   6. Do we really need to specify the attrgroup keyword twice (once in the group definition and once in the use)?
      ISTM that the hash mark is enough to announce a group reference in the use.  For example:

         void @foo #1 alwaysinline #2 nounwind

In other words, I think something like the following might be nicer:

attribute_group := attributes <attrgroup_id> = { <attribute_list> }
attrgroup_id    := #<id>
attribute_list  := <attribute> ( <attribute>)*
attribute       := <name> (= <value>)?
                 | <attrgroup_id>

…

function_def    := <attribute_list> <result_type> @<id> ([argument_list]) <attribute_list>
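
To make the nesting and intermixing questions concrete, here is a minimal sketch of how a use-site list could be flattened if groups were allowed to reference other groups. Nothing here is LLVM API: `GroupTable`, `expandGroup`, and `flattenAttributes` are hypothetical names, attributes are treated as opaque strings, "merging" is assumed to be a plain set union, and rejecting cyclic references is my assumption rather than anything the proposal states.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of a module's attribute groups (not LLVM API).
// Each group id (e.g. "#1") maps to its entries. An entry that starts
// with '#' refers to another group; anything else is a plain attribute.
using GroupTable = std::map<std::string, std::vector<std::string>>;

// Recursively expand group `id` into `out`. `active` holds the groups
// currently being expanded so that cyclic references can be rejected.
static void expandGroup(const GroupTable &groups, const std::string &id,
                        std::set<std::string> &active,
                        std::set<std::string> &out) {
  auto it = groups.find(id);
  if (it == groups.end())
    throw std::runtime_error("undefined attribute group " + id);
  if (!active.insert(id).second)
    throw std::runtime_error("cyclic attribute group " + id);
  for (const std::string &entry : it->second) {
    if (!entry.empty() && entry[0] == '#')
      expandGroup(groups, entry, active, out);
    else
      out.insert(entry);
  }
  active.erase(id);
}

// Flatten a use-site list such as { "#1", "alwaysinline", "#2", "nounwind" }
// into one merged set of plain attributes (assumed here to be a set union).
std::set<std::string> flattenAttributes(const GroupTable &groups,
                                        const std::vector<std::string> &uses) {
  std::set<std::string> out;
  for (const std::string &entry : uses) {
    if (!entry.empty() && entry[0] == '#') {
      std::set<std::string> active;
      expandGroup(groups, entry, active, out);
    } else {
      out.insert(entry);
    }
  }
  return out;
}
```

With the groups from item 5, flattening a function that uses `#2` plus a bare `nounwind` would give the union of `foo`, `bar`, `baz`, and `nounwind`. How conflicting values (say, two different `alignstack` settings) should merge is a separate question this sketch does not try to answer.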

> Target-Dependent Attributes in IR
> ---------------------------------
> 
> The front-end is responsible for knowing which target-dependent options are 
> interesting to the target. Target-dependent attributes are specified as strings,
> which are understood by the target's back-end. E.g.:
> 
> attrgroup #0 = { "long-calls", "cpu=cortex-a8", "thumb" }
> 
> define void @func() attrgroup(#0) { ret void }
> 
> The ARM back-end is the only target that knows about these options and what to
> do with them.
> 
> Some of the `cl::opt' options in the backend could move into attribute groups.
> This will clean up the compiler.
> 

Isn't calling these "target-dependent" a little artificial?  Surely there are many uses
for string attributes, only one of which is target-specific data.  I think the proposal
would be clearer if it were organized around adding these new arbitrary string attributes,
with the target-specific bits used as examples.

> Updating IR
> -----------
> 
> The current attributes that are specified on functions will be moved into an
> attribute group. The LLVM assembly reader will still honor those but when the
> assembly file is emitted, those attributes will be output as an attribute group
> by the assembly writer. As usual, LLVM 3.3 will be able to read and auto-upgrade
> previous bitcode and `.ll' files.
> 
> Querying
> --------
> 
> The attributes are attached to the function. It's therefore trivial to access
> the attributes within the middle- and the back-ends. Here's an example of how
> attributes are queried:
> 
> Attributes &A = F.getAttributes();
> 
> // Target-independent attribute query.
> A.hasAttribute(Attributes::NoInline);
> 
> // Target-dependent attribute query.
> A.hasAttribute("no-sse");
> 
> // Retrieving value of a target-independent attribute.
> int Alignment = A.getIntValue(Attributes::Alignment);
> 
> // Retrieving value of a target-dependent attribute.
> StringRef CPU = A.getStringValue("cpu");

Maybe some set attribute examples too?
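
Something along these lines is what I have in mind. This is a toy stand-in, not the LLVM `Attributes` class: `AttrSketch`, `setAttribute`, and `removeAttribute` are hypothetical names, and only `hasAttribute` and `getStringValue` mirror the query names used in the RFC's examples.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for the proposal's attribute container (NOT the LLVM class).
class AttrSketch {
  std::map<std::string, std::string> Values; // name -> value ("" if none)

public:
  // Add an attribute, or overwrite its value if it is already present.
  void setAttribute(const std::string &Name, const std::string &Value = "") {
    assert(!Name.empty() && "attribute name must be non-empty");
    Values[Name] = Value;
  }

  // Remove an attribute; removing an absent attribute is a no-op.
  void removeAttribute(const std::string &Name) { Values.erase(Name); }

  bool hasAttribute(const std::string &Name) const {
    return Values.count(Name) != 0;
  }

  // Returns "" when the attribute is absent or has no value.
  std::string getStringValue(const std::string &Name) const {
    auto It = Values.find(Name);
    return It == Values.end() ? std::string() : It->second;
  }
};
```

Usage would then mirror the queries above: `A.setAttribute("no-sse")` for a flag, `A.setAttribute("cpu", "cortex-a8")` for a valued attribute. Whether setters should mutate in place or return a new attribute object is a design choice the RFC would need to spell out.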

Overall, I think this is a nice addition!

--
Meador Inge
CodeSourcery / Mentor Embedded
http://www.mentor.com/embedded-software