[PATCH] Improved plugin/tool support by expanding an existing attribute

Aaron Ballman via cfe-commits cfe-commits at lists.llvm.org
Thu Feb 9 06:07:34 PST 2017


On Sat, Feb 4, 2017 at 8:26 AM, Marcwell Helpdesk via cfe-commits
<cfe-commits at lists.llvm.org> wrote:
> Many plugins/tools could benefit from having a generic way for communicating control directives directly from the source code to itself (via the AST) when acting, for example, as source code transformers, generators, collectors and the like. Attributes are a suitable way of doing this but most available attributes have predefined functionality and modifying the compiler for every plugin/tool is obviously not an option. There is however one undocumented but existing attribute that could be used for a generic solution if it was slightly modified and expanded - the annotate attribute.
>
> The current functionality of the annotate attribute encompass annotating declarations with arbitrary strings using the GNU spelling. The attached patch expands on this functionality by adding annotation of statements, C++11 spelling and documentation. With the patch applied most of the code could be annotated for the use by any Clang plugin or tool. For a more detailed description of the updated attribute, see patch for "AttrDocs.td".

I definitely agree with the premise for this work -- having a generic
way for Clang to pass attributed information down to LLVM IR is
desirable. However, I'm not convinced that the annotate attribute is
the correct approach over something like pluggable attributes. My
primary concerns stem from the fact that the annotate attribute has no
safe guards for feature collisions (you have to pick a unique string,
or else you collide with someone else's string, but there's nothing
that forces you to do this), has no mechanisms for accepting arguments
in a consistent fashion, and provides no way to check the semantics of
the attribute. It's basically a minimum viable option for lowering
attributed information down to LLVM IR, which is fine for things that
aren't in the user's face, but isn't a good general-purpose solution.

I do not have a problem with adding a C++11 spelling to the annotate
attribute for the cases we already support, and I definitely think we
should document the attribute. But I don't think it makes sense to add
it as a statement attribute (and the current patch also leaves out
type attributes).

Some quick thoughts on the patch:

> Index: include/clang/Basic/Attr.td
> ===================================================================
> --- include/clang/Basic/Attr.td (revision 293612)
> +++ include/clang/Basic/Attr.td (working copy)
> @@ -462,9 +462,10 @@
>  }
>
>  def Annotate : InheritableParamAttr {
> -  let Spellings = [GNU<"annotate">];
> +  let Spellings = [GNU<"annotate">,
> +                   CXX11<"", "annotate", 201612>];

This should be in the clang namespace and should not be given a third
argument (it's not a standards-based attribute).

>    let Args = [StringArgument<"Annotation">];
> -  let Documentation = [Undocumented];
> +  let Documentation = [AnnotateDocs];
>  }
>
>  def ARMInterrupt : InheritableAttr, TargetSpecificAttr<TargetARM> {
> Index: lib/Sema/SemaStmtAttr.cpp
> ===================================================================
> --- lib/Sema/SemaStmtAttr.cpp (revision 293612)
> +++ lib/Sema/SemaStmtAttr.cpp (working copy)
> @@ -23,6 +23,24 @@
>  using namespace clang;
>  using namespace sema;
>
> +static Attr *handleAnnotateAttr(Sema &S, Stmt *St, const AttributeList &A,
> +                                SourceRange Range) {
> +  // Assert string literal as the annotation's argument.
> +  StringRef Str;
> +  if (!S.checkStringLiteralArgumentAttr(A, 0, Str))
> +    return nullptr;
> +
> +  // Assert single argument
> +  if (A.getNumArgs() > 1) {
> +    S.Diag(A.getLoc(), diag::err_attribute_wrong_number_arguments)
> +           << A.getName() << 1;
> +    return nullptr;
> +  }
> +
> +  return ::new (S.Context) AnnotateAttr(A.getRange(), S.Context, Str,
> +                                        A.getAttributeSpellingListIndex());
> +}
> +
>  static Attr *handleFallThroughAttr(Sema &S, Stmt *St, const AttributeList &A,
>                                     SourceRange Range) {
>    FallThroughAttr Attr(A.getRange(), S.Context,
> @@ -273,6 +291,8 @@
>             diag::warn_unhandled_ms_attribute_ignored :
>             diag::warn_unknown_attribute_ignored) << A.getName();
>      return nullptr;
> +  case AttributeList::AT_Annotate:
> +    return handleAnnotateAttr(S, St, A, Range);
>    case AttributeList::AT_FallThrough:
>      return handleFallThroughAttr(S, St, A, Range);
>    case AttributeList::AT_LoopHint:
> Index: test/SemaCXX/attr-annotate.cpp
> ===================================================================
> --- test/SemaCXX/attr-annotate.cpp (nonexistent)
> +++ test/SemaCXX/attr-annotate.cpp (working copy)
> @@ -0,0 +1,17 @@
> +// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
> +
> +[[annotate("foo")]] void foo() {
> +  // C++11 decl annotations
> +  [[annotate("bar")]] int x;
> +  [[annotate(1)]] int y; // expected-error {{'annotate' attribute requires a string}}
> +  [[annotate("bar", 1)]] int z1; // expected-error {{'annotate' attribute takes one argument}}
> +  [[annotate()]] int z2; // expected-error {{'annotate' attribute takes one argument}}
> +  [[annotate]] int z3; // expected-error {{'annotate' attribute takes one argument}}
> +
> +  // C++11 NullStmt annotations
> +  [[annotate("bar")]];
> +  [[annotate(1)]]; // expected-error {{'annotate' attribute requires a string}}
> +  [[annotate("bar", 1)]]; // expected-error {{'annotate' attribute takes one argument}}
> +  [[annotate()]]; // expected-error {{'annotate' attribute takes one argument}}
> +  [[annotate]]; // expected-error {{'annotate' attribute takes one argument}}
> +}

There should be some codegen tests that show the attribute is properly
finding its way down to the LLVM IR. For instance, what would an
annotate attribute on a null statement get lowered to? How about a
switch statement?

> Index: include/clang/Basic/AttrDocs.td
> ===================================================================
> --- include/clang/Basic/AttrDocs.td (revision 293612)
> +++ include/clang/Basic/AttrDocs.td (working copy)
> @@ -2812,3 +2812,97 @@
>  ensure that this class cannot be subclassed.
>    }];
>  }
> +
> +def DocCatGeneric : DocumentationCategory<"Generic Attributes"> {
> +  let Content = [{
> +These attributes are suitable for passing arbitrary information from the code to
> +a plugin or other clang based tools such as source code transformers, generators,
> +collectors and similar. The attributes are not tied to any type of solution or
> +predefined functionality and may be used in a variety of situations when a generic
> +attribute suffice.
> +  }];
> +}
> +
> +def AnnotateDocs : Documentation {
> +  let Category = DocCatGeneric;
> +  let Content = [{
> +The ``annotate`` attribute may be used to annotate any type of declarations or
> +statements with arbitrary information that are propagated to the AST. The attribute
> +takes one string parameter containing the annotation value. The content, format
> +and the exact meaning of the value is up to the programmer to decide.
> +
> +C++11 spelling examples:
> +
> +.. code-block:: c++
> +
> +  class [[annotate("plain")]] Object
> +  { . . . };
> +
> +  int main(int argc, char* argv[])
> +  {
> +    [[annotate("local_var")]]   // Decl annotation
> +    int n = 1 ;
> +
> +    // Multiple Stmt annotations
> +    [[annotate("group-A"), annotate("opt-1:2")]]
> +    switch(n)
> +    { . . . }
> +    return 0;
> +  }
> +
> +Looking at the AST for the example above reveals that any annotated statements
> +are wrapped into an AttributedStmt node together with the annotation attributes
> +in comparison to annotated declarations where the annotation attributes are
> +attached directly beneath the declaration node.
> +
> +.. code-block:: text
> +
> +  TranslationUnitDecl 0x345fb20 <<invalid sloc>> <invalid sloc>
> +  | . . .
> +  |-CXXRecordDecl 0x34b9b58 <example.cpp:2:1, line:10:1> line:2:29 class Object definition
> +  | |-AnnotateAttr 0x34b9c10 <col:9, col:25> "plain"
> +  | |-CXXRecordDecl 0x34b9cb8 <col:1, col:29> col:29 implicit class Object
> +  | |-AccessSpecDecl 0x34b9d50 <line:4:3, col:9> col:3 public
> +  | ` . . .
> +  `-FunctionDecl 0x34ba240 <line:12:1, line:28:1> line:12:5 main 'int (int, char **)'
> +    |-ParmVarDecl 0x34ba020 <col:10, col:14> col:14 argc 'int'
> +    |-ParmVarDecl 0x34ba130 <col:20, col:31> col:26 argv 'char **':'char **'
> +    `-CompoundStmt 0x34ba7f0 <line:13:1, line:28:1>
> +      |-DeclStmt 0x34ba4d0 <line:15:2, col:12>
> +      | `-VarDecl 0x34ba400 <col:2, col:10> col:6 used n 'int' cinit
> +      |   |-IntegerLiteral 0x34ba4b0 <col:10> 'int' 1
> +      |   `-AnnotateAttr 0x34ba460 <line:14:4, col:24> "local_var"
> +      |-AttributedStmt 0x34ba790 <line:17:2, line:26:2>
> +      | |-AnnotateAttr 0x34ba750 <line:17:25, col:43> "opt-1:2"
> +      | |-AnnotateAttr 0x34ba770 <col:4, col:22> "group-A"
> +      | `-SwitchStmt 0x34ba608 <line:18:2, line:26:2>
> +      |  . . .
> +      `-ReturnStmt 0x34ba7d8 <line:27:2, col:9>
> +        `-IntegerLiteral 0x34ba7b8 <col:9> 'int' 0
> +
> +The GNU spelling, ``__attribute((annotate("str")))``, is limited to annotating
> +declarations but is at the same time a bit more relaxed on the allowed insertion
> +points compared to the C++11 spelling that needs to be defined prior to any
> +declaration or statement.
> +
> +In the example below the GNU-only insertion points are shown as __attr.
> +
> +.. code-block:: c++
> +
> +  void __attr function/method(int __attr n) __attr
> +  int __attr n = 1;
> +
> +(Note that it is also possible to annotate arbitrary expressions using the GCC
> +syntax ``int __builtin_annotation(int, "str")`` where the ``int`` is an integer
> +of any compiler supported bit-width and with the return value set to the same
> +as the parameter value. However, the annotation does not show up in the AST as
> +an AnnotateAttr but as an ordinary CallExpr).
> +
> +The annotations of variable, function and method declarations (and expressions)
> +are forwarded into IR during code generation using the LLVM annotation intrinsic
> +functions or by adding the information to a global annotation table. The
> +information added, in either case, are the annotation string, file name and
> +line number of the declaration or expression. These IR annotations may be used
> +by optimizers or other back-end tools.
> +  }];
> +}
>

~Aaron

>
> An example demonstratiing the use and syntax of the updated attribute:
>
>   class [[annotate("plain")]] Object
>   { . . . };
>
>   int main(int argc, char* argv[]) __attribute((annotate("entry-point")))
>   {
>     [[annotate("local_var")]]   // Decl annotation
>     int n = 1 ;
>
>     // Multiple Stmt annotations
>     [[annotate("group-A"), annotate("opt-1:2")]]
>     switch(n)
>     { . . . }
>     return 0;
>   }
>
> Cheers,
> Chris
>
> Modified:
>   include/clang/Basic/Attr.td
>   include/clang/Basic/AttrDocs.td
>   lib/Sema/SemaStmtAttr.cpp
> Added:
>   test/SemaCXX/attr-annotate.cpp
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
>


More information about the cfe-commits mailing list