[cfe-dev] [RFC] New attribute `annotate_type`

Martin Brænne via cfe-dev cfe-dev at lists.llvm.org
Mon Oct 18 06:05:51 PDT 2021


Sorry, this is going to be a long reply again with lots of individual
comments -- but to front-load the conclusion (which should also give some
detail for all the detailed comments):

If we can make [[clang::annotate]] work in type positions, then I would
definitely find that palatable and probably even preferable! I think it may
be non-trivial to implement though.

On Wed, 13 Oct 2021 at 17:46, Aaron Ballman <aaron at aaronballman.com> wrote:

> On Wed, Oct 13, 2021 at 9:19 AM Martin Brænne <mboehme at google.com> wrote:
> >
> >
> >
> > On Tue, 12 Oct 2021 at 22:16, Aaron Ballman <aaron at aaronballman.com>
> wrote:
> >>
> >> On Tue, Oct 12, 2021 at 7:49 AM Martin Brænne <mboehme at google.com>
> wrote:
> >> >
> >> >
> >> >
> >> > On Mon, 11 Oct 2021 at 18:54, Aaron Ballman <aaron at aaronballman.com>
> wrote:
> >> >>
> >> >> On Mon, Oct 11, 2021 at 10:28 AM Martin Brænne via cfe-dev
> >> >> <cfe-dev at lists.llvm.org> wrote:
> >> >> >
> >> >> > I would like to propose a new attribute `annotate_type` that would
> be analogous to the existing `annotate` attribute but for use on types. As
> for `annotate`, the typical use case would be to add annotations to be used
> by static analysis tools.
> >> >> >
> >> >> > I have a draft patch implementing this attribute (
> https://reviews.llvm.org/D111548), but before it's reviewed I wanted to
> solicit some feedback more broadly on whether people feel this makes sense.
> >> >>
> >> >> Thanks for the proposal! One question I have is whether we need an
> >> >> additional attribute for this instead of reworking
> [[clang::annotate]]
> >> >> so that it applies to either a type or a declaration. Is a separate
> >> >> attribute necessary because there may be unfortunate problems with
> >> >> using __attribute__((annotate)) as the spelling?
> >> >
> >> >
> >> > Yes, that's exactly the problem. Today, the GNU spelling can be
> applied both before and after the type name:
> >> >
> >> >   __attribute__((annotate("foo"))) int i1;
> >> >   int __attribute__((annotate("foo"))) i2;
> >> >
> >> > We would need to reinterpret the second variant to refer (or
> "appertain") to the decl-specifier-seq, not to the declaration (which is,
> IIUC, what the C++ standard would prescribe if this was a C++ attribute).
> But as it's possible that this variant is already being used "in the wild"
> with the intent that it should refer to the declaration, I think this isn't
> something we can change.
> >>
> >> Agreed, thanks for the confirmation that this was the reason why we
> >> need a second attribute.
> >>
> >> > Hence, the practical solution seems to be to introduce a separate
> attribute for types, for which it is then unambiguous what it should apply
> to.
> >>
> >> Another possible solution would be to not support
> >> __attribute__((annotate)) as a type attribute (e.g., you can use
> >> [[clang::annotate]] as a declaration or a type attribute, but we would
> >> restrict __attribute__((annotate)) to only ever be a declaration
> >> attribute.).
> >
> > I guess that would be an option too, though I think this would require
> some changes to Clang to allow C++ attributes to be used in this position:
> >
> > int [[clang::annotate]] foo;
>
> Clang already supports type attributes in that position today. e.g.,
> https://godbolt.org/z/rTxoWWfdb


Quoting the godbolt example to save having to follow the link:

int * [[clang::address_space(126)]] ip = nullptr;

This is different from my example above though because the attribute comes
after the `*`, which makes it part of the declarator or, more specifically,
the ptr-operator:

http://eel.is/c++draft/dcl.dcl#nt:ptr-operator

In contrast, with the example `int [[clang::annotate]] foo` that I was
referring to, the attribute is part of the decl-specifier-seq
(specifically, it's an attribute on the defining-type-specifier, as you
also discuss below).

The position Clang doesn't currently support is on decl specifiers.
> e.g., https://godbolt.org/z/rG1aP7Tz5


Agreed.

> See also discussion below.
> >
> >> > As a side note, IIUC, Clang currently more generally lumps GNU
> attributes that occur in these two positions together, right? I'm looking
> at Parser::ParseSimpleDeclaration(), specifically this line:
> >> >
> >> > DS.takeAttributesFrom(Attrs);
> >> >
> >> > For C++ attributes, I believe this would not be correct, but that's
> not (currently) a concern because AFAICT Clang currently doesn't allow C++
> attributes to occur after a decl-specifier-seq at all (because, I presume,
> Clang doesn't yet support any C++ attributes that can occur in this
> position).
> >>
> >> We currently don't support any attributes that appertain to the
> >> declaration specifiers (in any spelling mode).
> >
> >
> > But one of the enumerators in TypeAttrLocation is this:
> >
> >   /// The attribute is in the decl-specifier-seq.
> >   TAL_DeclSpec,
> >
> > And it looks as if it is used?
>
> Ah, yeah, this is confusing stuff. decl-specifier-seq can have
> attributes (http://eel.is/c++draft/dcl.spec#nt:decl-specifier-seq) and
> TAL_DeclSpec tracks that. However, we do not currently support
> attributes on *all kinds of decl specifiers*. e.g., [[supported #1]]
> static [[not supported #2]] int [[supported #3]] i [[supported #4]];
>

Just to make sure we're on the same page: Not all of these attributes are
part of the decl-specfier-seq -- right?

#1 is the attribute-specifier-seq in a simple-declaration:
http://eel.is/c++draft/dcl.dcl#nt:simple-declaration

#4 is the attribute-specifier-seq in a noptr-declarator:
http://eel.is/c++draft/dcl.dcl#nt:noptr-declarator

We support the ones that are in the declaration position (#1), the
> type position (#2),


I think you mean #3?


> and the identifier position (#4) but do not
> support ones on storage class specifiers, function specifiers, or
> other kinds of decl specifiers other than a defining-type-specifier.
> e.g., https://godbolt.org/z/8sq5Wo8sT


I think [[]] is treated differently from known attribute (and unknown
attributes may be treated differently yet again?). Anyway, I think we get
into this more deeply below anyway.

>> GNU attributes will
> >> "slide" to the type or the declaration based on the name of the
> >> attribute when doing semantic processing for them (see the
> >> moveAttrFromListToList() calls in SemaType.cpp for some examples of
> >> this).
> >
> >
> > Thanks, that's an interesting pointer.
> >
> > I wondering now though: What do attributes attach to when the "sliding"
> logic doesn't apply to them (because it doesn't know them)?
>
> Heh, now you're seeing why GNU attributes were replaced by what was
> standardized in C and C++. The answer is: we don't know what to do and
> this can sometimes even impact our ability to parse properly (for
> example, if it's a type attribute in C, it may be the difference
> between ANSI "implicit int" being allowed or not). However, in Clang,
> we do not retain unknown attributes in the AST, so we parse as best we
> can and then drop what we parsed onto the floor if the attribute is
> unknown.
>

Hm. Got it.

This does seem to make a pretty strong argument for what you're suggesting,
namely that we shouldn't support the GNU syntax of the `annotate` attribute
for annotations on types and instead should support only the C++ syntax
(where, thankfully, the standard makes it pretty clear what the attribute
appertains to).

> Maybe I should clarify my use case (which will probably benefit the whole
> discusssion). What I'm looking to be able to do is to attach attributes to
> different levels of a potentially multi-level pointer. For example:
> >
> > int __attribute__((annotate_type("A"))) *
> __attribute__((annotate_type("B"))) * __attribute__((annotate_type("C")))
> pp;
> >
> > The "A" annotation is what I mean by being able to attach an attribute
> to the decl specifiers (maybe my terminology isn't accurate here?).
> >
> > One use case for this is being able to specify which of the levels a
> nullability annotation refers to and not just have it always implicitly
> refer to the outermost pointer, as I believe the existing nullability
> annotations do. (See the Clang fork here for an example that does something
> very similar: https://github.com/sampsyo/quala)
> >
> > Staying with the nullability example, it's important to be able to
> attach an annotation to the base type itself -- imagine the `int` in the
> example above was instead a `unique_ptr`.
> >
> > This does seem to be possible to some extent with Clang today because if
> I feed the example above to the code in my draft change, I get
> `AttributedTypeLoc`s for the types `int`, `int *`, and `int **`, with
> annotations "A", "B", and "C" respectively.
>
> It is possible today, but I note that we might have a bug with the
> [[]] implementation: https://godbolt.org/z/vr81KzG33 -- I'd have to
> dig around more to see what's going on there.
>

Yes, I think this example makes the problem pretty clear, namely that C++
attributes can't be applied to the defining-type-specifier today.

There's an additional wrinkle here in that, IIUC, the `address_space`
attribute only makes sense to apply to a ptr-declarator, not a
defining-type-specifier. I think the code that is causing the attribute to
be rejected here isn't specific to `address_space` though -- I think it's
just the general code we discuss below.


> >> > This also means that if we want to use the proposed `annotate_type`
> attribute in this position, we have to use the GNU spelling.
> >>
> >> I don't think that's correct. We certainly support type attributes
> >> using a C++ spelling already.
> >
> > I think this only works though if the attribute is applied to a
> declarator, but not to the decl specifiers? Here's an excerpt from the test
> in my draft change (https://reviews.llvm.org/D111548):
> >
> >   int __attribute__((annotate_type("bar"))) w;
> >   int [[clang::annotate_type("bar")]] w2; // expected-error
> {{'annotate_type' attribute cannot be applied to types}}
> >
> >   int *__attribute__((annotate_type("bar"))) x;
> >   int *[[clang::annotate_type("bar")]] x2;
> >
> > The error seems to be coming from the following chunk of code in
> Parser::ParseDeclarationSpecifiers(), which is specific to C++ attributes:
> >
> >         // Reject C++11 attributes that appertain to decl specifiers as
> >         // we don't support any C++11 attributes that appertain to decl
> >         // specifiers. This also conforms to what g++ 4.8 is doing.
> >         ProhibitCXX11Attributes(attrs,
> diag::err_attribute_not_type_attr);
>
> Oh hey, this is the possible bug I just noted above. :-D I think we
> should not be rejecting if the decl specifier is a defining type
> specifier because that has utility,


Agree -- I'd like to add support for this in Clang.

I suspect the reason this isn't supported today is because none of the
existing type attributes need to be applied to the
declaring-type-specifier. As noted above, `address_space` only really makes
sense on ptr-declarators. And without having checked, I suspect the other
attributes also refer to properties of pointers (which is where I think
most of the practical reasons for attributing types comes from).

Having said that, it would be great if the nullability attributes could,
for example, be applied to a unique_ptr, but it seems that this isn't the
case (https://godbolt.org/z/qEn68xMz9).


> but I'd have to go stare at the
> standard a bit more to be sure. Declarator parsing is not the easiest
> thing to reason about.
>

Let me know what you conclude.

>From what I've seen in the standard (see links that you and I posted
above), I think it's pretty clear where this is supposed to slot into the
grammar. But I assume what you're referring to is more the problem of how
to make this parse unambiguously in Clang?

>> > This is an unfortunate niggle, but teaching Clang to correctly process
> C++ attributes in this position seems like a non-trivial undertaking
> because it would require us to treat GNU spellings and C++ spellings
> differently in Parser::ParseSimpleDeclaration(), which seems like it would
> be a bit of a headache.
> >>
> >> I don't think we'll have to go down that route, hopefully.
> >>
> >> >> What kind of type semantic impacts does this attribute have on the
> >> >> type identity for things like overloading, name mangling, type
> >> >> conversion, etc?
> >> >
> >> > Good point. The intent is that the attribute should have no impact.
> Anything else would be pretty problematic from a portability point of view
> because other compilers will simply ignore the attribute -- so if it had an
> effect on the type system in Clang, that could cause other compilers to
> generate different code.
> >>
> >> Vendor-specific attributes are inherently not portable or safe to
> >> ignore.
> >
> >
> > But for C++ attributes at least, the standard says that compilers should
> do exactly this (i.e. ignore unknown attributes)?
>
> Same for C. The standard says that vendor-specific attributes are
> fully implementation-defined (so they're permitted to have whatever
> semantics the implementation wants, including necessary side effects).
> The standard also says that if the implementation doesn't know about
> an attribute, the implementation should ignore the attribute. The
> combination of these rules is what makes vendor-specific attributes
> inherently nonportable or safe to ignore.
>
> As an example: https://godbolt.org/z/vsv3Wo6En


Thanks, makes sense!

>> Generally speaking, if a type attribute has no type semantics,
> >> it probably shouldn't be a type attribute. However, because this
> >> attribute doesn't really have any semantics beyond "keep this
> >> information associated with the type", it's a bit more complex.
> >> Consider:
> >>
> >> void func(int i);
> >> void func(int [[clang::annotate("hoo boy")]] i);
> >>
> >> I can see an argument that this is not a valid overload set because
> >> both functions accept an integer.
> >
> >
> > I think nullability attributes are a good analogy here (and essentially
> "prior art"). Here's an example:
> >
> > void func(int* i) {}
> > void func(int* _Nonnull i) {}
> >
> > (https://godbolt.org/z/sP6za4WEE)
> >
> > Clang doesn't consider these to be overloads and instead complains about
> a redefinition.
>
> I think modeling on the nullability attributes may not be a bad idea,
> but note that those are surfaced to the user as keywords and not
> attributes at all. I believe that was done specifically because the
> nullability attributes had type semantics, but I could be remembering
> incorrectly.
>

Do you think you can dig out this discussion again?

I don't think it's really important though -- for the [[clang::annotate]]
case (assuming that's what we'll go with in preference to `annotate_type`),
we certainly want to specify that the annotation does not create a new type.


> > I can certainly see how for different use cases you _would_ want an
> attribute to create a different type -- but I would say those would then
> require a second, separate attribute.
> >
> >>
> >> However:
> >>
> >> void func(int i) [[clang::annotate("ABI=v1")]];
> >> void func(int i) [[clang::annotate("ABI=v2")]];
> >>
> >> I can see an argument that this is a valid overload set because
> >> something else (the linker, whatever) is using that annotated
> >> information to provide additional semantics beyond what the compiler
> >> cares about.
> >
> >
> > As a data point that my be useful, I looked at what Clang does when I
> move those attributes to a position where they are declaration attributes
> (so that I can run the current Clang on them), i.e.:
> >
> >  [[clang::annotate("ABI=v1")]] void func(int i) {}
> >  [[clang::annotate("ABI=v2")]] void func(int i) {}
> >
> > (https://godbolt.org/z/fo5vrso6d)
> >
> > Again, Clang doesn't consider these to be overloads and instead
> complains about a redefinition.
>
> FWIW, that's because declaration attributes do not impact the type of
> the function. We do have a declaration attribute that says "this
> declaration is allowed to be overloaded" in C, but that's subtly a
> different thing.
>
> > So it seems that the existing semantics of the `annotate` attribute are
> similar to what I'm proposing for the `annotate_type` attribute.
> >
> >> >> Also, one problem here is -- [[clang::annotate]] isn't just used for
> >> >> the static analyzer, it's also something that can be used by backend
> >> >> tools (for example, when using attribute plugins).
> >> >
> >> >
> >> > It's good that you bring this up because I'm actually also thinking
> about extending ParsedAttrInfo to handle type attributes (with a new
> handleTypeAttribute() function as an analog to handleDeclAttribute()). The
> proposed annotate_type attribute would then also be a natural candidate for
> an attribute that the new handleTypeAttribute() could add to types.
> >>
> >> Oh, neat! Unifying type/stmt/decl attributes more is definitely a great
> goal.
> >>
> >> > Is this the concern that you had here or were you thinking about
> something different?
> >>
> >> Not really, it was more along the lines of lowering to LLVM IR (below)
> >> so that the backend can also take advantage of this.
> >>
> >> >> Do we need to
> >> >> consider what information should be lowered to LLVM IR when this new
> >> >> attribute appears on a type?
> >> >>
> >> >> Thanks!
> >> >
> >> > My intent was that this should have no effect on code generation.
> >> >
> >> > Lowering annotations on declarations to LLVM IR has some obvious use
> cases, but for annotations on types it's less obvious how it would be
> useful to lower these to IR (and also less obvious to me how one would
> represent these). (Did you have anything specific in mind here?)
> >>
> >> LLVM currently has the ability to describe some types (e.g.,
> >> `%struct.X = type { i32 }`), so I was envisioning attaching this
> >> attribute to the type (either as an LLVM attribute or metadata, I have
> >> no idea what the distinction really is between them). LLVM currently
> >> supports plugins as does Clang, and Clang's support for attributes in
> >> plugins still strongly encourages users to use existing attributes to
> >> surface functionality, so I can imagine users wanting to use the
> >> annotate type attribute in the same way they already use the annotate
> >> declaration attribute for this sort of collusion. However, I don't
> >> think the codegen needs to happen up front -- mostly trying to
> >> understand the shape of the design still.
> >
> >
> > Makes sense.
> >
> > Sorry that this has become such a long email, but I hope I've been able
> to give you a clearer picture of what I'm trying to achieve.
>
> No worries, I appreciate the detailed discussion of the design!
>
> I think we should double-check that I'm correct about the bug with
> Clang accepting:
>
> static int [[]] i;
>
> but rejecting:
>
> int [[]] i;
>

I don't think that's the distinction:

https://godbolt.org/z/MWcojK8n4

(Noting that it accepts both of these.)

Rather, this seems to be a case of [[]] being treated specially (probably
earlier in the parser) and therefore being accepted in this position, while
named, known attributes are not accepted here.


> If we're agreed that this is a bug, then it seems like once we fix
> that bug, you could support [[clang::annotate]] in type positions
> (with no type semantics) as well as declarations, but only if you were
> willing to define __attribute__((annotate)) as applying to
> declarations when there's an ambiguity between type vs decl. Would
> that be a palatable approach for you?
>

Yes, I think that would definitely work for me, and there would be an
attraction in not having to introduce an additional `annotate_type`
attribute that otherwise works mostly the same as `annotate`.

I think making this work may not be as simple as lifting the restriction in
Parser::ParseDeclarationSpecifiers() that we discussed above. I think
there's another issue that I touched on previously (
https://lists.llvm.org/pipermail/cfe-dev/2021-October/069094.html), namely
that Clang currently lumps attributes for the declaration and the
decl-specifier-seq together.

This happens in Parser::ParseSimpleDeclaration(), specifically this line
<
https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParseDecl.cpp#L1818
>
:

DS.takeAttributesFrom(Attrs);

To quote the conclusion I reached at the time:

For C++ attributes, I believe this would not be correct, but that's not
> (currently) a concern because AFAICT Clang currently doesn't allow C++
> attributes to occur after a decl-specifier-seq at all (because, I presume,
> Clang doesn't yet support any C++ attributes that can occur in this
> position). This also means that if we want to use the proposed
> `annotate_type` attribute in this position, we have to use the GNU
> spelling. This is an unfortunate niggle, but teaching Clang to correctly
> process C++ attributes in this position seems like a non-trivial
> undertaking because it would require us to treat GNU spellings and C++
> spellings differently in Parser::ParseSimpleDeclaration(), which seems like
> it would be a bit of a headache.


And I believe this is still true?

Do you have any other ideas on what we could do about this?

Thanks,

Martin

~Aaron
>
> >
> > Cheers,
> >
> > Martin
> >
> >>
> >>
> >> ~Aaron
> >>
> >> > This could actually make another case for why we have two different
> attributes, because that would make the distinction clearer that we're
> doing code generation for one but not the other.
> >> >
> >> > Thanks,
> >> >
> >> > Martin
> >> >
> >> >>
> >> >> ~Aaron
> >> >>
> >> >> >
> >> >> > Thanks,
> >> >> >
> >> >> > Martin
> >> >> >
> >> >> > --
> >> >> >
> >> >> > Martin Brænne
> >> >> >
> >> >> > Software Engineer
> >> >> >
> >> >> > mboehme at google.com
> >> >> >
> >> >> >
> >> >> > Google Germany GmbH
> >> >> > Erika-Mann-Straße 33
> >> >> > 80363 München
> >> >> >
> >> >> > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
> >> >> >
> >> >> > Registergericht und -nummer: Hamburg, HRB 86891
> >> >> >
> >> >> > Sitz der Gesellschaft: Hamburg
> >> >> >
> >> >> >
> >> >> > Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
> sind, leiten Sie diese bitte nicht weiter, informieren Sie den Absender und
> löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
> >> >> >
> >> >> >
> >> >> >
> >> >> > This e-mail is confidential. If you are not the right addressee
> please do not forward it, please inform the sender, and please erase this
> e-mail including any attachments. Thanks.
> >> >> >
> >> >> > _______________________________________________
> >> >> > cfe-dev mailing list
> >> >> > cfe-dev at lists.llvm.org
> >> >> > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
> >> >
> >> >
> >> >
> >> > --
> >> >
> >> > Martin Brænne
> >> >
> >> > Software Engineer
> >> >
> >> > mboehme at google.com
> >> >
> >> >
> >> > Google Germany GmbH
> >> > Erika-Mann-Straße 33
> >> > 80363 München
> >> >
> >> > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
> >> >
> >> > Registergericht und -nummer: Hamburg, HRB 86891
> >> >
> >> > Sitz der Gesellschaft: Hamburg
> >> >
> >> >
> >> > Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
> sind, leiten Sie diese bitte nicht weiter, informieren Sie den Absender und
> löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
> >> >
> >> >
> >> >
> >> > This e-mail is confidential. If you are not the right addressee
> please do not forward it, please inform the sender, and please erase this
> e-mail including any attachments. Thanks.
> >
> >
> >
> > --
> >
> > Martin Brænne
> >
> > Software Engineer
> >
> > mboehme at google.com
> >
> >
> > Google Germany GmbH
> > Erika-Mann-Straße 33
> > 80363 München
> >
> > Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
> >
> > Registergericht und -nummer: Hamburg, HRB 86891
> >
> > Sitz der Gesellschaft: Hamburg
> >
> >
> > Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat sind,
> leiten Sie diese bitte nicht weiter, informieren Sie den Absender und
> löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
> >
> >
> >
> > This e-mail is confidential. If you are not the right addressee please
> do not forward it, please inform the sender, and please erase this e-mail
> including any attachments. Thanks.
>


-- 

Martin *Brænne*

Software Engineer

mboehme at google.com


Google Germany GmbH
Erika-Mann-Straße 33
80363 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat sind,
leiten Sie diese bitte nicht weiter, informieren Sie den Absender und
löschen Sie die E-Mail und alle Anhänge. Vielen Dank.



This e-mail is confidential. If you are not the right addressee please do
not forward it, please inform the sender, and please erase this e-mail
including any attachments. Thanks.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20211018/be61dc69/attachment-0001.html>


More information about the cfe-dev mailing list