[PATCH] D111548: [Clang] Add the `annotate_type` attribute

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed May 4 12:25:35 PDT 2022


aaron.ballman added a subscriber: rjmccall.
aaron.ballman added inline comments.


================
Comment at: clang/test/SemaCXX/annotate-type.cpp:2
+// RUN: %clang_cc1 %s -std=c++17 -fsyntax-only -verify
+
+struct S1 {
----------------
mboehme wrote:
> aaron.ballman wrote:
> > mboehme wrote:
> > > aaron.ballman wrote:
> > > > mboehme wrote:
> > > > > aaron.ballman wrote:
> > > > > > mboehme wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > mboehme wrote:
> > > > > > > > > aaron.ballman wrote:
> > > > > > > > > > Can you also add some test coverage for applying the attribute to a declaration instead of a type or not giving it any arguments? Also, should test arguments which are not a constant expression.
> > > > > > > > > I've added tests as you suggested, though I put most of them in Sema/annotate-type.c, as they're not specific to C++.
> > > > > > > > > 
> > > > > > > > > Thanks for prompting me to do this -- the tests caused me to notice and fix a number of bugs.
> > > > > > > > > 
> > > > > > > > > I haven't managed to diagnose the case when the attribute appears at the beginning of the declaration. It seems to me that, at the point where I've added the check, this case is indistinguishable from an attribute that appears on the type. In both cases, the `TAL` is `TAL_DeclSpec`, and the attribute is contained in `DeclSpec::getAttributes()`. This is because `Parser::ParseSimpleDeclaration` forwards the declaration attributes to the `DeclSpec`:
> > > > > > > > > 
> > > > > > > > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Parse/ParseDecl.cpp#L1851
> > > > > > > > > 
> > > > > > > > > I believe if I wanted to prohibit this case, I would need to add a check to `Parser::ParseStatementOrDeclaration` and prohibit any occurrences of `annotate_type` there. However, this seems the wrong place to do this because it doesn't contain any specific processing for other attributes.
> > > > > > > > > 
> > > > > > > > > I've noticed that Clang also doesn't prohibit other type attributes (even ones with C++ 11 syntax) from being applied to declarations. For example: https://godbolt.org/z/Yj1zWY7nn
> > > > > > > > > 
> > > > > > > > > How do you think I should proceed here? I think the underlying issue is that Clang doesn't always distinguish cleanly between declaration attributes and type attributes, and fixing this might be nontrivial.
> > > > > > > > > How do you think I should proceed here? I think the underlying issue is that Clang doesn't always distinguish cleanly between declaration attributes and type attributes, and fixing this might be nontrivial.
> > > > > > > > 
> > > > > > > > This is a general issue with attribute processing. I would imagine that SemaDeclAttr.cpp should be able to diagnose that case when the attribute only applies to types and not declarations, but it'd take some investigation for me to be sure.
> > > > > > > > 
> > > > > > > > Because this issue isn't new to your situation, I'd recommend filing an issue about the general problem and then we can solve that later.
> > > > > > > I've done some more investigation myself, and I think I've come up with a solution; actually, two potential solutions. I have draft patches for both of these; I wanted to run these by you here first, so I haven't opened a bug yet.
> > > > > > > 
> > > > > > > I'd appreciate your input on how you'd prefer me to proceed with this. I do think it makes sense to do this work now because otherwise, people will start putting `annotate_type` in places where it doesn't belong, and then we'll have to keep supporting that in the future.
> > > > > > > 
> > > > > > > **My preferred solution**
> > > > > > > 
> > > > > > > https://reviews.llvm.org/D124081
> > > > > > > 
> > > > > > > This adds a function `DiagnoseCXX11NonDeclAttrs()` and calls it in all places where we should reject attributes that can't be put on declarations. (As part of this work, I noticed that `gsl::suppress` should be a `DeclOrStmtAttr`, not just a `StmtAttr`.)
> > > > > > > 
> > > > > > > Advantages:
> > > > > > > - Not very invasive.
> > > > > > > Disadvantages:
> > > > > > > - Need to make sure we call `DiagnoseCXX11NonDeclAttrs()` everywhere that non-declaration attributes should be rejected. (Not sure if I've identified all of those places yet?)
> > > > > > > 
> > > > > > > **Alternative solution**
> > > > > > > 
> > > > > > > https://reviews.llvm.org/D124083
> > > > > > > 
> > > > > > > This adds an `appertainsTo` parameter to `ParseCXX11Attributes()` and other "parse attributes" functions that call it. This parameter specifies whether the attributes in the place where they're currently being parsed appertain to a declaration, statement or type. If `ParseCXX11Attributes()` encounters an attribute that is not compatible with `appertainsTo`, it outputs a diagnostic.
> > > > > > > 
> > > > > > > Advantages:
> > > > > > > - Every call to `ParseCXX11Attributes()` _has_ to specify `appertainsTo` -- so there's no risk of missing some case where we have to output diagnostics
> > > > > > > Disadvantages:
> > > > > > > - This change is _much_ more invasive.
> > > > > > > - It's not always clear what value to specify for `appertainsTo`. The poster child for this is `Parser::ParseStatementOrDeclaration()`: As the name says, we don't yet know whether we're parsing a statement or a declaration. As a result, we need to be able to specify `AppertainsToUnknown` and would need to add additional logic deeper in the call stack to produce diagnostics once we know for sure whether we're dealing with a statement or declaration. (This isn't yet implemented.) This makes the solution less elegant that it initially seems.
> > > > > > > - There's a tension between this new functionality and the existing `ParsedAttr::diagnoseAppertainsTo()`. However, we unfortunately can't extend the existing `ParsedAttr::diagnoseAppertainsTo()` because it is called from `ProcessDeclAttribute()` and (despite the name) that function is also processes some legitimate type attributes.
> > > > > > > I do think it makes sense to do this work now because otherwise, people will start putting annotate_type in places where it doesn't belong, and then we'll have to keep supporting that in the future.
> > > > > > 
> > > > > > Thank you for working on this now, I agree that'd be good to avoid.
> > > > > > 
> > > > > > My concern with both of those approaches is that they impact *parsing* rather than *semantics* and that seems wrong. As far as the parser is concerned, attributes really only exist as one particular form of syntax or another, but it shouldn't care about whether an attribute is allowed to appertain to a particular thing or not. e.g., an attribute list can either appear somewhere in the syntax or it can't, but what particular attributes are specified do not matter at parsing time. Parsing determine what the attribute is associated with (a declaration, a type, a statement, etc). After we've finished parsing the attributes in the list, we convert the attributes into their semantic form and that's the time at which we care about whether the attribute may be written on the thing it appertains to.
> > > > > > 
> > > > > > I think we handle all of the parsing correctly today already, but the issue is on the semantic side of things. An attribute at the start of a line can only be one of two things: a statement attribute or a declaration attribute, it can never be a type attribute. So what I think should happen is that `ProcessDeclAttribute()` needs to get smarter when it's given an attribute that can only appertain to a type. I don't think we should have to touch the parser at all, especially because you can specify multiple attributes in one list, as in:
> > > > > > ```
> > > > > > void func() {
> > > > > >   [[decl_attr, type_attr]] int foo;
> > > > > > }
> > > > > > ```
> > > > > > and we'd want to diagnose `type_attr` but still allow `decl_attr`.
> > > > > > I think we handle all of the parsing correctly today already
> > > > > 
> > > > > I'm not sure about this. For example, take a look at `Parser::ParseSimpleDeclaration`, which does this at the end:
> > > > > 
> > > > > ```
> > > > > DS.takeAttributesFrom(Attrs);
> > > > > ```
> > > > > 
> > > > > Here, `DS` is a `ParsingDeclSpec`, and `Attrs` are the attributes that were seen on the declaration (I believe).
> > > > > 
> > > > > If my understanding of the code is correct (which it very well may not be), this is putting the attributes that were seen on the declaration in the same place (`DeclSpec::Attrs`) that will later receive the attributes seen on the decl-specifier-seq. Once we've put them there, it is no longer possible to tell whether those attributes were seen on the declaration (where type attributes are not allowed) or on the decl-specifier-seq (where type attributes _are_ allowed).
> > > > > 
> > > > > (This is why https://reviews.llvm.org/D124081 adds a `DiagnoseCXX11NonDeclAttrs(Attrs)` just before the `DS.takeAttributesFrom(Attrs)` call -- because this is the last point at which we haven't lumped declaration and decl-specifier-seq attributes together.)
> > > > > 
> > > > > `ProcessDeclAttribute()`, when called through `Sema::ProcessDeclAttributes()` (which I think is the relevant call stack), only sees attributes after they have been put in `DeclSpec::Attrs`. (It therefore sees, and ignores, all of the type attributes that were put on the decl-specifier-seq.) So I think it isn't possible to tell there whether we put a type attribute on a declaration.
> > > > > 
> > > > > It might be possible to fix this by giving `DeclSpec` two lists of attributes, one containing the attributes for the declaration, and another containing the attributes for the decl-specifier-seq itself. However, this is a pretty invasive change, as everything that consumes attributes from the `DeclSpec` potentially needs to be changed (I tried and shied away from it).
> > > > > 
> > > > > Am I misunderstanding something about the code? If not, what are your thoughts on how best to proceed?
> > > > > Am I misunderstanding something about the code? If not, what are your thoughts on how best to proceed?
> > > > 
> > > > I think your understanding is correct and I was imprecise. I meant that we generally parse the `[[]]` in all the expected locations in the parser. However, I think you're correct that the bits which determine what chunks in a declaration may need to be updated. That code largely exists because GNU-style attributes "slide" sometimes to other chunks, but that's not the case for [[]]-style attributes. It sounds like we are likely lacking expressivity for the `[[]]` type attributes in the hookup between what we parsed and when we form the semantic attribute because we need to convert the parsed type into a real type.
> > > > It sounds like we are likely lacking expressivity for the [[]] type attributes in the hookup between what we parsed and when we form the semantic attribute because we need to convert the parsed type into a real type.
> > > 
> > > OK, good, sounds like we're on the same page!
> > > 
> > > I'll look at what we can do to retain the information of whether an attribute was added to the declaration or the decl-specifier-seq, so that Sema can then issue the correct diagnostics. I'll look at changing `DeclSpec` so that it retains not one, but two lists of attributes, for the declaration and the decl-specifier-seq respectively. There are likely to be some devils in the details -- `DeclSpec` allows direct write access to its `Attrs` member variable through the non-const version of `getAttributes()`, and there are quite a few places in the code that call this. Does this seem like the right direction to you in principle though?
> > > 
> > > Apart from all of this, are there any outstanding issues that you see with this change? It does of course make sense to prohibit type attributes on declarations before I land this change, but I wanted to check whether there's anything else that would need to happen here.
> > > Does this seem like the right direction to you in principle though?
> > 
> > Maybe. In terms of the standards concepts, a `decl-specifier-seq` can have attributes associated with it (https://eel.is/c++draft/dcl.dcl#dcl.spec.general-1.sentence-2) and so from that perspective, a `DeclSpec` seems like it should have only one list of attributes (those for the entire sequence of declaration specifiers).  However, GNU-style attributes "slide" around to other things, so it's possible that the behavior in `ParseSimpleDeclaration()` is correct for that particular syntax but incorrect for `[[]]` style attributes. So, depending on which devilish details are learned while working on this, it may turn out that the best approach is a secondary list, or it may turn out that we want to add a new `takeAll` variant that takes only the correct attributes while leaving others. (It may be useful to summon @rsmith for opinions on the right approach to take here.)
> > 
> > > Apart from all of this, are there any outstanding issues that you see with this change? It does of course make sense to prohibit type attributes on declarations before I land this change, but I wanted to check whether there's anything else that would need to happen here.
> > 
> > Nothing else is jumping out at me as missing or needing additional work aside from some formatting nits. I'm going to add another reviewer just to make sure I've not missed something, but I think this is about ready to go.
> Thanks for the input!
> 
> I've looked at this more closely, and I've concluded that adding a second list of attributes to the `DeclSpec` isn't really the right way to go. For one thing, as you say, the attributes we're discussing don't really belong on the decl-specifier-seq.
> 
> But I _have_ come up with an alternative solution that I like better than the alternatives I had considered so far and that, as you suggest, emits the diagnostics from Sema, not the Parser. The solution is not just more concise than the other options I had explored but also just feels like the right way of doing things.
> 
> Taking a look at the standard, attributes in the location in question (at the beginning of a simple-declaration) have the same semantics as attributes that appear after a declarator-id, and we already have a location to store the latter, namely `Declarator::Attr`. It therefore seems the right thing to do also put attributes from the simple-declaration in `Declarator::Attr`. We can then take advantage of the fact that there is existing code already in place to deal with attributes in this location.
> 
> The other part of the solution is to make `ProcessDeclAttribute` warn about C++11 attributes that don’t “slide” to the DeclSpec (except, of course, if `ProcessDeclAttribute` is actually being called for attributes on a `DeclSpec` or `Declarator`).
> 
> Here’s a draft change that implements this:
> 
> https://reviews.llvm.org/D124919
> 
> (The change description contains more details on the attribute semantics from the standard that I alluded to above.)
> 
> The change still contains a few TODOs, but all of the hard bits are done. Specifically, all of the tests pass too. I feel pretty good about this change, but before I work on it further to get it ready for review, I wanted to get your reaction: Do you agree that this is the right general direction?
> The change still contains a few TODOs, but all of the hard bits are done. Specifically, all of the tests pass too. I feel pretty good about this change, but before I work on it further to get it ready for review, I wanted to get your reaction: Do you agree that this is the right general direction?

Thank you for your continued work on this, I *really* appreciate it! I think your newest approach is the one I'm happiest with, though there are still parts of it I hope we can find ways to improve. My biggest concern is with having to sprinkle `ExtractDefiniteDeclAttrs()` in the "correct" places, as that's likely to be fragile. If there was a way we only had to do that once instead of 8 times, I'd be much happier. My worry is mostly with someone forgetting to call it when they should or thinking they need to call it when modeling new code on existing code.

Aside from that, I think the approach you've got is quite reasonable, though I'd be curious if @rsmith or @rjmccall have additional thoughts or concerns that I might not be seeing yet.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111548/new/

https://reviews.llvm.org/D111548



More information about the cfe-commits mailing list