[PATCH] D148381: [Clang] Implement the 'counted_by' attribute

Bill Wendling via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 13 16:44:43 PDT 2023


void added inline comments.


================
Comment at: clang/include/clang/AST/Decl.h:4272-4275
+    FieldDecl *FD = nullptr;
+    for (FieldDecl *Field : fields())
+      FD = Field;
+    return FD;
----------------
aaron.ballman wrote:
> Could this be implemented as: `return !field_empty() ? *std::prev(field_end()) : nullptr;` ? Then maybe someday we'll get better iterator support that isn't forward-only and this code will automagically get faster.
Using `std::prev` on a forward iterator won't work:

https://stackoverflow.com/questions/23868378/why-stdprev-does-not-fire-an-error-with-an-iterator-of-stdunordered-set

`std::prev` itself is defined only for bidirectional iterators:

```
 template<typename _BidirectionalIterator>
    _GLIBCXX_NODISCARD
    inline _GLIBCXX17_CONSTEXPR _BidirectionalIterator
    prev(_BidirectionalIterator __x, typename
         iterator_traits<_BidirectionalIterator>::difference_type __n = 1)
    {
...
```



================
Comment at: clang/include/clang/AST/Decl.h:4277-4282
+  const FieldDecl *getLastField() const {
+    const FieldDecl *FD = nullptr;
+    for (const FieldDecl *Field : fields())
+      FD = Field;
+    return FD;
+  }
----------------
aaron.ballman wrote:
> We use this pattern fairly often to avoid repeating the implementation.
Done. I just hate the use of `const_cast`...


================
Comment at: clang/include/clang/AST/DeclBase.h:482
+  /// Whether it resembles a flexible array member.
+  static bool isFlexibleArrayMemberLike(
+      ASTContext &Context, const Decl *D, QualType Ty,
----------------
erichkeane wrote:
> Why isn't this a member function?
Because I want to be able to call it with a `nullptr`. I'll document this.


================
Comment at: clang/include/clang/Basic/Attr.td:4246
+    private:
+      SourceRange countedByFieldLoc;
+    public:
----------------
erichkeane wrote:
> aaron.ballman wrote:
> > Teeny tiniest of coding style nits :-D
> Should we instead be capturing the field itself, rather than its location?  It seems to me that would be more useful?
I tried that before and it didn't work. The issue is that at the time of parsing we don't yet have the full definition of the structure / union. So the `Decl`'s aren't really available to us yet. There may be a way to massage the parsing code to allow this to happen, but I'd like to do it as a separate patch. I'll document it with a `FIXME`.


================
Comment at: clang/include/clang/Basic/Attr.td:4167
+  let Documentation = [ElementCountDocs];
+  let LangOpts = [COnly];
+}
----------------
aaron.ballman wrote:
> void wrote:
> > nickdesaulniers wrote:
> > > Does C++ not support VLAs?
> > C++ doesn't support flexible array members or VLAs.
> Clang supports both as extensions in C++, as does GCC: https://godbolt.org/z/5xP7ncoha -- I think we should consider enabling this attribute in C++ mode, but I'm fine considering that in a follow-up.
I understand we can accept it as an extension, but yeah let's burn that bridge when we come to it.


================
Comment at: clang/include/clang/Basic/AttrDocs.td:7200
+  let Content = [{
+Clang supports the ``counted_by`` attribute for the flexible array member of a
+structure. The argument for the attribute is the name of a field member in the
----------------
aaron.ballman wrote:
> If we land the changes with the attribute as a C-only attribute, we should document that explicitly here (unless we're immediately enabling it C++ mode; not asking for busywork).
That's not something normally done for other `COnly` attributes. (However, the documentation shows those attributes being available for C++11, which is probably a mistake...) I went ahead and added that this is for C though.


================
Comment at: clang/include/clang/Basic/AttrDocs.td:7203
+same structure holding the count of elements in the flexible array. This
+information can be used to improve the results of array bound sanitizer and the
+``__builtin_dynamic_object_size``.
----------------
erichkeane wrote:
> As this is the purpose of this attribute, it seems to me that the documentation should headline/more obviously highlight that the purpose here is to improve the result of the sanitizer when it comes to flexible array members.  
I'm not sure how much more I could do to highlight that. :-)


================
Comment at: clang/include/clang/Basic/AttrDocs.td:7209
+    /* ... */
+    struct bar *fam[] __attribute__((counted_by(num_fam_elements)));
+  };
----------------
rapidsna wrote:
> I don't think it's necessary for this patch but for [[ https://discourse.llvm.org/t/rfc-enforcing-bounds-safety-in-c-fbounds-safety/70854/113?u=rapidsna | -fbounds-safety ]], we'd eventually turn this into a type attribute that we will use to create a sugared type or qualified type. And we'd make the attribute work for **both** this position and inside the array bracket (e.g., `fam[__counted_by(n)]`). Would it work for you?
This was a suggestion by a GCC developer as well. I personally hate that syntax. Regardless, it's something to be done in the future. :-)


================
Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:6368-6369
 
+def warn_flexible_array_counted_by_attr_field_not_found : Warning<
+  "counted_by field %0 not found">;
+def err_flexible_array_counted_by_attr_refers_to_self : Error<
----------------
aaron.ballman wrote:
> Why is this a warning and not a typical `err_no_member` error? I think we want that behavior so that we get typo correction for free. e.g.,
> ```
> struct S {
>   int Count;
>   int FAM[] __attribute__((counted_by(Clount))); // Should get a "did you mean 'Count'" fix-it
> };
> ```
I thought it would be too harsh to make this an error. But I suppose there's no harm in doing this. I'm not sure how to implement a fixit for a closely matching identifier. Do you have an example of how to find the closest matching Decl?


================
Comment at: clang/include/clang/Basic/DiagnosticSemaKinds.td:6371
+def err_flexible_array_counted_by_attr_refers_to_self : Error<
+  "counted_by field %0 cannot refer to the flexible array">;
+def err_flexible_array_counted_by_attr_field_not_integral : Error<
----------------
erichkeane wrote:
> aaron.ballman wrote:
> > We try to wrap syntax elements in single quotes in our diagnostics so it's more visually distinct
> This one isn't clear... something more specifically about 'cannot point to itself' is perhaps more useful/explainatory.
> 
> Also, the beginning of all of these is really quite awkward as well, perhaps something like:
> 
> `field %0 referenced by 'counted_by' attribute ...` ?
I reworded them a bit.

As for referring to itself, it could be a bit confusing to say that it 'cannot point to itself' because the 'itself' in that is the attribute, not the flexible array member. I think it's more-or-less clear what the error's referring to. The user can use this attribute only with a flexible array member. So they should know what what's meant by the message.


================
Comment at: clang/lib/AST/ASTImporter.cpp:8979-8980
+  // Useful for accessing the imported attribute.
+  template <typename T> T *getAttrAs() { return cast<T>(ToAttr); }
+  template <typename T> const T *getAttrAs() const { return cast<T>(ToAttr); }
+
----------------
aaron.ballman wrote:
> Should these be `castAttrAs` so it's more clear that type mismatches don't result in a null pointer, but an assertion?
Sure.


================
Comment at: clang/lib/AST/DeclBase.cpp:443-444
+
+  if (const auto *OID = dyn_cast_if_present<ObjCIvarDecl>(D))
+    return OID->getNextIvar() == nullptr;
+
----------------
aaron.ballman wrote:
> Can you explain a bit about what this code is for? I didn't spot any test coverage for it, but perhaps I missed something.
Short answer is I don't know. :-) This is copied from the original implementation of this method.


================
Comment at: clang/lib/AST/DeclBase.cpp:482
+
+  RecordDecl::field_iterator FI(
+      DeclContext::decl_iterator(const_cast<FieldDecl *>(FD)));
----------------
erichkeane wrote:
> Whats going on here?  Is this simply an attempt to see if this is the last one? Can we combine this effort with the 'getLastFieldDecl' above?  Alternatively, can we get a good comment here?
It's not really the same as the `getLastField` method. That one will return a non-null pointer whether or not the last field is a FAM. This code is just shorthand for getting the iterator for the FAM and then quickly testing if it's the final one. Again, this is because of the forward iterator that RecordDecl uses, for some reason. If we had bidirectional iterators, this would be much clearer.


================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:857
+  if (IsDynamic) {
+    const LangOptions::StrictFlexArraysLevelKind StrictFlexArraysLevel =
+        getLangOpts().getStrictFlexArraysLevel();
----------------
nickdesaulniers wrote:
> aaron.ballman wrote:
> > nickdesaulniers wrote:
> > > aaron.ballman wrote:
> > > > We don't generally use top-level const on local variables.
> > > Do we explicitly document this in the style guide? This came up for me recently in a code review (cc @jyknight ).
> > > https://llvm.org/docs/CodingStandards.html
> > It's the prevailing style in our files rather than a hard rule. In Clang, we seem to have organically landed on "const members are fine, const locals are not fine, do not use globals". Our const-correctness story is terrible and if we had better const correct interfaces, I think we'd have landed somewhere different.
> prevailing styles should be documented explicitly in the style guide. That settles all discussions, now and in the future.
> 
> Or perhaps an RFC on discourse.
Easy enough to do. Done.


================
Comment at: clang/lib/Sema/SemaDecl.cpp:18000-18002
+    if (const auto *FD = dyn_cast<FieldDecl>(D))
+      if (Pred(FD))
+        return FD;
----------------
aaron.ballman wrote:
> Ask and you shall receive! :-D
Nice!! :-D


================
Comment at: clang/lib/Sema/SemaDecl.cpp:18012
+
+static void CheckCountedByAttr(Sema &S, const RecordDecl *RD,
+                               const FieldDecl *FD) {
----------------
aaron.ballman wrote:
> This logic seems like it should live in SemaDeclAttr.cpp when handling the attribute. We're given the declaration the attribute is applied to, so we can do everything we need from that (the `FieldDecl` has a `getParent()` so we don't need the `RecordDecl` to be passed in. Is there a reason we need it to live here instead?
I put it here only because it's close to where it's called. I'm ambivalent on where it should live. Am I performing the check in the correct place?

Changed to use `getParent()`.


================
Comment at: clang/lib/Sema/SemaDecl.cpp:18035
+        << ECA->getCountedByField() << SR;
+  } else if (!Field->getType()->isIntegerType()) {
+    // The "counted_by" field must have an integer type.
----------------
aaron.ballman wrote:
> Errr any integer type?
> ```
> struct S {
>   bool HerpADerp;
>   int FAM[] __attribute__((counted_by(HerpADerp)));
> };
> ```
> seems like something we don't want, but I also wonder about enumeration types and more interestingly, plain `char` where it could be treated as signed or unsigned depending on compiler flags.
Yeah...unless they want just 1 element. ;-) I can rule a `bool` out. (Done)

A `char` should be fine. Dunno about signed vs. unsigned...


================
Comment at: clang/test/Misc/warning-flags.c:21
 
-CHECK: Warnings without flags (65):
+CHECK: Warnings without flags (66):
 
----------------
aaron.ballman wrote:
> One line up: "The list of warnings below should NEVER grow.  It should gradually shrink to 0." ;-)
> 
> That said, I don't think this warning needs to be added because I think we have an existing error that is more appropriate.
Awww....gatekeeping. ;-) Reverted.


================
Comment at: clang/test/Sema/attr-counted-by.c:1
+// RUN: %clang_cc1 -triple x86_64-linux -fstrict-flex-arrays=3 -fsanitize=array-bounds \
+// RUN:     -fsyntax-only -verify %s
----------------
aaron.ballman wrote:
> Is the triple necessary? Also, do we have to enable `-fsanitize=array-bounds` to test this?
I suppose their not 100% necessary. I'll remove them.


================
Comment at: clang/test/Sema/attr-counted-by.c:21
+  struct bar *fam[] __counted_by(non_integer); // expected-note {{counted_by field 'non_integer' declared here}}
+};
----------------
aaron.ballman wrote:
> Please add test with the other strange FAM-like members, as well as one that is just a trailing constant array of size 2.
> 
> Some more tests or situations to consider:
> ```
> struct S {
>   struct {
>     int NotInThisStruct;
>   };
>   int FAM[] __counted_by(NotInThisStruct); // Should this work?
> };
> 
> int Global;
> struct T {
>   int FAM[] __counted_by(Global); // Should fail per your design but is that the behavior you want?
> };
> 
> struct U {
>   struct {
>     int Count;
>   } data;
>   int FAM[] __counted_by(data.Count); // Thoughts?
> };
> 
> struct V {
>   int Sizes[2];
>   int FAM[] __counted_by(Sizes[0]); // Thoughts?
> };
> ```
> (I'm not suggesting any of this needs to be accepted in order to land the patch.)
> 
> You should also have tests showing the attribute is diagnosed when applied to something other than a field, given something other than an identifier, is given zero arguments, and is given two arguments.
I added a "this isn't a flexible array member" error message.

```
struct S {
  struct {
    int NotInThisStruct;
  };
  int FAM[] __counted_by(NotInThisStruct); // Should this work?
};
```

Yes, I believe it should. Kees had a similar example:

```
struct S {
  int x;
  struct {
    int count;
    int FAM[] __counted_by(count);
  };
};
```

I made changes to support this. It may be controversial, so please let me know if you or anyone has an issue with it.

```
int Global;
struct T {
  int FAM[] __counted_by(Global); // Should fail per your design but is that the behavior you want?
};
```

Yes, it should fail. I'll add a test case.

```
struct U {
  struct {
    int Count;
  } data;
  int FAM[] __counted_by(data.Count); // Thoughts?
};
```

I don't think we should support that at the moment. Referencing arbitrary fields in a nested structure is very complicated and the syntax for it is not at all settled on.

```
struct V {
  int Sizes[2];
  int FAM[] __counted_by(Sizes[0]); // Thoughts?
};
```

Hmm...Not sure. I'll ask the GCC person who's implementing this on her side about this.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148381/new/

https://reviews.llvm.org/D148381



More information about the cfe-commits mailing list