[PATCH] D148381: [Clang] Implement the 'counted_by' attribute

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Sep 28 09:05:16 PDT 2023


aaron.ballman added inline comments.


================
Comment at: clang/include/clang/Basic/Attr.td:4246
+    private:
+      SourceRange countedByFieldLoc;
+    public:
----------------
void wrote:
> aaron.ballman wrote:
> > void wrote:
> > > erichkeane wrote:
> > > > aaron.ballman wrote:
> > > > > Teeny tiniest of coding style nits :-D
> > > > Should we instead be capturing the field itself, rather than its location?  It seems to me that would be more useful?
> > > I tried that before and it didn't work. The issue is that at the time of parsing we don't yet have the full definition of the structure / union. So the `Decl`'s aren't really available to us yet. There may be a way to massage the parsing code to allow this to happen, but I'd like to do it as a separate patch. I'll document it with a `FIXME`.
> > I think at some point we're going to want this to take an `Expr` argument instead of an identifier argument, so we can support `foo.bar` and `baz[0]`, etc as arguments for more complicated situations. But I believe the current form is reasonable as-is (we can start taking an expression and all of these identifiers should become a `MemberExpr`, so users don't have to change their code).
> I think that GCC may not be able to use expressions (apart from simple, resolvable integer arithmetic). But yeah, those types of expansions would be nice. There are many issues to resolve with them first---such as how to reference a field in a sub-struct when the fam is in another substruct:
> 
> ```
> struct S {
>   struct A {
>     int count;
>   } a;
>   struct C {
>     int count;
>     struct X {
>       int count;
>     } x;
>   } c;
>   struct F {
>     int fam[] __counted_by(/* how to refer to p->c.x.count? */);
>   } f;
> };
> ```
FWIW, my intuition is that we'd want something close to:
```
struct S {
  struct A {
    int count;
  } a;
  struct C {
    int count;
    struct X {
      int count;
    } x;
  } c;
  struct F {
    int fam[] __counted_by(c.x.count);
  } f;
};
```
but this is in follow-up territory.


================
Comment at: clang/lib/Sema/SemaDecl.cpp:18029
+
+    S.LookupName(Result, Scope);
+    if (Result.getResultKind() == LookupResult::Found) {
----------------
void wrote:
> aaron.ballman wrote:
> > When the lookup result is not found, I think you should call `Sema::DiagnoseEmptyLookup()`. This will handle giving you the error about the member not being found, but it should also give you typo correction for free, and if typo correction happens, you should then already know what new declaration was found via the correction, so you can continue to check for the other cases from there.
> > 
> > https://github.com/llvm/llvm-project/blob/b57df9fe9a1a230f277d671bfa0884bbda9fc1c5/clang/lib/Sema/SemaLambda.cpp#L1151 is an example of this being called elsewhere.
> GAAAA! It's just like the rest of Clang: There are a million ways to do something, but none of them are documented and only one of them works for your particular case...kind of...and it takes a ton of arguments which seem counter-intuitive (`CXXScopeSpec` instead of `DeclSpec`?). All of Clang has been over-engineered from the beginning and it shows in how hard it is to do even the most basic things. So hard that people write the same things over and over with only small changes (see the two designated initializers implementations) leading to a nightmare.
> 
> The example you pointed to doesn't work. And none of the other forms seem to fit well enough. There are apparently many ways to "correct" a typo, but none of them seemed appropriate or like they'd do what I want. Is it really necessary to have this?
Necessary? No. A better implementation of the feature? I think so; we want the user experience to be consistent whenever possible and it's pretty unfortunate that users would get typo correction elsewhere but not here.

That said, I don't think this patch needs to be gated on that; we can add a FIXME to use a more general mechanism instead of even more custom logic.


================
Comment at: clang/lib/Sema/SemaDecl.cpp:18012
+
+static void CheckCountedByAttr(Sema &S, const RecordDecl *RD,
+                               const FieldDecl *FD) {
----------------
void wrote:
> aaron.ballman wrote:
> > This logic seems like it should live in SemaDeclAttr.cpp when handling the attribute. We're given the declaration the attribute is applied to, so we can do everything we need from that (the `FieldDecl` has a `getParent()` so we don't need the `RecordDecl` to be passed in. Is there a reason we need it to live here instead?
> I put it here only because it's close to where it's called. I'm ambivalent on where it should live. Am I performing the check in the correct place?
> 
> Changed to use `getParent()`.
I prefer it to live in SemaDeclAttr.cpp as that's where the other attribute checking logic generally lives; it makes it slightly easier for us to find the checking logic in the future.


================
Comment at: clang/lib/Sema/SemaDecl.cpp:18035
+        << ECA->getCountedByField() << SR;
+  } else if (!Field->getType()->isIntegerType()) {
+    // The "counted_by" field must have an integer type.
----------------
void wrote:
> aaron.ballman wrote:
> > void wrote:
> > > aaron.ballman wrote:
> > > > Errr any integer type?
> > > > ```
> > > > struct S {
> > > >   bool HerpADerp;
> > > >   int FAM[] __attribute__((counted_by(HerpADerp)));
> > > > };
> > > > ```
> > > > seems like something we don't want, but I also wonder about enumeration types and more interestingly, plain `char` where it could be treated as signed or unsigned depending on compiler flags.
> > > Yeah...unless they want just 1 element. ;-) I can rule a `bool` out. (Done)
> > > 
> > > A `char` should be fine. Dunno about signed vs. unsigned...
> > The `char` situation I'm thinking about is morally: https://godbolt.org/z/377WPYojz
> > 
> > With `-fsigned-char`, accessing `Idx` means you get the value `-1` and with `-funsigned-char`, you get the value `255`. If that was used to index into `FAM`, the `signed char` version would become a bounds access issue whereas the `unsigned char` version works fine, so it's hard to write a portable attribute that refers to a `char` member.
> > 
> > Then again, the `counted_by` attribute is intended to catch exactly this kind of issue, so perhaps it's fine on `char`?
> True. But we technically have that issue with every type. Maybe it'd be best to make sure we don't have a negative result?
So at CodeGen time we see if we know the value within the field, and if it's known to be negative, we diagnose? Or are you thinking of something else?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148381/new/

https://reviews.llvm.org/D148381



More information about the cfe-commits mailing list