[PATCH] D93630: [Attr] Apply GNU-style attributes to expression statements

Tue Jan 5 08:57:49 PST 2021

aaron.ballman added a comment.

In D93630#2479343 <https://reviews.llvm.org/D93630#2479343>, @vsavchenko wrote:

> In D93630#2479260 <https://reviews.llvm.org/D93630#2479260>, @aaron.ballman wrote:
>
>> In D93630#2478977 <https://reviews.llvm.org/D93630#2478977>, @vsavchenko wrote:
>>
>>> I guess I want to clarify one point here, after this patch the parser **will not assume** that statement always follows statement attributes.  We simply turn off the assumption that what follows is a declaration, parser will simply determine whether it is a statement or a declaration, it can do it on its own without attribute's "help".
>>
>> We used to say that if there's a GNU attribute (or we're parsing a declaration statement), what follows must be a declaration. Now we're using the attributes in the given attribute list to decide whether what follows is a declaration or a statement (if all of the attributes are statement attributes, we don't treat what follows as being a declaration unless we're in a declaration statement). Or am I misreading the code?
>
> We do not decide whether it is declaration or statement.  I do get that it is a matter of perspective, but right now I prefer to read it like this:
> **BEFORE**: if we've seen a GNU-style attribute parse whatever comes next as a declaration
> **AFTER**: if we've seen a GNU-style attribute except for the case when all of the parsed attributes are known to be statement attributes, parse whatever comes next as a declaration
>
> So, essentially, when we see statement attributes only, attribute/declaration heuristic is canceled and we parse exactly the way we would've parsed as if no attributes present.

I think this is a defensible design, but it still has the potential to change the parsing behavior in some circumstances and I want to make sure those circumstances are ones we know about. Note, I still think we should coordinate this change with the GCC folks. GCC has a lot of attributes that Clang does not have, so my concern with GCC is that Clang supports writing attributes in this way and GCC can't support it for some reason and it causes issues for an attribute we want to share between implementations. I don't have a specific case in mind that may cause an issue but they have a lot of attributes I'm unfamiliar with.

>> e.g.,
>>
>>   void foo(void) {
>>     __attribute__((fallthrough)) i = 12;
>>   }
>>
>> Before patch:
>>
>>   F:\llvm-project>llvm\out\build\x64-Debug\bin\clang.exe -fsyntax-only "F:\Aaron Ballman\Desktop\test.c"
>>   F:\Aaron Ballman\Desktop\test.c:2:32: warning: type specifier missing, defaults to 'int' [-Wimplicit-int]
>>     __attribute__((fallthrough)) i = 12;
>>                                  ^
>>   F:\Aaron Ballman\Desktop\test.c:2:18: error: 'fallthrough' attribute cannot be applied to a declaration
>>     __attribute__((fallthrough)) i = 12;
>>                    ^             ~
>>   1 warning and 1 error generated.
>>
>> After patch:
>>
>>   F:\llvm-project>llvm\out\build\x64-Debug\bin\clang.exe -fsyntax-only "F:\Aaron Ballman\Desktop\test.c"
>>   F:\Aaron Ballman\Desktop\test.c:2:32: error: use of undeclared identifier 'i'
>>     __attribute__((fallthrough)) i = 12;
>>                                  ^
>>   1 error generated.
>
> In both situations, the code contains errors.  I don't think that this is a valid argument, to be honest.  Actually, the nature of this change is that it starts to parse strictly MORE cases.

My concerns are if there are cases where we would start to reject valid code based on the presence of a GNU statement attribute or where we'd regress diagnostics. This seemed like a case where we were regressing diagnostics and could potentially reject otherwise valid code (using a different attribute that wouldn't generate an error in the way `fallthrough` does). We previously were saying "aha, this is a declaration and the attribute cannot apply to it" which is strictly better than "aha, I have no idea what this identifier is doing here". However, I'm changing my opinion here -- see below.

>> So I think this will cause issues for the suppression attribute you're working on if we allow it to have a GNU spelling with valid C89 code like; `__attribute__((suppress)) i = 12;`
>
> This is **not** a valid code if you remove `__attribute__((suppress))`. `i = 12` without a prior declaration of `i` will cause the `use of undeclared identifier 'i'` error.  You can try it with any compiler: https://godbolt.org/z/P43nxn.

Thank you for pointing out that my test case was bad in the first place -- I forgot that *block scope* variables cannot be implicit `int`, but *file scope* ones can (in C, not in C++): https://godbolt.org/z/7TjGT3

When I went back and re-ran my test with the variable at file scope, the behavior stayed the same with and without the patch:

  F:\llvm-project>cat "F:\Aaron Ballman\Desktop\test.c"
  __attribute__((fallthrough)) i = 12;
  j = 12;

Before patch:

  F:\llvm-project>llvm\out\build\x64-Debug\bin\clang.exe -std=c89 -fsyntax-only "F:\Aaron Ballman\Desktop\test.c"
  F:\Aaron Ballman\Desktop\test.c:1:30: warning: declaration specifier missing, defaulting to 'int'
  __attribute__((fallthrough)) i = 12;
                               ^
  int
  F:\Aaron Ballman\Desktop\test.c:1:16: error: 'fallthrough' attribute cannot be applied to a declaration
  __attribute__((fallthrough)) i = 12;
                 ^             ~
  F:\Aaron Ballman\Desktop\test.c:2:1: warning: declaration specifier missing,
        defaulting to 'int'
  j = 12;
  ^
  int
  2 warnings and 1 error generated.

After patch:

  F:\llvm-project>llvm\out\build\x64-Debug\bin\clang.exe -std=c89 -fsyntax-only "F:\Aaron Ballman\Desktop\test.c"
  F:\Aaron Ballman\Desktop\test.c:1:30: warning: declaration specifier missing, defaulting to 'int'
  __attribute__((fallthrough)) i = 12;
                               ^
  int
  F:\Aaron Ballman\Desktop\test.c:1:16: error: 'fallthrough' attribute cannot be applied to a declaration
  __attribute__((fallthrough)) i = 12;
                 ^             ~
  F:\Aaron Ballman\Desktop\test.c:2:1: warning: declaration specifier missing, defaulting to 'int'
  j = 12;
  ^
  int
  2 warnings and 1 error generated.

Huttah!

> I honestly don't see a scenario when the user has some piece of code that is parsed as a statement, then they mindfully add a //statement// attribute, and expect it to be parsed as a declaration after that.

There are attributes like `nomerge` which are both a declaration and a statement attribute, and there are other attributes like `suppress` that are statement attributes which can reasonably be written on a declaration statement. So the scenario I am worried about is when the code is being parsed as a declaration, the user adds one of these attributes and the decision changes to parse as a statement rather than a declaration. I'm not yet convinced we can't run into that situation with this change, but the implicit `int` thing was the only major edge case I could come up with and I've alllllmost convinced myself we can't run into an issue with it. I think our parser predicate test (on line 218 of your patch) for whether something is a declaration statement or not is what's helping avoid most of the parsing concerns I have because they usually will find some declaration specifier to help get the right answer.

FWIW, while testing various situations out, I think I found a Clang bug where we reject valid code (https://godbolt.org/z/ooqMvP) that *might* cause us problems if we apply this patch and ever fix that rejects valid bug. We should not be rejecting any of these functions, but we fail to note the implicit `int` with the declaration within `foo()` (note how the presence of the `extern` declaration specifier in `quux()` fixes things).

>> or `__attribute__((suppress)) g() { return 12; }` which may be intended to suppress the "type specifier missing" diagnostic (somewhat amusingly, since the user could also just add the `int` to the declaration since they're changing the code anyway).
>
> Again, we **do not force** the parser to parse whatever follows the statement attribute as a statement.  `__attribute__((suppress))` will not change how we parse things here.

I agree that it shouldn't (in theory), but I've not been able to convince myself that it doesn't (in practice). I'm getting there though, and I appreciate your patience while we discuss the concerns. Concretely, I think we could use some more C test coverage for the various places where implicit `int` can appear in a declaration. Here are the places I could come up with off the top of my head: https://godbolt.org/z/6h3MTr

================
Comment at: clang/test/Parser/stmt-attributes.c:89
+
+__attribute__((nomerge)) static int i; // expected-error {{'nomerge' attribute cannot be applied to a declaration}}
----------------
aaron.ballman wrote:
> This is a semi-good example of the kind of behavior I was worried about -- `nomerge` can be applied to a declaration or a statement, just not *this* kind of declaration, and so the diagnostic is confusing.
In the latest patch, this diagnostic is back to being reasonable (but the test fails when I run it locally because it emits a different diagnostic).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93630/new/

https://reviews.llvm.org/D93630