[cfe-dev] [analyzer][RFC] Attribute(s) to enhance/configure the analysis

Fri Oct 23 05:34:18 PDT 2020

On Thu, Oct 22, 2020 at 10:31 PM Jan Korous <jkorous at apple.com> wrote:
>
> Hi all,
>
> I'm sorry I'm a little late to the party!
>
> I have couple thoughts first and a proposal at the end.
>
> On Oct 20, 2020, at 1:06 PM, Gábor Horváth via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> I think an ideal suppress annotation would:
> * be applicable to a wide range of scopes, not just function scope
> * have clear documentation indicating this is the last resort and refer to some guidelines how to suppress false positives by improving the code
> * come with an easy mechanism to check whether the annotation makes any difference, so we can easily get rid of redundant ones
> * come with an easy mechanism to check how many issues are suppressed by the same annotation (or even expect the user to specify this number as an argument and warn if that does not match the reality?)
>
>
> We must consider projects that need to support multiple versions of clang and by proxy multiple versions of Static Analyzer. Since there can be both false positives and false negatives in Analyzer we need to have a good story for situations when an attribute doesn't make sense for a particular Analyzer version (for example older one) but the maintainers still want to keep it in source. Maybe we could produce just lower severity diagnostics than warning?
> (In case someone asks - I would strongly prefer to not introduce versioning of these attributes.)
>
> On Oct 20, 2020, at 11:24 AM, Valeriy Savchenko via cfe-dev <cfe-dev at lists.llvm.org> wrote:
> However, the hardest choice here is the string argument for the attribute.  Again, it would be perfect if we can avoid weird identifiers like “java.com.warnings.alpha.whatever.2020”, it would be good if that string will make it clear for people who don’t use the analyzer what’s going on here.
> Checker names that we use in command line are a bit too verbose, error messages are prone to changes, bug categories are too broad and abstract (the intention is pretty vague if you suppress “Logic error”).
>
> My suggestion would be using so-called “Bug types” (we already show them in the summary table in index.html).  Each diagnostic has one, it’s unique and more-or-less descriptive.  If we revise all the existing bug types and try to make them concise and clear, I think it can work!
>
>
> Just to make sure we're on the same page - I'll use one of the WebKit checkers that I need to have a suppression for as an example:
>
> checker name:
> webkit.UncountedLambdaCapturesChecker
> error messages (depend on the context):
> warning: Captured raw-pointer 'ref_countable' to uncounted type is unsafe [webkit.UncountedLambdaCapturesChecker]
> warning: Implicitly captured raw-pointer 'ref_countable' to uncounted type is unsafe [webkit.UncountedLambdaCapturesChecker]
> bug category:
> WebKit coding guidelines
> bug type:
> Uncounted call argument for a raw pointer/reference parameter
>
> Assuming I got it right from all the above I have a clear preference which is the checker name. In a way it would be analogous to pragmas for cc1 diagnostics suppression - the same identifier is used in CLI and in the source for suppression.
> My second choice would be to introduce another string identifier as I have a hard time imagining any other of the existing strings being introduce to codebases.
>
> On Oct 20, 2020, at 1:11 PM, Valeriy Savchenko via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> I just want to add something that I think was not very clear from my previous email.  The existing “suppress” attribute is a statement attribute, which gives it a very fine level of granularity.
>
>
> I have couple use-cases for declaration attributes (like ParmVarDecl, FiledDecl, VarDecl) but I assume that no matter what mechanism we devise it should be applicable for declaration attributes too.
> One thing I feel we need is "kind safety". While it's trivial for single-purpose attribute to define what kinds of AST nodes is it applicable to it becomes a bit more work with different arguments of a generic attribute.
>
> On Oct 20, 2020, at 1:12 PM, Gábor Horváth via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> Sorry for the duplicate emails, I just realized that there is one more dimension of suppression mechanisms. There are separate tools such as CodeChecker that already support suppressing individual reports (either using comments or bug hashes). So adding an annotation could make it unclear who is responsible for the suppression and which feature should the user use.
>
>
> I assume that the status quo is that there's no common standard for in-source suppression for bug-finding tools. That means that arguably we won't make it significantly worse by adding n+1.
> Hypothetically if we design the annotation system well we might convince other projects to use it. And if I had to pick one tool to take into the account I would maybe think of GCC analyzer?
>
> https://developers.redhat.com/blog/2020/03/26/static-analysis-in-gcc-10/
>
> One thing to consider here is that if we want to have some cross-tool story we probably shouldn't use any vendor-specific prefix like clang::suppress or csa::suppress.

There's not a standardized identifier for this attribute and so I
believe we should in fact use a vendor-specific scope. Putting
something in the standard's attribute namespace would not be the right
approach, at least. My preference is to use the clang vendor namespace
because it's already a vendor namespace our users are used to seeing
and because the name covers the broadest range of diagnostics that can
be disabled.

> On Oct 21, 2020, at 11:57 AM, Aaron Ballman via cfe-dev <cfe-dev at lists.llvm.org> wrote:
>
> Thank you for exploring the options in this space -- I think having a
> uniform way for users to suppress diagnostics using attributes is a
> really interesting idea. I have personally never really liked using
> pragmas to do this work because that approach is usually quite verbose
> when applied to a single line of code.
>
> IIUC we have use-cases for suppression of "diagnostics produced by Analyzer" only.
> While attribute-based suppression of cc1 diagnostics sounds interesting I'm worried that the necessary broad discussion would effectively halt this.
> Would you mind if we keep the focus on Analyzer for now or do you believe we'd not converge to the right design unless we consider cc1 too?

I don't think that's the case, though. We have at least three
different tools that are all tightly integrated together and none of
them share a common mechanism for suppressing diagnostics. In Clang,
you either have to use command line arguments or a pragma. In
clang-tidy, we support // NOLINT comments. Now we're talking about
adding an attribute for the static analyzer. I think it's fine for
there to be more than one way to suppress a diagnostic, but we need a
common way for users that doesn't require them to understand what part
of the toolchain issued a diagnostic in order to suppress it properly.

I'm fine focusing on the static analyzer's needs, but I personally
think it's a requirement that whatever we come up with is a general
solution that covers at least the frontend, clang-tidy, and the static
analyzer with buy-in from the people heavily involved in those tools,
even if there's not a commitment to get that work done immediately for
all three tools.

> Re: "stable identifiers". I feel that no matter what identifiers we decide to use we still might want some escape hatch in case we end up having to rename a "stable identifier".
> I am not aware of any solution for such situation with pragmas.
> https://clang.llvm.org/docs/UsersManual.html#controlling-diagnostics-via-pragmas

I'm not overly worried about the stability of identifiers -- we change
diagnostics on occasion, but I think we tweak wording or remove/add
diagnostics far more than we change a diagnostic's meaning
sufficiently to warrant giving it a new stable identifier.

That said, it does bring up a different kind of diagnostic we should
keep in mind: [[clang::suppress("I no longer exist")]] where the
string literal doesn't refer to a diagnostic name. e.g., the user was
suppressing a diagnostic we removed, or the user has a typo in the
string literal, etc. We should perhaps diagnose this case with typo
correction to help alert users of the confusion rather than silently
accepting an attribute that has no impact.

> But I am thinking that basically all we need is a notion of identifier alias - if we can have two names for an attribute we're fine.
> Imagine we add a warning identifier "foo" to Analyzer in clang-12 and add an alias "bar" for it in clang-13. Projects using the "bar" identifier are fine and any project that is ok with minimal supported version being clang-13 can change the spelling in their sources to "bar".

I think this is an interesting idea and is worth exploring. For
instance, in clang-tidy we have checks that are aliased into multiple
groupings and I could imagine the user saying they want to suppress a
readability-based diagnostic but still allow the diagnostic if another
module is turned on even if it's the same information (e.g., maybe
they enable the CERT module and suddenly the issue is no longer about
readability but security as far as the user is concerned).

> Overall I suggest we:
> - Adopt the suppress attribute Valeriy mentioned and extend it to declarations.

Agreed, though we should also consider whether we have any type-based
warnings where we want the attribute on a type (I can't think of any
off the top of my head)

> - Don't use prefix like clang:: or csa:: to make it an obvious interface for suppression to be used by other tools too.

-1, this steps on design space that's not ours to step on.

> - MVP can support just a single string identifier, later extend to comma-separated list of string identifiers.

I can't imagine supporting a list of identifiers is actually more work
than supporting a single identifier. That said, I don't think it's
critical for the MVP, but if someone wants to implement it in the
initial offerings, I say go for it.

> - Use checker name as an identifier.

This approach seems to work fine if the checker only issues a few
diagnostics and they generally aren't on the same statement or
declaration, but it would be a problem if we had a checker which
issues multiple unrelated (or semi-related) warnings on the same
statement. e.g.,

// note, this is all one statement with newlines for clarity
foo() + // say Checker A issues a "don't call foo" warning here
bar() / baz(); // and Checker A also issues a "potential divide by
zero" warning here

If the user wanted to use an attribute to suppress the potential
divide by zero warning, using the checker name would also mean the
"don't call foo" warning is silenced because the attribute is at the
statement level. My impression is that we don't have many checkers
that would be in this situation, so maybe it's fine, but I do wonder a
bit about the frontend diagnostics. I'd imagine we could use the
warning group name as the "checker" name in that situation, but I
could imagine code like this:

[[gnu::not_supported_in_clang]] // want to suppress the ignored
unknown attribute diagnostic here
[[clang::supproted_in_clang]] // don't want to suppress the ignored
unknown attribute diagnostic here (from the typo)
void func();

I think it's worth intentionally deciding whether we want to support
cases like this or not up front. We also probably should talk about
some edge cases like whether aliases are treated the same as the
original checker or not, does it support families of checkers, etc.

~Aaron

>
>
> Cheers,
>
> Jan
>