[cfe-dev] [clang-tidy][RFC] Run each check only once

Sun Oct 3 08:24:37 PDT 2021

On Sun, Oct 3, 2021 at 10:52 AM Carlos Galvez <carlosgalvezp at gmail.com> wrote:
>
> I realized I didn't click on "Reply all" :facepalm:, sending again:
>
> Thanks a lot for the feedback!
>
> > As for the
> configuration differences, those are documented with the individual
> checks.
>
> I have gone through your link, checking one by one each of the alias checks. There's not a single one of them that describes any configuration, or where it says that its configuration differs from the "main" check. The docs only say "this is an alias check, please check <main check> for documentation. Am I looking at the wrong place?

The primary check documents the configuration options. e.g.,
https://clang.llvm.org/extra/clang-tidy/checks/cppcoreguidelines-special-member-functions.html.
However, I am betting there are inconsistencies in documentation that
could be improved and you raise a valid point of it being hard to know
*how* these checks differ (if they differ at all).

> > For example, we have checks that sometimes give conflicting
> diagnostics where the user cannot win.
>
> It's very simple to win - the user disables either of the checks which is giving conflict diagnostics.

Our definitions of "simple" differ -- I consider any changes to a
configuration file (or adding pragmas) to silence a diagnostic to be
strictly "harder" than locally modifying the code so that the
diagnostic isn't triggered in the first place.

> We are running clang-tidy with all checks enabled plus a handful disabled on a large codebase and it's working just fine. The checks that we disable are not due to conflicts, but simply because we don't want them. Having a list of the checks that are disabled visualizes more clearly what is it extra that we could potentially check, but don't want to for reasons.
>
> > you rarely want to conform to every coding standard
>
> It's not about conformance, it's about "writing code using best practices". Coding guidelines are there to propose best practices in their fields, and usually most are in the same line. Aggregating all checks from all coding guidelines allows one to be exposed to best practices from all different domains. If a particular check doesn't apply to your domain, you just disable it, but still get a chance to get a diagnostic and read about the pitfalls the check is preventing you from.
>
> In other words, the use case is "use clang-tidy to learn about best C++ coding practices", instead of "conform to X guideline". Whether this makes sense or not it's up to the user I believe. It really depends on what everyone wants to achieve and I don't see why clang-tidy should have a say on that.

My point is: these modules are not designed for "writing code using
best practices" on the whole, so enabling all of them is basically not
a useful operation for most people. The Linux kernel module checks are
highly specific to the Linux kernel and have nothing to do with code
quality outside of that project. The LLVM module checks are highly
specific to LLVM in the same way. The LLVM libc module checks are
highly specific to LLVM's libc in the same way. Etc. Making it easier
for users to enable all of the checks at once requires check authors
to consider *all* interactions between modules, and to date that isn't
a price we've been willing to make check authors pay. Instead, we
typically view modules as being orthogonal sets of functionality that
may or may not play well together.

> > we still want both of them listed
> A bit annoying but I could live with that, I can see quite a pain to implement the different edge cases - let's keep it simple. The biggest gain is decrease in runtime for the tool in CI.
>
> As Roman says, what would be the problem if this was an opt-out feature?

To date, aliases have been surfaced to the user as a first-class check
(while the documentation typically redirects to the primary check for
simplicity of writing check aliases, there's no requirement that they
do so; other than that, aliases should behave the same as any other
check as far as the user is concerned.) You're proposing to make
aliases be sort of second-class checks that are more closely tied to
the primary check by providing an option that says aliases are
disabled as a blanket operation rather than a case-by-case basis. I am
not convinced this is a good approach. We may already have checks for
which this is a problem (where the aliased check and the primary check
are checking different things), and this closes off the design space
for such checks in the future (and possibly encourages less code reuse
in the process).

Btw, I'm not saying "no way", I'm saying "let's make sure we're not
regressing functionality or painting ourselves into a corner."

I'd be curious to hear from Alex on the topic as well, as this is not
the first discussion about alias behaviors.

~Aaron

>
> Thanks!
>
> /Carlos
>
>
> On Sun, Oct 3, 2021 at 4:01 PM Aaron Ballman <aaron at aaronballman.com> wrote:
>>
>> On Sun, Oct 3, 2021 at 9:03 AM Roman Lebedev <lebedev.ri at gmail.com> wrote:
>> >
>> > While i don't really have an opinion as to what should happen if
>> > someone enables multiple aliased checks with different configs,
>> > i'm starting to strongly believe that this whole concept of
>> > aliased checks is broken, and there must be an explicit opt-out,
>> > that completely disabled all the aliasee checks,
>> > while not affecting the non-aliasee checks.
>>
>> Can you explain why? My inclination is to be opposed to such a thing
>> because the design of clang-tidy has been to encourage aliases to be
>> able to share implementations while being able to expose significantly
>> different checks between modules. So while I can understand a
>> rationale for a flag to disable *identical* checks, I think we set the
>> wrong expectations by surfacing a feature to disable *all* aliases
>> because that implies the aliases are not useful in general. (Note, if
>> the discussion is instead about whether we should deprecate alias
>> checks entirely, my opinions may differ. My point is that if we are
>> continuing to encourage aliases, they should continue to be surfaced
>> as first-class checks otherwise users have to worry more about what
>> check is an alias and what check is not, which doesn't seem like a
>> useful distinction to force users to reason about.)
>>
>> ~Aaron
>>
>> > I would like to see that change happen, and i would try to help review
>> > such a patch.
>> >
>> >
>> > Roman
>> >
>> > On Sun, Oct 3, 2021 at 3:45 PM Aaron Ballman via cfe-dev
>> > <cfe-dev at lists.llvm.org> wrote:
>> > >
>> > > On Mon, Sep 27, 2021 at 7:50 AM Carlos Galvez via cfe-dev
>> > > <cfe-dev at lists.llvm.org> wrote:
>> > > >
>> > > > Hi!
>> > > >
>> > > > I'm writing to this list as suggested in this bug:
>> > > > https://reviews.llvm.org/D72566
>> > > >
>> > > > Background
>> > > > =========
>> > > >
>> > > > clang-tidy has implemented a number of checks, some of which are aliases of "real" checks. Currently if a user specifies that they want to run both the "real" and the "alias" check, clang-tidy will run the same check twice. Or N times if there are N aliases.
>> > > >
>> > > > When applying clang-tidy to large codebases, this can have a severe impact in runtime in Continuous Integration checks. The only solution so far is to manually disable alias checks, but this generates noise in the config file, and is error-prone and cumbersome to maintain in the future, as one always need to check the LLVM docs to see what is aliasing what.
>> > > >
>> > > > I have read in different bugs that one tricky problem with this is that "some alias checks run with different defaults, therefore leading to different results". I would be interested to know which checks are these and where this different default configuration is documented. I would argue that if 2 checks run different things and produce different results, then they are not aliases. They may share the same implementation, but they are not identical (therefore cannot be called "aliases".
>> > >
>> > > Traditionally, we've always considered two checks to be aliases when
>> > > they share the same implementation and only differ via configuration.
>> > > We have a lot of aliased checks, see the bottom of this page:
>> > > https://clang.llvm.org/extra/clang-tidy/checks/list.html As for the
>> > > configuration differences, those are documented with the individual
>> > > checks.
>> > >
>> > > > I have also read the argument "people should not enable all checks". I believe that should be up to users to decide. Plus, as we've seen, there are aliases even within the same category, e.g. "cert-*", so the problem still remains.
>> > >
>> > > I strongly agree with the argument that people should not enable all
>> > > checks. For example, we have checks that sometimes give conflicting
>> > > diagnostics where the user cannot win. But more importantly, given
>> > > that many clang-tidy checks are for specific coding standards, I
>> > > believe it is very unlikely to realistically enable all checks anyway
>> > > because you rarely want to conform to every coding standard (more
>> > > often, you want to conform to one or two). Coding standards comprise
>> > > the majority of clang-tidy checks. abseil, altera, android, boost,
>> > > cert, cppcoreguidelines, darwin, fuschia, google, hicpp, linuxkernel,
>> > > llvm, llvmlibc, mpi, objc, openmp, and zircon are the coding standard
>> > > modules, while bugprone, majority of clang-analyzer, concurrency,
>> > > misc, modernize, performance, portability, and readability are the
>> > > general-purpose modules.
>> > >
>> > > > Proposal
>> > > > =======
>> > > >
>> > > > - Find which checks are "fake aliases", if any - share the same implementation but run with different defaults and produce different results. Change these checks to not be considered as "aliases".
>> > > >
>> > > > - Add an opt-in feature (command line, config) to enable "the same check should be run only once". Diagnostics should come only from the "main" check (if it's active) or from the "alias" check (if the "main" check is not active).
>> > >
>> > > Personally, I'm unmotivated by the idea of running all checks in
>> > > clang-tidy and I don't think it's a feature we want to encourage.
>> > > There's not a lot of reason to believe that it's valuable to enable
>> > > the LLVM style guide checks at the same time as the Linux Kernel
>> > > checks at the same time as the android checks given how unrelated
>> > > these modules are to one another (and other such mildly-nonsensical
>> > > combinations exist).
>> > >
>> > > That said, aliases between one of the coding standard modules and one
>> > > of the general-purpose modules happen quite frequently and combining
>> > > these kinds of modules (or general-purpose checks with other
>> > > general-purpose checks in another module) is very reasonable to want
>> > > to disable exact aliases for. So my thinking is: if the checks have
>> > > *no* configuration differences, there could be value in allowing the
>> > > user to run the check only once. I have no idea how many checks would
>> > > qualify though. Also, I think we'd still want to list all the enabled
>> > > checks which trigger the diagnostic as we currently do. e.g. if cert-*
>> > > and misc-* are enabled, we still want both of them listed in the text
>> > > inside the [] in: `warning: found assert() that could be replaced by
>> > > static_assert() [cert-dcl03-c,misc-static-assert]`, otherwise behavior
>> > > with NOLINT suppression tags could get interesting.
>> > >
>> > > ~Aaron
>> > >
>> > > >
>> > > > Let me know what you think! I'm a total newby to LLVM but I'm happy to contribute in any way I can :)
>> > > >
>> > > > Thanks!
>> > > > /Carlos
>> > > >
>> > > > _______________________________________________
>> > > > cfe-dev mailing list
>> > > > cfe-dev at lists.llvm.org
>> > > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>> > > _______________________________________________
>> > > cfe-dev mailing list
>> > > cfe-dev at lists.llvm.org
>> > > https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev