[cfe-dev] [clang-tidy][RFC] Run each check only once

Sun Oct 3 05:44:47 PDT 2021

On Mon, Sep 27, 2021 at 7:50 AM Carlos Galvez via cfe-dev
<cfe-dev at lists.llvm.org> wrote:
>
> Hi!
>
> I'm writing to this list as suggested in this bug:
> https://reviews.llvm.org/D72566
>
> Background
> =========
>
> clang-tidy has implemented a number of checks, some of which are aliases of "real" checks. Currently if a user specifies that they want to run both the "real" and the "alias" check, clang-tidy will run the same check twice. Or N times if there are N aliases.
>
> When applying clang-tidy to large codebases, this can have a severe impact in runtime in Continuous Integration checks. The only solution so far is to manually disable alias checks, but this generates noise in the config file, and is error-prone and cumbersome to maintain in the future, as one always need to check the LLVM docs to see what is aliasing what.
>
> I have read in different bugs that one tricky problem with this is that "some alias checks run with different defaults, therefore leading to different results". I would be interested to know which checks are these and where this different default configuration is documented. I would argue that if 2 checks run different things and produce different results, then they are not aliases. They may share the same implementation, but they are not identical (therefore cannot be called "aliases".

Traditionally, we've always considered two checks to be aliases when
they share the same implementation and only differ via configuration.
We have a lot of aliased checks, see the bottom of this page:
https://clang.llvm.org/extra/clang-tidy/checks/list.html As for the
configuration differences, those are documented with the individual
checks.

> I have also read the argument "people should not enable all checks". I believe that should be up to users to decide. Plus, as we've seen, there are aliases even within the same category, e.g. "cert-*", so the problem still remains.

I strongly agree with the argument that people should not enable all
checks. For example, we have checks that sometimes give conflicting
diagnostics where the user cannot win. But more importantly, given
that many clang-tidy checks are for specific coding standards, I
believe it is very unlikely to realistically enable all checks anyway
because you rarely want to conform to every coding standard (more
often, you want to conform to one or two). Coding standards comprise
the majority of clang-tidy checks. abseil, altera, android, boost,
cert, cppcoreguidelines, darwin, fuschia, google, hicpp, linuxkernel,
llvm, llvmlibc, mpi, objc, openmp, and zircon are the coding standard
modules, while bugprone, majority of clang-analyzer, concurrency,
misc, modernize, performance, portability, and readability are the
general-purpose modules.

> Proposal
> =======
>
> - Find which checks are "fake aliases", if any - share the same implementation but run with different defaults and produce different results. Change these checks to not be considered as "aliases".
>
> - Add an opt-in feature (command line, config) to enable "the same check should be run only once". Diagnostics should come only from the "main" check (if it's active) or from the "alias" check (if the "main" check is not active).

Personally, I'm unmotivated by the idea of running all checks in
clang-tidy and I don't think it's a feature we want to encourage.
There's not a lot of reason to believe that it's valuable to enable
the LLVM style guide checks at the same time as the Linux Kernel
checks at the same time as the android checks given how unrelated
these modules are to one another (and other such mildly-nonsensical
combinations exist).

That said, aliases between one of the coding standard modules and one
of the general-purpose modules happen quite frequently and combining
these kinds of modules (or general-purpose checks with other
general-purpose checks in another module) is very reasonable to want
to disable exact aliases for. So my thinking is: if the checks have
*no* configuration differences, there could be value in allowing the
user to run the check only once. I have no idea how many checks would
qualify though. Also, I think we'd still want to list all the enabled
checks which trigger the diagnostic as we currently do. e.g. if cert-*
and misc-* are enabled, we still want both of them listed in the text
inside the [] in: `warning: found assert() that could be replaced by
static_assert() [cert-dcl03-c,misc-static-assert]`, otherwise behavior
with NOLINT suppression tags could get interesting.

~Aaron

>
> Let me know what you think! I'm a total newby to LLVM but I'm happy to contribute in any way I can :)
>
> Thanks!
> /Carlos
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev