[cfe-dev] [clang-tidy][RFC] Run each check only once

Carlos Galvez via cfe-dev cfe-dev at lists.llvm.org
Wed Nov 17 04:35:31 PST 2021


Hi again!

I've been looking into this for a while now and have some very crude
proof-of-concept that seems to work without any impact for check developers
and ensuring only "perfect aliases" are not run. The only things I can't
get around are diagnostics and profiling, so I'd need some help to figure
that out (or drop the requirements). Should I push this proof-of-concept
for review and continue the discussion there, or do you prefer to discuss
here in the mailing list?

Best regards,
/Carlos


On Mon, Oct 4, 2021 at 9:03 PM Aaron Ballman <aaron at aaronballman.com> wrote:

> On Mon, Oct 4, 2021 at 2:30 PM Carlos Galvez <carlosgalvezp at gmail.com>
> wrote:
> >
> > I think we agree in the goals of this change, perhaps it gets confusing
> when using the term "alias" with different meanings, so let me summarize
> what I believe is the requirements of what we want to have, without
> entering in technicalities:
> >
> > * All publicly documented checks should be "first-class checks", meaning
> users can be enable/disable them independently of each other. Not the scope
> of this change to deprecate the concept/name of "alias", perhaps in the
> future.
>
> +1. To be super clear, I consider checks which redirect their
> documentation to still count as "publicly documented checks".
>
> > * Users don't need to know/care how these checks are implemented. If
> they reuse code or not, that's not their concern. They just need to care
> that their code will get checked properly when the check is enabled, so
> that they can fix bugs, improve code quality or be compliant with a given
> guideline.
>
> +1
>
> > * Check developers should keep their exact same development flow, and be
> able to reuse code if possible as they do today.
>
> +1
>
> > * Identical checks should run only once. The definition of "identical"
> is: a) they have the same implementation and b) they have the same
> configuration. Same implementation but different configuration is not
> identical checks, so they would all still be run. So the scope is smaller
> than what it might have looked like in my first post, I hope it's clear now.
>
> +1
>
> > * Two additional points of discussion: diagnostics and profiling. IMO
> implementing this change would be easier if we just skipped diagnostics and
> profiling for the not-run "identical" checks. We have a list of all checks,
> then remove the identical ones and run the updated list of checks. Keeping
> diagnostics and profiling info would probably complicate the design of the
> tool, but maybe someone has clever ideas. From a user perspective, I think
> it's neat to avoid diagnostics from identical checks, since it's less noisy
> to NOLINT them in code.
>
> Regarding diagnostics, what I'm hoping we can do is keep the
> diagnostic behavior we already have. e.g.,
> https://godbolt.org/z/ss4eKsadb where the diagnostic says:
>
> warning: found assert() that could be replaced by static_assert()
> [cert-dcl03-c,misc-static-assert]
>
> and lists both of the ways in which the diagnostic can be triggered
> but does not show hicpp-static-assert which is a third way to trigger
> the diagnostic (but was never enabled). I think this is useful because
> not listing the identical checks can potentially cause some pain for
> users depending on how they fix the diagnostic or migrate their code.
> While this one would be most sensible to just change the code, it's
> not unthinkable that a developer will add a `// NOLINT` comment to
> disable the diagnostic. If they're only shown one check name in the
> diagnostic, the user may elect to write `// NOLINT(cert-dcl03-c)` to
> disable *only* the check name they see. They'll be rather surprised if
> they decide to drop checking the CERT module and they start getting
> diagnostics on that line about misc-static-assert when the code was
> previously suppressing the warning. Seeing multiple checks that
> trigger will hopefully help the user to know to write `// NOLINT`
> instead of a specific check, or may help gently guide them towards
> fixing their code instead of suppressing the diagnostic.
>
> That said, if this turns out to be an implementation burden in
> practice, I think the behavior is fine *so long as* the user who adds
> `// NOLINT(cert-dcl03-c)` doesn't have the experience of silencing the
> cert diagnostic while still emitting the misc diagnostic (which, if
> we're not running the duplicate check in the first place, shouldn't
> happen anyway).
>
> > Would you agree on the above?
>
> I think we're in agreement, yes!
>
> ~Aaron
>
> > > So the primary check gets all the options that users can tweak, and
> then coding style checks set those tweakable options as they see fit
> > Thanks, this helps me understand the design. Makes sense to have a
> "core" implementation of the check with knobs that can be tweaked via
> config for each guideline. I have some thoughts about build system and
> where to put such "common" checks that are re-usable, but let's leave that
> for another time :)
> >
> > On Mon, Oct 4, 2021 at 4:35 PM Aaron Ballman <aaron at aaronballman.com>
> wrote:
> >>
> >> On Sun, Oct 3, 2021 at 8:48 PM Carlos Galvez <carlosgalvezp at gmail.com>
> wrote:
> >> >
> >> > > Making it easier
> >> > for users to enable all of the checks at once requires check authors
> >> > to consider *all* interactions between modules, and to date that isn't
> >> > a price we've been willing to make check authors pay. Instead, we
> >> > typically view modules as being orthogonal sets of functionality that
> >> > may or may not play well together.
> >> >
> >> > Fully agree to that, checks should absolutely be independent,
> granular and focused on doing one single thing without depending on one
> another. And they may conflict with each other. My point is that it should
> be the user making the decision as to how to handle conflicts, enabling and
> disabling whichever checks they want. Why is that a problem?
> >>
> >> I am not convinced that it's the right granularity. We've developed
> >> aliases with the idea that as far as the user is concerned, which is
> >> primary and which is aliased should not matter. Now we want to change
> >> that with a blanket switch that impacts all aliases. This isn't a
> >> problem per se, but I'm not convinced we're ready to add a switch to
> >> disable all aliases yet.
> >>
> >> > We are not requiring extra work on LLVM developers.
> >>
> >> We certainly are requiring extra work for check developers because now
> >> they have to consider whether to make a check an alias or not because
> >> that now matters to how their check is surfaced to users.
> >>
> >> > Why should they have a say on how the user configures the tool? If
> developers have exposed this public CLI/config interface, users are free to
> use it as they see fit (and pay the price of their choices, of course).
> >> >
> >> > > You're proposing to make aliases be sort of second-class checks
> >> >
> >> > Hmm, not exactly, I agree with you that all publicly documented
> checks should be regarded as "first-class" checks. What I'm proposing here
> could be considered more as an "optimization" - don't run the same check
> twice, just re-use previous results. If you know that you are going to run
> 2 checks that are identical, just run one of them.
> >>
> >> I think that functionality is a good first step. If it's limited to
> >> only checks that are *identical*, then my concerns largely go away.
> >> But I had the impression that the desired scope was larger -- to avoid
> >> running checks that are identical when ignoring the default
> >> configurations of the checks -- and that difference is the bit that
> >> worries me.
> >>
> >> > > we've never really cared which module has the primary check and
> which one has the alias check
> >> > I actually think this is one of the problems that you were concerned
> about above. Today, not all checks are first-class. There are "primary"
> checks and "alias" checks, and a decision has to be made as to where to put
> which, which causes the very issues you are describing. Not all checks from
> cppcoreguidelines are "primary", some come from "misc", so what? They
> aren't any less valuable.
> >>
> >> Today, all checks are first-class as far as the user running the tool
> >> is concerned. We mention aliases in public so users are aware that
> >> they exist and to ease documentation burdens (maybe that was a
> >> mistake), but users don't have to care which one is primary and which
> >> ones are secondary when they run the tool. So long as we keep that
> >> property, I'm content.
> >>
> >> > I refer to my question above - why do we even need this distinction
> between "primary" and "alias", i.e. what problem is this solving? There's 3
> checks checking for C style arrays? That's great, why not just link them to
> the same documentation, and internally (not leaked to the user) use the
> same implementation?
> >>
> >> That's the whole goal of aliases. You bring up a good point on whether
> >> we need users to know about them at all.
> >>
> >> > The only use case I can think of to publicly document that there are
> "alias" checks is to let the user know that they can "skip redundant
> checks" by manually disabling them. If there's any other use case I'd be
> very happy to know!
> >>
> >> They're a convenience for developing a check once but varying its
> >> behavior across modules, as this use case comes up somewhat frequently
> >> with coding standards. So the primary check gets all the options that
> >> users can tweak, and then coding style checks set those tweakable
> >> options as they see fit. However, this possibly doesn't require the
> >> user to know about the check being an alias. And there's no benefit to
> >> running a check twice when it's configuration is identical to a
> >> previous check that I can tell.
> >>
> >> ~Aaron
> >>
> >> >
> >> > /Carlos
> >> >
> >> >
> >> >
> >> > On Sun, Oct 3, 2021 at 6:19 PM Aaron Ballman <aaron at aaronballman.com>
> wrote:
> >> >>
> >> >> On Sun, Oct 3, 2021 at 11:24 AM Aaron Ballman <
> aaron at aaronballman.com> wrote:
> >> >> > To date, aliases have been surfaced to the user as a first-class
> check
> >> >> > (while the documentation typically redirects to the primary check
> for
> >> >> > simplicity of writing check aliases, there's no requirement that
> they
> >> >> > do so; other than that, aliases should behave the same as any other
> >> >> > check as far as the user is concerned.) You're proposing to make
> >> >> > aliases be sort of second-class checks that are more closely tied
> to
> >> >> > the primary check by providing an option that says aliases are
> >> >> > disabled as a blanket operation rather than a case-by-case basis.
> I am
> >> >> > not convinced this is a good approach. We may already have checks
> for
> >> >> > which this is a problem (where the aliased check and the primary
> check
> >> >> > are checking different things), and this closes off the design
> space
> >> >> > for such checks in the future (and possibly encourages less code
> reuse
> >> >> > in the process).
> >> >> >
> >> >> > Btw, I'm not saying "no way", I'm saying "let's make sure we're not
> >> >> > regressing functionality or painting ourselves into a corner."
> >> >>
> >> >> Also, to date I believe we've never really cared which module has the
> >> >> primary check and which one has the alias check (it's fine for the
> >> >> primary to be in cert- and the alias to be in bugprone- or vice
> >> >> versa), but with this sort of an option, that implementation detail
> is
> >> >> leaked out to the user in a more obvious way. If we go this route, we
> >> >> might have to move some checks around so there's some consistent rule
> >> >> to aid users (like, the general-purpose module holds the primary and
> >> >> the coding style guidelines get the alias), but it may not work in
> >> >> every situation (I can imagine a check living in two style guides but
> >> >> not a general purpose module, or two general purpose modules but not
> a
> >> >> style guide).
> >> >>
> >> >> ~Aaron
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20211117/bfaeb005/attachment-0001.html>


More information about the cfe-dev mailing list