[cfe-dev] [clang-tidy][RFC] Run each check only once

Carlos Galvez via cfe-dev cfe-dev at lists.llvm.org
Sat Nov 20 06:07:05 PST 2021


Here it goes! https://reviews.llvm.org/D114317

As I wrote in the commit message, please take it with a grain of salt - the
code is not pretty and many things are missing - I'll take care of that
if/when we are happy with the general direction. Looking forward to your
feedback! If you have suggestions for the missing bits I'd be very happy to
hear them.

/Carlos

On Thu, Nov 18, 2021 at 1:46 PM Aaron Ballman <aaron at aaronballman.com>
wrote:

> On Wed, Nov 17, 2021 at 7:36 AM Carlos Galvez <carlosgalvezp at gmail.com>
> wrote:
> >
> > Hi again!
> >
> > I've been looking into this for a while now and have some very crude
> proof-of-concept that seems to work without any impact for check developers
> and ensuring only "perfect aliases" are not run. The only things I can't
> get around are diagnostics and profiling, so I'd need some help to figure
> that out (or drop the requirements). Should I push this proof-of-concept
> for review and continue the discussion there, or do you prefer to discuss
> here in the mailing list?
>
> If you've got a proof of concept, I think it's probably worth showing
> at this point. (Just as an FYI for setting expectations, my review
> time is pretty limited for the next while.)
>
> ~Aaron
>
> >
> > Best regards,
> > /Carlos
> >
> >
> > On Mon, Oct 4, 2021 at 9:03 PM Aaron Ballman <aaron at aaronballman.com>
> wrote:
> >>
> >> On Mon, Oct 4, 2021 at 2:30 PM Carlos Galvez <carlosgalvezp at gmail.com>
> wrote:
> >> >
> >> > I think we agree in the goals of this change, perhaps it gets
> confusing when using the term "alias" with different meanings, so let me
> summarize what I believe is the requirements of what we want to have,
> without entering in technicalities:
> >> >
> >> > * All publicly documented checks should be "first-class checks",
> meaning users can be enable/disable them independently of each other. Not
> the scope of this change to deprecate the concept/name of "alias", perhaps
> in the future.
> >>
> >> +1. To be super clear, I consider checks which redirect their
> >> documentation to still count as "publicly documented checks".
> >>
> >> > * Users don't need to know/care how these checks are implemented. If
> they reuse code or not, that's not their concern. They just need to care
> that their code will get checked properly when the check is enabled, so
> that they can fix bugs, improve code quality or be compliant with a given
> guideline.
> >>
> >> +1
> >>
> >> > * Check developers should keep their exact same development flow, and
> be able to reuse code if possible as they do today.
> >>
> >> +1
> >>
> >> > * Identical checks should run only once. The definition of
> "identical" is: a) they have the same implementation and b) they have the
> same configuration. Same implementation but different configuration is not
> identical checks, so they would all still be run. So the scope is smaller
> than what it might have looked like in my first post, I hope it's clear now.
> >>
> >> +1
> >>
> >> > * Two additional points of discussion: diagnostics and profiling. IMO
> implementing this change would be easier if we just skipped diagnostics and
> profiling for the not-run "identical" checks. We have a list of all checks,
> then remove the identical ones and run the updated list of checks. Keeping
> diagnostics and profiling info would probably complicate the design of the
> tool, but maybe someone has clever ideas. From a user perspective, I think
> it's neat to avoid diagnostics from identical checks, since it's less noisy
> to NOLINT them in code.
> >>
> >> Regarding diagnostics, what I'm hoping we can do is keep the
> >> diagnostic behavior we already have. e.g.,
> >> https://godbolt.org/z/ss4eKsadb where the diagnostic says:
> >>
> >> warning: found assert() that could be replaced by static_assert()
> >> [cert-dcl03-c,misc-static-assert]
> >>
> >> and lists both of the ways in which the diagnostic can be triggered
> >> but does not show hicpp-static-assert which is a third way to trigger
> >> the diagnostic (but was never enabled). I think this is useful because
> >> not listing the identical checks can potentially cause some pain for
> >> users depending on how they fix the diagnostic or migrate their code.
> >> While this one would be most sensible to just change the code, it's
> >> not unthinkable that a developer will add a `// NOLINT` comment to
> >> disable the diagnostic. If they're only shown one check name in the
> >> diagnostic, the user may elect to write `// NOLINT(cert-dcl03-c)` to
> >> disable *only* the check name they see. They'll be rather surprised if
> >> they decide to drop checking the CERT module and they start getting
> >> diagnostics on that line about misc-static-assert when the code was
> >> previously suppressing the warning. Seeing multiple checks that
> >> trigger will hopefully help the user to know to write `// NOLINT`
> >> instead of a specific check, or may help gently guide them towards
> >> fixing their code instead of suppressing the diagnostic.
> >>
> >> That said, if this turns out to be an implementation burden in
> >> practice, I think the behavior is fine *so long as* the user who adds
> >> `// NOLINT(cert-dcl03-c)` doesn't have the experience of silencing the
> >> cert diagnostic while still emitting the misc diagnostic (which, if
> >> we're not running the duplicate check in the first place, shouldn't
> >> happen anyway).
> >>
> >> > Would you agree on the above?
> >>
> >> I think we're in agreement, yes!
> >>
> >> ~Aaron
> >>
> >> > > So the primary check gets all the options that users can tweak, and
> then coding style checks set those tweakable options as they see fit
> >> > Thanks, this helps me understand the design. Makes sense to have a
> "core" implementation of the check with knobs that can be tweaked via
> config for each guideline. I have some thoughts about build system and
> where to put such "common" checks that are re-usable, but let's leave that
> for another time :)
> >> >
> >> > On Mon, Oct 4, 2021 at 4:35 PM Aaron Ballman <aaron at aaronballman.com>
> wrote:
> >> >>
> >> >> On Sun, Oct 3, 2021 at 8:48 PM Carlos Galvez <
> carlosgalvezp at gmail.com> wrote:
> >> >> >
> >> >> > > Making it easier
> >> >> > for users to enable all of the checks at once requires check
> authors
> >> >> > to consider *all* interactions between modules, and to date that
> isn't
> >> >> > a price we've been willing to make check authors pay. Instead, we
> >> >> > typically view modules as being orthogonal sets of functionality
> that
> >> >> > may or may not play well together.
> >> >> >
> >> >> > Fully agree to that, checks should absolutely be independent,
> granular and focused on doing one single thing without depending on one
> another. And they may conflict with each other. My point is that it should
> be the user making the decision as to how to handle conflicts, enabling and
> disabling whichever checks they want. Why is that a problem?
> >> >>
> >> >> I am not convinced that it's the right granularity. We've developed
> >> >> aliases with the idea that as far as the user is concerned, which is
> >> >> primary and which is aliased should not matter. Now we want to change
> >> >> that with a blanket switch that impacts all aliases. This isn't a
> >> >> problem per se, but I'm not convinced we're ready to add a switch to
> >> >> disable all aliases yet.
> >> >>
> >> >> > We are not requiring extra work on LLVM developers.
> >> >>
> >> >> We certainly are requiring extra work for check developers because
> now
> >> >> they have to consider whether to make a check an alias or not because
> >> >> that now matters to how their check is surfaced to users.
> >> >>
> >> >> > Why should they have a say on how the user configures the tool? If
> developers have exposed this public CLI/config interface, users are free to
> use it as they see fit (and pay the price of their choices, of course).
> >> >> >
> >> >> > > You're proposing to make aliases be sort of second-class checks
> >> >> >
> >> >> > Hmm, not exactly, I agree with you that all publicly documented
> checks should be regarded as "first-class" checks. What I'm proposing here
> could be considered more as an "optimization" - don't run the same check
> twice, just re-use previous results. If you know that you are going to run
> 2 checks that are identical, just run one of them.
> >> >>
> >> >> I think that functionality is a good first step. If it's limited to
> >> >> only checks that are *identical*, then my concerns largely go away.
> >> >> But I had the impression that the desired scope was larger -- to
> avoid
> >> >> running checks that are identical when ignoring the default
> >> >> configurations of the checks -- and that difference is the bit that
> >> >> worries me.
> >> >>
> >> >> > > we've never really cared which module has the primary check and
> which one has the alias check
> >> >> > I actually think this is one of the problems that you were
> concerned about above. Today, not all checks are first-class. There are
> "primary" checks and "alias" checks, and a decision has to be made as to
> where to put which, which causes the very issues you are describing. Not
> all checks from cppcoreguidelines are "primary", some come from "misc", so
> what? They aren't any less valuable.
> >> >>
> >> >> Today, all checks are first-class as far as the user running the tool
> >> >> is concerned. We mention aliases in public so users are aware that
> >> >> they exist and to ease documentation burdens (maybe that was a
> >> >> mistake), but users don't have to care which one is primary and which
> >> >> ones are secondary when they run the tool. So long as we keep that
> >> >> property, I'm content.
> >> >>
> >> >> > I refer to my question above - why do we even need this
> distinction between "primary" and "alias", i.e. what problem is this
> solving? There's 3 checks checking for C style arrays? That's great, why
> not just link them to the same documentation, and internally (not leaked to
> the user) use the same implementation?
> >> >>
> >> >> That's the whole goal of aliases. You bring up a good point on
> whether
> >> >> we need users to know about them at all.
> >> >>
> >> >> > The only use case I can think of to publicly document that there
> are "alias" checks is to let the user know that they can "skip redundant
> checks" by manually disabling them. If there's any other use case I'd be
> very happy to know!
> >> >>
> >> >> They're a convenience for developing a check once but varying its
> >> >> behavior across modules, as this use case comes up somewhat
> frequently
> >> >> with coding standards. So the primary check gets all the options that
> >> >> users can tweak, and then coding style checks set those tweakable
> >> >> options as they see fit. However, this possibly doesn't require the
> >> >> user to know about the check being an alias. And there's no benefit
> to
> >> >> running a check twice when it's configuration is identical to a
> >> >> previous check that I can tell.
> >> >>
> >> >> ~Aaron
> >> >>
> >> >> >
> >> >> > /Carlos
> >> >> >
> >> >> >
> >> >> >
> >> >> > On Sun, Oct 3, 2021 at 6:19 PM Aaron Ballman <
> aaron at aaronballman.com> wrote:
> >> >> >>
> >> >> >> On Sun, Oct 3, 2021 at 11:24 AM Aaron Ballman <
> aaron at aaronballman.com> wrote:
> >> >> >> > To date, aliases have been surfaced to the user as a
> first-class check
> >> >> >> > (while the documentation typically redirects to the primary
> check for
> >> >> >> > simplicity of writing check aliases, there's no requirement
> that they
> >> >> >> > do so; other than that, aliases should behave the same as any
> other
> >> >> >> > check as far as the user is concerned.) You're proposing to make
> >> >> >> > aliases be sort of second-class checks that are more closely
> tied to
> >> >> >> > the primary check by providing an option that says aliases are
> >> >> >> > disabled as a blanket operation rather than a case-by-case
> basis. I am
> >> >> >> > not convinced this is a good approach. We may already have
> checks for
> >> >> >> > which this is a problem (where the aliased check and the
> primary check
> >> >> >> > are checking different things), and this closes off the design
> space
> >> >> >> > for such checks in the future (and possibly encourages less
> code reuse
> >> >> >> > in the process).
> >> >> >> >
> >> >> >> > Btw, I'm not saying "no way", I'm saying "let's make sure we're
> not
> >> >> >> > regressing functionality or painting ourselves into a corner."
> >> >> >>
> >> >> >> Also, to date I believe we've never really cared which module has
> the
> >> >> >> primary check and which one has the alias check (it's fine for the
> >> >> >> primary to be in cert- and the alias to be in bugprone- or vice
> >> >> >> versa), but with this sort of an option, that implementation
> detail is
> >> >> >> leaked out to the user in a more obvious way. If we go this
> route, we
> >> >> >> might have to move some checks around so there's some consistent
> rule
> >> >> >> to aid users (like, the general-purpose module holds the primary
> and
> >> >> >> the coding style guidelines get the alias), but it may not work in
> >> >> >> every situation (I can imagine a check living in two style guides
> but
> >> >> >> not a general purpose module, or two general purpose modules but
> not a
> >> >> >> style guide).
> >> >> >>
> >> >> >> ~Aaron
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20211120/ce18f521/attachment-0001.html>


More information about the cfe-dev mailing list