[cfe-dev] More plans on Static Analyzer + Clang-Tidy interoperation.

Thu Oct 15 11:54:06 PDT 2020

On Wed, Oct 14, 2020 at 4:28 PM Artem Dergachev <noqnoqneo at gmail.com> wrote:
> On 10/13/20 5:47 AM, Aaron Ballman wrote:
> > On Mon, Oct 12, 2020 at 5:26 PM Artem Dergachev via cfe-dev
> > <cfe-dev at lists.llvm.org> wrote:
> >> After a bit of hiatus i'm reviving this work. The previous discussion is
> >> at http://lists.llvm.org/pipermail/cfe-dev/2019-August/063092.html, also
> >> http://lists.llvm.org/pipermail/cfe-dev/2019-September/063229.html. The
> >> plan is not to turn the two Clang bug-finding tools into a single tool
> >> entirely but rather to make them more interchangeable, so that users who
> >> have existing integration of, say, static analyzer, could also have
> >> clang-tidy integration for as-free-as-possible. In particular,
> >> checks/checkers could be shared, which would resolve the constant
> >> struggle of "where to put the checker?".
> > Thank you for looking into this topic again!
> >
> >> One thing i realized since the last discussion is that the more tools we
> >> integrate, the more redundant compilation work do we perform. If we are
> >> to compile the code + run static analyzer + clang-tidy separately over
> >> it, that's 3 rebuilds of the AST from scratch. Whatever solution we
> >> provide to run both tools, I'd much rather keep it at 2 (compilation +
> >> *all* analysis) because static analysis is already expensive in terms of
> >> build-time, no need to make it worse.
> > +1
> >
> >> One core component of this plan is to teach clang-tidy how to emit
> >> reports in various human-readable and machine-readable formats that the
> >> static analyzer already supports. At this point i'm pretty much ready to
> >> publish a clang::DiagnosticConsumer that'd produce all kinds of static
> >> analyzer-style reports. In particular, such consumer would enable
> >> clang-tidy to be used from inside scan-build instead of clang --analyze;
> >> both clang-tidy checks and static analyzer checkers would be ran from
> >> inside clang-tidy binary but produce html reports consumable by
> >> scan-build; a common index.html report would be generated for all
> >> checkers. I'm very much in favor of teaching scan-build how to run
> >> arbitrary clang tools, not just clang-tidy (i.e., "scan-build
> >> --tool=/path/to/my-clang-tool" or something like that) which would allow
> >> users who don't have CMake compilation databases to take advantage of
> >> clang tools more easily (and we have a few users who are interested in
> >> that).
> > I think this sounds really useful.
> >
> >> On the other hand, we've also made up our mind that for ourselves we do
> >> in fact want produce a clang binary with clang-tidy checkers integrated.
> > I continue to disagree with that stance, FWIW. I want to see clang
> > able to run clang-tidy checks the same as clang static analyzer checks
> > because having a single tool to run which produces diagnostic results
> > is easier than running multiple tools in build systems or CI
> > pipelines. I realize that the community may not share this position,
> > but I hope we're willing to revisit the discussion.
>
> Just to double-check, sounds like we two actually *agree* on this(?)
> like, we both want clang with clang-tidy checks?

Sorry for the confusion, yes, I think we agree on this point -- we
both want clang with clang-tidy checks. I was disagreeing with the
status quo.

> >> Apart from free integration into a number of our CI systems, that'll
> >> allow us to avoid shipping and supporting the clang-tidy command line
> >> interface as a separate feature. That's about 7MB increase in clang
> >> binary size which we're fine with. I plan to make it an off-by-default
> >> cmake flag unless there are strong opinions to do the opposite.
> > I don't have a strong opinion to do the opposite, but I'd like to at
> > least explore it as an option.
> >
> >> The
> >> alternative approach to move ourselves into a separate binary that's
> >> integrated at the Driver level would also technically work but that's
> >> too disruptive to commit to for us at the moment - even just the Driver
> >> work alone would require a lot of testing, let alone making sure that
> >> static analyzer works exactly as before from within the tool (it already
> >> works from inside clang-tidy but we don't really know if it actually
> >> works *the same* in all the potential cornercases).
> > I think this approach would add more complexity to the question "what
> > tool do I reach for".
> >
> >> So, like, we want to support multiple workflows.
> >>
> >> Also i'll be making occasional commits to some clang-tidy checks that
> >> we're interested in - mostly to address a few false positives (say, some
> >> checkers aren't aware of some Objective-C constructs), but also
> >> sometimes tweak the warning text a little bit (for example,
> >> bugprone-assert-side-effect is an awesome check but the warning text
> >> "found assert() with side effect" sounds fairly un-compiler-ish - we're
> >> not playing hide-and-seek!). Hope i'm welcome with these changes ^.^
> > Most certainly! One other thing that I think we should think about is:
> > now that we're looking to more tightly couple the static analyzer and
> > clang-tidy, should we unify their diagnostic message styles?
> > clang-tidy follows the Clang conventions and does not use
> > grammatically correct diagnostic messages (no capitalization or
> > punctuation) while the static analyzer has its own convention
> > (capitalization but no punctuation).
>
> This is an important thing to do for any UI that displays both static
> analyzer warnings and clang-tidy warnings; different capitalization is
> indeed a bummer.

Agreed.

> It's not uncommon for UIs/IDEs to re-capitalize all warnings into a
> single style. If we suddenly change static analyzer warnings to be
> lowercase now, i think most UIs will simply re-capitalize them back, so
> that to preserve the existing experience.

I know of at least one consumer of diagnostic output that doesn't do
that, FWIW. They just faithfully reproduce whatever the tool produces.

> I could also teach my intermediate DiagnosticConsumer to do the same.
>
> Or i could teach individual PathDiagnosticConsumers (the ones that are
> responsible for static analyzer-style outputs) to do that for themselves
> specifically - that actually sounds like a very good plan, probably
> *the* way forward.

Are you suggesting that Clang diagnostics would stay the same when
issued from Clang, but would be automatically capitalized if issued
through the static analyzer diagnostic consumer?

> > Relatedly, do you expect to
> > expose some of the handy diagnostic utilities from the static analyzer
> > to clang-tidy checks, such as the ability to report a path that led to
> > a diagnostic (for the few checks that track that sort of information)?
>
> These facilities will be exposed once https://reviews.llvm.org/D67422
> actually lands.

Huttah!

> That said, i don't think path notes specifically will be useful for any
> of the existing clang-tidy checks. IIUC the only checker so far that
> could potentially make use of these notes is the use-after-move checker
> - but it doesn't actually find the path, it only finds two points on the
> CFG with a happens-before relation. Static analyzer path notes make
> sense because every single point in the path potentially contributes to
> the warning. Which implies that behind the warning there's a *huge*
> facility that understands potential implications of *every* statement.
> When it's just two points that contribute to the path, two notes are enough.

That's true -- I don't think a lot of existing clang-tidy checks were
written with path functionality in mind.

> One thing i'm excited about, though, is eventually producing a good bug
> visualization for flow-sensitive warnings. As i mentioned a few times,
> flow-sensitive warnings should be displayed together rather than
> separately in order to be understandable: "This code is dead *because*
> this condition is always false *because* this assignment to the variable
> that participates in the condition is a dead store". I believe static
> analyzer's rich diagnostics system could contribute to that dream.

Strongly agreed! Perhaps I used the wrong terminology (or
misunderstood some other way), but this was the use case I was excited
about supporting. Making it more clear which particular set of
statements led to the decision to issue a diagnostic is really helpful
functionality when reasoning about how to resolve certain diagnostics.

Thanks!

~Aaron

>
> > Thanks!
> >
> > ~Aaron
> >
> >>
> >> _______________________________________________
> >> cfe-dev mailing list
> >> cfe-dev at lists.llvm.org
> >> https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev
>