[cfe-dev] More plans on Static Analyzer + Clang-Tidy interoperation.

Wed Oct 14 06:28:29 PDT 2020

Hi Artem, 

See some comments inline, marked with DK>.

Regards,
Daniel
-----Original Message-----
From: cfe-dev <cfe-dev-bounces at lists.llvm.org> On Behalf Of Artem Dergachev via cfe-dev
Sent: Monday, October 12, 2020 11:26 PM
To: cfe-dev <cfe-dev at lists.llvm.org>; Valeriy Savchenko <vsavchenko at apple.com>; Ravi <rkandhadaimadhav at apple.com>; Dmitri Gribenko <gribozavr at gmail.com>
Subject: [cfe-dev] More plans on Static Analyzer + Clang-Tidy interoperation.

After a bit of a hiatus I'm reviving this work. The previous discussion is at http://lists.llvm.org/pipermail/cfe-dev/2019-August/063092.html, also http://lists.llvm.org/pipermail/cfe-dev/2019-September/063229.html. The plan is not to turn the two Clang bug-finding tools into a single tool entirely but rather to make them more interchangeable, so that users who have an existing integration of, say, the static analyzer could also get clang-tidy integration as close to free as possible. In particular, checks/checkers could be shared, which would resolve the constant struggle of "where to put the checker?".

One thing I realized since the last discussion is that the more tools we integrate, the more redundant compilation work we perform. If we are to compile the code + run static analyzer + clang-tidy separately over it, that's 3 rebuilds of the AST from scratch. Whatever solution we provide to run both tools, I'd much rather keep it at 2 (compilation + *all* analysis) because static analysis is already expensive in terms of build time; no need to make it worse.

DK> Completely agree. Our users run both clang-tidy and the clang static analyzer on their code from CodeChecker. The analysis is run for each TU separately by clang-tidy and separately by clang SA. It would be great if we could save the redundant parsing work. So if you add this option to the static analyzer then we would run the clang-tidy checks through this too (what about fixits? would they be generated?).
In CodeChecker we primarily use the machine-readable plist format to generate any further output format for the user: command line output, static web HTML, or central server storage. We actually convert the output of clang-tidy and other static analyzer tools (such as cppcheck, pylint...) to plist too (because it is machine readable).
We are open to contributing this plist_to_html converter tool (https://github.com/Ericsson/codechecker/tree/master/tools/plist_to_html) to the LLVM repo if it is of interest to others too.
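
For illustration, converting clang-tidy's text output to plist could be sketched roughly as follows. This is not CodeChecker's actual converter. The function name tidy_output_to_plist is hypothetical, the key names ("files", "diagnostics", "check_name", "location") are only loosely modelled on the analyzer's plist reports, and the regular expression assumes the usual "file:line:col: severity: message [check-name]" diagnostic line.

```python
import plistlib
import re

# Assumed shape of a clang-tidy diagnostic line:
#   path/to/file.c:12:5: warning: some message [check-name]
DIAG_RE = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<severity>warning|error): (?P<message>.*?) \[(?P<check>[^\]]+)\]$"
)


def tidy_output_to_plist(text):
    """Return plist bytes for the warnings/errors found in `text`.

    Hypothetical helper: notes, source snippets and fix-its are ignored.
    """
    files = []        # unique file paths, referenced by index below
    diagnostics = []
    for raw in text.splitlines():
        match = DIAG_RE.match(raw.strip())
        if match is None:
            continue  # not a warning/error line (e.g. a code snippet)
        path = match.group("file")
        if path not in files:
            files.append(path)
        diagnostics.append({
            "description": match.group("message"),
            "check_name": match.group("check"),
            "category": match.group("severity"),
            "location": {
                "line": int(match.group("line")),
                "col": int(match.group("col")),
                "file": files.index(path),
            },
        })
    # plistlib emits XML plist by default
    return plistlib.dumps({"files": files, "diagnostics": diagnostics})
```

The result can be read back with plistlib.loads() by whatever generates the command line, HTML or server output, which is the main appeal of going through one machine-readable format.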

One core component of this plan is to teach clang-tidy how to emit reports in the various human-readable and machine-readable formats that the static analyzer already supports. At this point I'm pretty much ready to publish a clang::DiagnosticConsumer that'd produce all kinds of static analyzer-style reports. In particular, such a consumer would enable clang-tidy to be used from inside scan-build instead of clang --analyze; both clang-tidy checks and static analyzer checkers would be run from inside the clang-tidy binary but produce HTML reports consumable by scan-build; a common index.html report would be generated for all checkers. I'm very much in favor of teaching scan-build how to run arbitrary clang tools, not just clang-tidy (i.e., "scan-build --tool=/path/to/my-clang-tool" or something like that), which would allow users who don't have CMake compilation databases to take advantage of clang tools more easily (and we have a few users who are interested in that).

DK> Good idea! Actually if we used the plist_to_html converter or similar (in Python), we wouldn’t need to generate HTML code from clang, which gives greater flexibility (more libraries available in Python for this etc.). One other thing we added to CodeChecker which could be of generic interest: a gcc to clang compilation database converter. We have many users who have gcc cross-compiled projects and who want to use the clang static analyzer and clang-tidy to analyze their code. So in CodeChecker we implemented a conversion tool that transforms a gcc compilation database into a clang-compatible compilation database. It detects the built-in configuration of the gcc compiler used for the original compilation (target architecture, built-in defines, built-in include paths), removes the gcc-only parameters, and creates a clang-compatible compilation database with all this information. This can then be used by any clang tool (clang-tidy, clang-format, clangd etc.) to run analysis on gcc cross-compiled code without errors. This part could also be factored out as a common component for LLVM if you guys find it interesting.
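
To make the conversion steps above concrete, here is a minimal sketch of rewriting compilation database entries. It is not CodeChecker's implementation. The names convert_entry and convert_database and the GCC_ONLY_FLAGS list are hypothetical, and the target, include paths and defines are passed in as plain values rather than detected. In practice they might be collected from the original compiler (for example from the output of gcc -dumpmachine and gcc -E -v), which is left out here.

```python
import shlex

# Illustrative placeholder only: a real tool would need a maintained list
# of options that gcc accepts and clang rejects.
GCC_ONLY_FLAGS = {"-fno-var-tracking-assignments", "-fconserve-stack"}


def convert_entry(entry, target, builtin_includes, builtin_defines):
    """Rewrite one compilation database entry for use by clang tools.

    `target`, `builtin_includes` and `builtin_defines` are assumed to have
    been detected from the original gcc beforehand.
    """
    # An entry carries either an "arguments" list or a "command" string.
    args = entry.get("arguments") or shlex.split(entry["command"])

    # Keep C vs. C++ driver mode based on the original compiler name.
    compiler = "clang++" if args[0].endswith("++") else "clang"

    # Drop the options assumed to be gcc-only.
    kept = [a for a in args[1:] if a not in GCC_ONLY_FLAGS]

    # Re-create the gcc built-in configuration explicitly.
    extra = ["--target=" + target]
    extra += ["-D" + define for define in builtin_defines]
    for include_dir in builtin_includes:
        extra += ["-isystem", include_dir]

    return {
        "directory": entry["directory"],
        "file": entry["file"],
        "arguments": [compiler] + extra + kept,
    }


def convert_database(entries, target, builtin_includes, builtin_defines):
    """Convert every entry of a loaded compile_commands.json list."""
    return [
        convert_entry(e, target, builtin_includes, builtin_defines)
        for e in entries
    ]
```

The converted list can be written back out with json.dump() as a compile_commands.json for clang-tidy, clangd and similar tools. Taking the detected values as parameters keeps the rewriting step separate from the compiler probing.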

On the other hand, we've also made up our mind that for ourselves we do in fact want to produce a clang binary with clang-tidy checkers integrated. Apart from free integration into a number of our CI systems, that'll allow us to avoid shipping and supporting the clang-tidy command line interface as a separate feature. That's about a 7MB increase in clang binary size, which we're fine with. I plan to make it an off-by-default cmake flag unless there are strong opinions to do the opposite. The alternative approach of moving ourselves into a separate binary that's integrated at the Driver level would also technically work but that's too disruptive for us to commit to at the moment - even just the Driver work alone would require a lot of testing, let alone making sure that the static analyzer works exactly as before from within the tool (it already works from inside clang-tidy but we don't really know if it actually works *the same* in all the potential corner cases).

So, like, we want to support multiple workflows.

Also I'll be making occasional commits to some clang-tidy checks that we're interested in - mostly to address a few false positives (say, some checkers aren't aware of some Objective-C constructs), but also sometimes to tweak the warning text a little bit (for example, bugprone-assert-side-effect is an awesome check but the warning text "found assert() with side effect" sounds fairly un-compiler-ish - we're not playing hide-and-seek!). Hope I'm welcome with these changes ^.^
