[cfe-dev] Proposal: Integrate static analysis test suites

Sat Jan 30 06:55:59 PST 2016

On Fri, Jan 29, 2016 at 11:32 PM, Anna Zaks <ganna at apple.com> wrote:
>
> By calling "$clang --analyze" you are not calling the compiler and asking it
> to work harder. You are calling another tool that is not going to compile
> for you but rather provide deep static code analysis.

This is not unlike the way clang-tidy works (which also runs the
analyzer!), but clang-tidy still shows compiler diagnostics.

> Calling "clang
> --analyze" could call the compiler behind the scenes and report the compiler
> warnings in addition to the static analyzer issues. However, when warnings
> from both tools are merged in a straightforward way on command line, the
> user experience could be confusing. For example, both tools report some
> issues such as warning on code like this:
>   int j = 5/0; // warning: Division by zero
>                // warning: division by zero is undefined [-Wdivision-by-zero]

This is unfortunate, but to me it shows that we have duplication of
effort in our tools. We run into the same general issue with
clang-tidy checks and the compiler, but the goal is to find one home
for that diagnostic functionality and only enable it there. If we have
diagnostics that live in both the compiler and the analyzer, we're
duplicating effort and we should strive to rectify that where
possible. There are likely to be cases where this is harder (such as
division by zero) because you want the diagnostic enabled by default
without requiring the overhead of running path-sensitive checks, but I
think there are ways we can manage that.
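
To make the distinction concrete, here is a minimal sketch of the two
kinds of division by zero. The function name `scale_down` and its
`reset` parameter are hypothetical, and whether a particular tool
reports the second case is an assumption, not something I have
verified.

```cpp
// Case 1: the divisor is a literal zero. The compiler can flag this during
// ordinary semantic analysis (-Wdivision-by-zero), so it is left commented
// out here:
//   int j = 5 / 0;

// Case 2: the divisor is zero only on one path through the function.
// Noticing this requires reasoning about paths, which is the kind of work
// a path-sensitive check does. (Hypothetical example function.)
int scale_down(int value, bool reset) {
    int divisor = 4;
    if (reset)
        divisor = 0;         // only this path leads to a division by zero
    return value / divisor;  // undefined behavior when reset == true
}
```

Calling `scale_down(8, false)` is well defined; only the `reset == true`
path is problematic, which is why a purely syntactic check is not enough
for this case.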

> Most importantly, end users should never invoke the analyzer by calling
> "clang --analyze" since "clang --analyze" is an implementation detail of the
> static analyzer. The only documented user facing clang static analysis tool
> is scan-build (see http://clang-analyzer.llvm.org). Here are some reasons
> for that. For one, it is almost impossible to understand why the static
> analyzer warns without examining the error paths. Second, the analyzer could
> be extended to perform whole project analysis in the future and "clang
> --analyze" works with a single TU at a time.

As a counter-example to the need to examine the error paths, the
compiler has thread role analysis diagnostics (among others) that are
also flow-sensitive and it's never been an issue that users must
examine the error paths, so I'm not certain that's a particularly
compelling reason to require a separate tool. Even templates and
macros require a lot of "path" archaeology, and we've found some
excellent ways to surface that from the compiler.

Whole-program analysis *is* a reasonably compelling reason for a
separate tool; however, I don't think it should drive the design for
the user interface. For instance, Visual Studio does not require
execution of a separate tool to enable its static analysis (which I
believe does whole-program analysis). So how do we
intend for clang-cl to support the /analyze option? Since we don't
have whole-program analysis currently, it seems like such a feature
could be designed to operate from a compiler flag (for instance, in
conjunction with compilation databases) that is responsible for
spawning off that secondary tool when required. (Note, I think a
similar approach could be used to support running clang-tidy from the
compiler via a command line flag.)
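
As a rough sketch of what I mean by a flag that spawns secondary
tools: the driver could map its arguments to a plan of which back ends
to run, while still compiling as usual. Everything here is
hypothetical (the `Plan` struct, `plan_from_args`, and the `--run-tidy`
flag are invented for illustration); only `--analyze` and `/analyze`
come from this thread, and this is not how the driver treats them
today.

```cpp
#include <string>
#include <vector>

// Hypothetical: which back-end tools one driver invocation would spawn.
struct Plan {
    bool compile = true;   // compiler diagnostics are always produced
    bool analyze = false;  // run the static analyzer as well
    bool tidy = false;     // run clang-tidy as well
};

// Hypothetical helper: scan the command line and decide what to run.
Plan plan_from_args(const std::vector<std::string>& args) {
    Plan plan;
    for (const std::string& arg : args) {
        if (arg == "--analyze" || arg == "/analyze")
            plan.analyze = true;
        else if (arg == "--run-tidy")  // invented flag, for illustration only
            plan.tidy = true;
    }
    return plan;
}
```

With this shape, `plan_from_args({"/analyze", "foo.c"})` asks for both
compilation and analysis, so the user sees one invocation regardless of
how many tools run underneath.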

> I agree that the best user experience is to report all warnings in one
> place, while still differentiating which warning was reported by which tool.
> It would be awesome if the results from all bug finding tools such as the
> clang static analyzer, the compiler, and clang-tidy would be reported
> through the same interface.

I think we are in agreement, but to verify what I think we're agreeing
on: users don't particularly care about the *tool* used nearly so much
as they care about getting the diagnostics themselves. (For instance,
users don't care if it's a parser error, a semantic error, a
path-sensitive error, etc.) When it comes to diagnostics, the easier
we can make it for the user to enable the functionality, the greater
the chance of users actually using it. Based on that, having a single
mechanism the user can invoke to give them diagnostics (such as the
clang driver itself) is something we should strive towards, even if
that means executing different libraries or executables under the hood
(like we do with cc1). Obviously, *reporting* all the diagnostics in a
single place falls naturally out of invocation of a single tool. Does
that agree with what you were saying, or am I misinterpreting?
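
For what it's worth, here is a minimal sketch of what "one place,
still differentiated by tool" could look like as a data structure. The
`Diagnostic` struct and `group_by_location` are hypothetical names,
not existing clang APIs; the idea is only that diagnostics from every
tool are grouped by source location while each keeps a tag for the
tool that produced it.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record for a diagnostic from any tool.
struct Diagnostic {
    std::string file;
    unsigned line;
    std::string tool;     // e.g. "compiler", "analyzer", "clang-tidy"
    std::string message;
};

using Location = std::pair<std::string, unsigned>;

// Group diagnostics by (file, line). Entries keep their `tool` tag, so a
// front end can show overlapping reports together instead of interleaved.
std::map<Location, std::vector<Diagnostic>>
group_by_location(const std::vector<Diagnostic>& all) {
    std::map<Location, std::vector<Diagnostic>> grouped;
    for (const Diagnostic& d : all)
        grouped[{d.file, d.line}].push_back(d);
    return grouped;
}
```

In the division by zero case quoted earlier, both reports would land
under the same location key, which makes the overlap visible and gives
the presentation layer a place to decide how to show it.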

> The CodeChecker team is working on a solution for that and I hope we can
> incorporate their technology in LLVM/clang.

That's fantastic! Thank you for the explanations, as well as all the
hard work on this tool.

~Aaron