[cfe-dev] Clang Analysis of several open source projects.

Thu May 12 13:08:40 PDT 2011

On May 12, 2011, at 11:19 AM, John Smith wrote:

> But my main point wasn't really finding bugs in the projects
> themselves, but finding & fixing bugs in the analyzer (by decreasing
> the potential for false positives).

Thanks John.  That's what I am hopeful for as well.

To make this exercise as constructive as possible, we need actual bug reports against the analyzer.  Wading through a sea of reports and complaining that there are too many false positives isn't constructive or helpful on its own.

Typically, useful bug reports have the following characteristics:

a) They give a concise but precise diagnosis of what the analyzer isn't reasoning about correctly.

b) They provide a test case, in the form of a preprocessed file, that can be used later to reproduce the issue.  (Also include the platform/arch you are on when filing the report.)

The scan-build results are useful, but they can't be replayed in a debugger session, and that ability matters when debugging the analyzer.  Typically, I have found three kinds of analyzer false positives:

1) The analyzer doesn't know about some higher-level program invariant that the developer knows about and is implicitly relying upon.  The discussion there should be about how to help the analyzer become more educated about such invariants.  Sometimes the answer is interprocedural analysis, sometimes it's annotations, etc.  There are some actual design tradeoffs to be made here in "fixing" these kinds of issues.  In some cases, restructuring the original code to make it easier to reason about is the best answer (but that depends on who is voicing the opinion and on what codebase).

2) The analyzer has an outright bug in handling a specific edge case.  Typically these require a modest amount of change to the analyzer, but having a test case is really key to diagnosing these issues.  These are honestly the easiest issues to fix.

3) The analyzer has an algorithmic problem with reasoning about some code.  For example, the analyzer doesn't currently reason about bit fields.  It also lacks the ability to reason about linear constraints (e.g., a + b > c).  Some of these are known issues, others are not.  Having concrete examples really helps.
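
To make kinds 1 and 3 concrete, here is a hypothetical sketch in C.  None of these functions come from a real project, and whether a particular analyzer build actually reports them is an assumption, not something verified here.  value_of_known_key relies on a caller-side invariant and states it with an assert, which the analyzer can take into account because a failed assert does not return.  read_first and pick guard a pointer dereference with a bit field and with a condition of the form a + b > c, respectively.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this only illustrates the shape of the code. */

struct table {
    int keys[4];
    int vals[4];
};

/* Returns NULL when the key is absent. */
static int *lookup_slot(struct table *t, int key) {
    for (size_t i = 0; i < 4; i++) {
        if (t->keys[i] == key)
            return &t->vals[i];
    }
    return NULL;
}

/* Kind 1: the caller "knows" the key was inserted earlier.  The assert
   states that invariant in the code, so the NULL path is cut off before
   the dereference. */
int value_of_known_key(struct table *t, int key) {
    int *slot = lookup_slot(t, key);
    assert(slot != NULL);
    return *slot;
}

struct flags {
    unsigned ready : 1;
    unsigned mode : 3;
};

/* Kind 3 (bit fields): buf is non-NULL whenever f->ready is 1, but seeing
   that requires tracking the value stored in the bit field. */
int read_first(struct flags *f, const int *src) {
    const int *buf = NULL;
    f->ready = 0;
    if (src != NULL) {
        buf = src;
        f->ready = 1;
    }
    if (f->ready)
        return buf[0];
    return -1;
}

/* Kind 3 (linear constraints): both branches test a + b > c, so p is
   non-NULL at the dereference, but only if the two tests are correlated. */
int pick(const int *values, unsigned a, unsigned b, unsigned c) {
    const int *p = NULL;
    if (a + b > c)
        p = values;
    if (a + b > c)
        return p[0];
    return 0;
}
```

A report built around a small function like read_first or pick, attached as a preprocessed file along with the platform/arch, is the kind of concrete example that helps.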

Beyond filing static analyzer bug reports, it would also be great if anyone wanted to help with any of the following projects:

1) Update or overhaul http://clang-analyzer.llvm.org to have more information about extracting maximum value from the analyzer.

2) Making scan-build more awesome by making it more turnkey, or by giving it a much better way of presenting analysis results.  There's a ton of stuff we could do here.  I'm not a web developer, so scan-build's HTML reports are what they are because I don't have the expertise to make them better.

3) Integrating the analyzer into other IDEs, such as Eclipse.

4) Helping to improve the analyzer's precision, or working on new checkers.

I really should document all of this on the clang-analyzer website.

Anyhow, thanks again everyone for running the analyzer on these projects.  I do appreciate your level of enthusiasm; my main concern is channeling that enthusiasm in a way that has maximum value.

Cheers,
Ted