[cfe-dev] Static analysis output format
kremenek at apple.com
Mon Jul 7 13:21:14 PDT 2008
On Jul 4, 2008, at 2:41 PM, David Smith wrote:
> As we've been working through the list of results from static
> analysis for Adium it's become increasingly clear that the output
> format is introducing some complications. Specifically, each time we
> rerun (whether to use an updated version of checker, or to check
> against the latest source) it eliminates any metadata that we've built
> up around the results, such as which ones were false positives.
> Unfortunately, fixing this seems somewhat tricky. The main thing that
> would be necessary is a way of identifying results across runs. That
> way we can plug this into our automated testing system so each time we
> commit it can rerun and say "ok, these ones are known, these ones are
> known false positives, and these ones are new" rather than just
> "here's a list to re-evaluate".
I believe this is a necessary feature, and I think it is one that will
take several iterations to get right.
> I'm not sure how to come up with some
> sort of identifier for issues though. Line numbers probably change too
> frequently to be reliable. I suppose a heuristic based on function
> name, issue type, file name, and approximate line number might be
> fairly accurate.
This seems like a very reasonable heuristic. Even omitting the line
number might be fine for now.
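As a rough illustration of that heuristic, here is a minimal Python
sketch. It assumes each result is available as a function name, an
issue type, and a file path; the helpers issue_id and triage are
hypothetical names for this example and are not part of scan-build.

```python
import hashlib
import os


def issue_id(function, issue_type, path):
    """Build a run-independent identifier for one analyzer result.

    The line number is deliberately left out so that unrelated edits
    elsewhere in the file do not change the identifier.
    """
    # Assumption of this sketch: only the file's base name is used, so
    # the id survives a checkout that lives in a different directory.
    key = "\x00".join([os.path.basename(path), function, issue_type])
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


def triage(current_ids, known_ids, false_positive_ids):
    """Split the ids of the current run into three buckets."""
    current = set(current_ids)
    false_positives = current & set(false_positive_ids)
    known = (current & set(known_ids)) - false_positives
    return {
        "false_positive": false_positives,
        "known": known,
        "new": current - known - false_positives,
    }
```

One limitation: two issues of the same type in the same function get
the same id. Adding an approximate line number, or an ordinal within
the function, would be one way to tell them apart.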
BTW, some of this meta-data can easily be grepped right out of the
HTML file. This is exactly what scan-build does to build the
index.html file. For example:
$ grep BUG report-wEXcKk.html
<!-- BUGPATHLENGTH 2 -->
<!-- BUGLINE 15 -->
<!-- BUGFILE /Volumes/Data/Users/kremenek/Desktop/MyClass.m -->
<!-- BUGDESC Memory Leak -->
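The same fields can be pulled out programmatically. The following is
a minimal Python sketch, assuming the comments keep the
"<!-- KEY value -->" shape shown above; read_bug_metadata is a
hypothetical helper, not something scan-build provides.

```python
import re

# Matches one comment such as "<!-- BUGDESC Memory Leak -->".
_BUG_COMMENT = re.compile(r"<!--\s*(BUG[A-Z]+)\s+(.*?)\s*-->")


def read_bug_metadata(html):
    """Return the BUG* comment fields of one report as a dict."""
    # Values are kept as strings; callers convert BUGLINE etc. as needed.
    return {key: value for key, value in _BUG_COMMENT.findall(html)}
```

Calling it on the text of a report-XXXXX.html file would give a dict
keyed by BUGFILE, BUGLINE, BUGDESC, and so on.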
We can easily include other meta-data, such as the function/method
name where the bug occurs, a cryptographic hash of the source file
(or function) that contained the bug, etc.
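To make the hash idea concrete, here is a small Python sketch. The
name source_fingerprint and the whitespace normalization are
assumptions of this example; the analyzer does not currently emit
such a field.

```python
import hashlib


def source_fingerprint(source_text):
    """Hash the text of a file or function body.

    Assumption of this sketch: runs of whitespace are collapsed first,
    so that re-indenting the code does not change the fingerprint.
    """
    normalized = " ".join(source_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()
```

An unchanged fingerprint across two runs suggests that a previously
triaged result in that function still applies. Note that collapsing
whitespace also hides whitespace changes inside string literals, so a
stricter tool might hash the raw text instead.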
Aside from your own automatic testing tools, ideally, we want the HTML
output that the tool (scan-build) produces to allow users to triage
and navigate bugs across runs. This is an important feature, but not
immediately high on the priority list. Much of the heavy lifting
would probably be done in scan-build (which is currently written in
Perl) where the summary HTML pages are generated.
Anyone with Perl and HTML knowledge is welcome to provide patches to
improve this aspect of the system; doing so requires essentially no
knowledge of how the analyzer works. (Any additional meta-data in the
report-XXXXX.html files that would be useful for building such
features into scan-build could be added on demand.)
Moreover, scan-build can be completely rewritten to provide a more
advanced system for triaging bugs if anyone is interested in
undertaking such a project.