[cfe-dev] Clang analyzer Google Summer of Code ideas/proposals

Zhongxing Xu xuzhongxing at gmail.com
Thu Apr 1 01:39:53 PDT 2010


Hi Samuel,

I haven't thought through what the bug database or extension scripts
would look like, so I can't comment on those proposals.

Personally, though, I would prefer improvements to the current core
analysis engine: for example, improving the inter-procedural analysis,
adding an integer overflow detector, adding a more powerful constraint
manager, or adding C++ support.

2010/3/26 Samuel Harrington <samuel.harrington at mines.sdsmt.edu>

> Hello,
>
> I am interested in doing a project with Clang in the upcoming Google
> Summer of Code. I am currently a sophomore at the South Dakota School
> of Mines and Technology, and I have some C++, Perl, and JavaScript
> programming experience. I have been interested in Clang and LLVM for a
> while, and I've looked through some of the code before. I am most
> interested in the analyzer component though.
>
>
> I have two possible project ideas I am interested in:
>
>
> A) Bug database
>
> Create a tool to store bugs and track changes over time.
>
> This tool would use the XML analyzer output and the CIndex library to
> correlate bugs over multiple runs. The tool would provide, at a
> minimum, a diff-like output given a pair of runs. Ideally, this would
> create and update a database with all the runs, and statuses for all
> the bugs (uninspected, false positive, verified, fixed). The tool
> would provide reports with chosen subsets of the bugs and annotations
> such as the first run in which a bug appeared and its current status.
> The reports could be HTML output, reusing the existing infrastructure,
> or be viewable in a GUI application.
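
To make the "diff-like output given a pair of runs" concrete, here is a minimal sketch in Python. The Bug record and its fields (file, function, bug_type) are assumptions chosen for illustration; they are not the analyzer's actual XML schema, and a real tool would need a more robust way to correlate reports across runs.

```python
from typing import NamedTuple


class Bug(NamedTuple):
    # Hypothetical identity key for a report. Keying on the enclosing
    # function instead of a line number keeps a bug matched when
    # unrelated edits shift lines around.
    file: str
    function: str
    bug_type: str


def diff_runs(old, new):
    """Return (fixed, introduced, persisting) sets of bugs for two runs."""
    old_set, new_set = set(old), set(new)
    return old_set - new_set, new_set - old_set, old_set & new_set


run1 = [Bug("a.c", "f", "Null dereference"), Bug("b.c", "g", "Dead store")]
run2 = [Bug("b.c", "g", "Dead store"), Bug("c.c", "h", "Memory leak")]

fixed, introduced, persisting = diff_runs(run1, run2)
for bug in sorted(fixed):
    print("-", bug.file, bug.function, bug.bug_type)
for bug in sorted(introduced):
    print("+", bug.file, bug.function, bug.bug_type)
```

Bugs only in the older run are reported as fixed, and bugs only in the newer run as introduced, which is what a regression finder or fix checker would need.
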
>
> The database could be XML, SQLite, or some plain-text format. I am
> unsure whether this tool should be integrated into the clang binary,
> be a separate executable, or even use a scripting language like
> Python. However it is implemented, it would be integrated into
> scan-build/scan-view.
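
If SQLite were chosen, the store could be quite small. The following is only a sketch using Python's standard sqlite3 module; the table layout, the string bug key, and the function names are all hypothetical, and the status values are the four listed above.

```python
import sqlite3

# The four statuses named in the proposal.
STATUSES = ("uninspected", "false positive", "verified", "fixed")


def open_db(path=":memory:"):
    """Open (or create) the bug database; the schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS bugs (
               key       TEXT PRIMARY KEY,
               first_run INTEGER NOT NULL,
               last_run  INTEGER NOT NULL,
               status    TEXT NOT NULL DEFAULT 'uninspected'
           )"""
    )
    return conn


def record_run(conn, run_id, keys):
    """Record the bugs seen in one run, keeping first_run for known bugs."""
    for key in keys:
        conn.execute(
            "INSERT OR IGNORE INTO bugs (key, first_run, last_run) "
            "VALUES (?, ?, ?)",
            (key, run_id, run_id),
        )
        conn.execute(
            "UPDATE bugs SET last_run = ? WHERE key = ?", (run_id, key)
        )
    conn.commit()


def set_status(conn, key, status):
    """Annotate a bug with one of the known statuses."""
    if status not in STATUSES:
        raise ValueError("unknown status: " + status)
    conn.execute("UPDATE bugs SET status = ? WHERE key = ?", (status, key))
    conn.commit()
```

A report for a chosen subset is then an ordinary query, for example selecting all rows whose status is still 'uninspected'.
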
>
> I am interested in this project because it would make using the
> analyzer easier for larger projects. The diff output could be used as
> a regression finder or fix checker. The database would allow users to
> keep track of bugs better, and to provide statistics of bugs over
> time.
>
>
> B) User-made checkers
>
> This would provide some sort of easy extension mechanism to the
> analyzer to allow simple domain-specific checks. I have a couple of
> ideas of how this would look.
>
>
> 1) The first would be to read and use mygcc [1] rules to detect bugs.
> I believe this would only provide simple flow-sensitive analysis, but
> it looks useful nonetheless. This would require writing a pattern
> matcher that matches AST nodes against a parsed text expression.
>
>
> 2) The second would be an interface to the analysis engines from a
> scripting language, perhaps Python. This would be more complicated to
> use than mygcc, but likely more useful. For example, a check to make
> sure open has a third parameter if the O_CREAT flag is present is very
> simple given a scripting language, but impossible using mygcc rules
> [2].
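
As an illustration of how short the open check could be in a scripting language, here is a sketch in Python. The Call record is a hypothetical view of a call site that such a binding might expose; it is not an existing Clang or analyzer API, and only os.O_CREAT comes from the standard library.

```python
import os
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Call:
    # Hypothetical call-site record: callee name, the constant value of
    # each argument (None when the value is not known), and source line.
    callee: str
    args: List[Optional[int]]
    line: int


def check_open(call):
    """Return a warning if open() passes O_CREAT without a mode argument."""
    if call.callee != "open" or len(call.args) < 2:
        return None
    flags = call.args[1]
    if flags is None:
        return None  # flags are not a known constant, so stay silent
    if flags & os.O_CREAT and len(call.args) < 3:
        return "line %d: open() with O_CREAT needs a third 'mode' argument" % call.line
    return None


# Path arguments are left as None since only the flags matter here.
bad = Call("open", [None, os.O_WRONLY | os.O_CREAT], 12)
good = Call("open", [None, os.O_WRONLY | os.O_CREAT, 0o644], 20)
print(check_open(bad))
print(check_open(good))
```

The check stays silent when the flags are unknown, which trades missed reports for fewer false positives; a real binding would have to decide how such values are represented.
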
>
> If I were to do this project, I would likely try to do the second idea
> first, and if time permits, write a mygcc matcher in the scripting
> language. Implementing mygcc rules in the scripting language would
> provide a good test of the interface completeness.
>
> I am interested in this because the clang analyzer could be easily
> extended with domain-specific checks. For example, specialized locking
> rules could be checked using mygcc rules. A trickier example [3] would
> be to make sure an llvm::StringRef is not assigned a std::string that
> goes out of scope before it. This would be possible using a scripting
> language binding, and easier than modifying the Clang source. These
> types of checks are already being implemented in Clang, but it is
> infeasible to embed specialized checks for every arbitrary project.
> This project would be a way around that problem.
>
>
> 3) The closest tool I have seen to #2 is Dehydra [4], which also has a
> goal of allowing user-defined bug-finding scripts. A complicating
> factor is that its scripting language is JavaScript, and it may be
> infeasible to provide a compatible interface. Nevertheless, I am
> listing replication of its interface here as a third possibility.
>
>
> Sorry for the incredibly long email. :)
>
> Is either of these proposals interesting? Any criticisms or ideas? All
> comments and questions would be appreciated.
>
> Thanks,
> - Sam
>
>
> [1] http://mygcc.free.fr/
> Note: I forget how I found this. I believe it was through an email on
> this list, but I can't find it.
>
> [2] example taken from Clang source
> lib/Checker/UnixAPIChecker.cpp
>
> [3] example again from an existing Clang check
> lib/Checker/LLVMConventionsChecker.cpp, line 133
>
> [4] https://developer.mozilla.org/en/Dehydra