[cfe-dev] GSoC proposal: Static Analyzer support for dynamically loadable checkers

Martin Milata b42-ml at srck.net
Fri Apr 1 08:47:02 PDT 2011


Hello,
below is my Google Summer of Code project proposal. I will be grateful
for any comments.

Martin Milata

* * *

Static Analyzer support for dynamically loadable checkers
---------------------------------------------------------

== Synopsis ==

The Clang Static Analyzer allows programmers to run various source code checks
on their programs in order to find potential bugs. The code implementing a
particular check is called a checker, and checkers are currently compiled
directly into the static analyzer. This project aims to make it possible to
load checkers dynamically and thus facilitate the use of project-specific
checkers that can be built outside the Clang source tree.

== The proposal ==

Static analysis is a convenient method of identifying problematic places in
source code. Because it is more or less automatic, it is slowly gaining
popularity. A static analyzer can only find the defects it was programmed to
find, and this is reflected in the architecture of the Clang Static Analyzer,
which consists of an analyzer engine and a number of so-called checkers that
implement various source code checks on top of the engine. Some of them are
general and can be used to analyze any program (like core.DivZero, which looks
for divisions by zero), while others are project-specific (such as
llvm.Conventions, which checks coding conventions in the LLVM codebase).

Currently, all the checkers are compiled directly into the analyzer. The aim
of the project is to allow loading checkers dynamically, in a manner similar
to how the "opt" tool allows dynamic loading of optimization passes. This will
make the development and use of new checkers easier, especially those that are
project-specific and therefore unsuitable for inclusion in the Clang codebase.

Deliverables:
* Modification of the checker registration system to support dynamic loading
  of checkers. Part of the mechanism for dynamic loading of optimization
  passes should probably be reused.
* Modification of the driver to accept appropriate command line options.
* Modification of ccc-analyzer/scan-build scripts to support the relevant
  options.
* Documentation describing how a checker can be built outside of the Clang
  tree and how it can be used.
* Tests where possible. Special care should be taken to ensure that the
  dynamic loading works on all supported platforms, as this feature is quite
  system-specific.
* Eventually, a rewrite of some of the checkers so that they are built
  externally.
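
As an illustration of the first deliverable, here is a minimal sketch of a
registration interface that an externally built library could populate. All
names in it (Checker, CheckerRegistry, registerPluginCheckers,
myproject.Naming) are hypothetical and chosen for this example only; they are
not the analyzer's actual API, and the real checker interface is considerably
richer.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified checker interface (assumption).
struct Checker {
  virtual ~Checker() = default;
  virtual std::string name() const = 0;
};

// Hypothetical registry owned by the analyzer. Built-in checkers and
// checkers coming from loaded libraries would register factories here.
class CheckerRegistry {
public:
  using Factory = std::function<std::unique_ptr<Checker>()>;

  // Registers a factory under the given name.
  // Returns false if the name is already taken.
  bool addChecker(const std::string &name, Factory factory) {
    assert(factory && "factory must be callable");
    return factories_.emplace(name, std::move(factory)).second;
  }

  // Instantiates the named checker, or returns nullptr if it is unknown.
  std::unique_ptr<Checker> create(const std::string &name) const {
    auto it = factories_.find(name);
    return it == factories_.end() ? nullptr : it->second();
  }

  // Lists the registered checker names in sorted order.
  std::vector<std::string> names() const {
    std::vector<std::string> out;
    for (const auto &entry : factories_)
      out.push_back(entry.first);
    return out;
  }

private:
  std::map<std::string, Factory> factories_;
};

// --- Code that would live in the externally built library ---

// Hypothetical project-specific checker.
struct NamingConventionChecker : Checker {
  std::string name() const override { return "myproject.Naming"; }
};

// Hypothetical entry point. C linkage keeps the symbol name unmangled so
// that the analyzer could look it up by name after loading the library.
extern "C" void registerPluginCheckers(CheckerRegistry &registry) {
  registry.addChecker("myproject.Naming", [] {
    return std::unique_ptr<Checker>(new NamingConventionChecker());
  });
}
```

Under this design the analyzer would load the library, resolve the entry point
by name (for example with dlopen/dlsym on POSIX systems), and call it with its
own registry before enabling the checkers requested on the command line. The
loading step itself is omitted here because it is the system-specific part.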

== Why this project is interesting to me ==

I think the Clang Static Analyzer may become a tool that many (especially
open-source) programmers can benefit from. I'd like to see static analysis
used more often in the open-source world, and I think this project can
contribute a little to making that happen.

My other reason is that I'm considering writing my master's thesis in the
field of static analysis after this summer. Being familiar with the codebase
will allow me to eventually build upon the existing parts of the analyzer.

== How the project will be useful for LLVM ==

The ability to load checkers at runtime will make writing and using new
checkers easier, because Clang won't have to be recompiled in order to use
them. This will especially benefit users outside LLVM who would like to write
project-specific analyses that do not make sense to include in Clang's
repository. This feature may eventually help wider adoption of the Clang
Static Analyzer.

== My prior knowledge of compilers/LLVM ==

I have:
* taken a basic compiler course and a basic formal methods course.
* read several papers on static analysis.
* skimmed through most of the clang/llvm documentation.
* played around with the Static Analyzer and written a blog post about it
  (in Czech) [1].

== My academic and industry experience ==

I have a bachelor's degree in Informatics from Masaryk University in Brno,
Czech Republic, and am currently enrolled in the master's programme at the
same university [2]. The title of my bachelor's thesis was "SMT-Based
Verification of Finite-State Systems" (text in Czech, practical part written
in Haskell).

Some more notable industry/open-source projects I have worked on:
* internet gateway management system written for OrbisNet ISP (2006; PHP,
  Perl, Shell)
* tcjc, an unfinished command-line Jabber client [3] (unfortunately the only
  C++ project I can show; abandoned in 2007, and I believe my C++ skills have
  improved since)
* "gaming core for mobile entertainment" for Warhowski, Inc. (2009; small to
  medium sized C++ project; an NDA prevents me from disclosing more details)
* GPXsplit, small application for splitting GPX files [4] (2009, Haskell)
* song queue support for the MOC audio player [5] (summer 2009, C)

== Contact information ==

Full contact information will be provided in the application or upon request.
My name is Martin Milata and you can contact me via this email, via jabber as
b42 at njs.netlab.cz or via IRC as b42 on IRCnet and freenode.


[1] http://programatori.irc.cz/b42/clang-a-staticka-analyza-kodu
[2] http://is.muni.cz/person/256615
[3] http://git.b42.cz/tcjc.git/ (gitweb)
[4] http://b42.cz/gpxsplit/
[5] http://moc.daper.net/node/484


