[cfe-dev] Static Analyzer Rocks Hard
Ted Kremenek
kremenek at apple.com
Wed Jun 25 09:49:06 PDT 2008
On Jun 24, 2008, at 12:04 AM, Holger Schurig wrote:
>> The more complete way to catch these bugs (and potentially
>> verify their absence) is to flag dangerous uses of untrusted
>> data: using it as a size parameter to malloc, using it as an
>> array index, and so on.
>
> It would be cool if, e.g. at an checker-level, a variable or
> memory object could have something like the perl "taint" bit.
>
> http://www.webreference.com/programming/perl/taint/
>
> In perl, you untaint via a regexp. In checker, you might untaint
> by checking a variable, e.g. for upper/lower bounds (signed) or
> upper bounds only (unsigned variable).
>
> If you then use the tainted variable to system function (how do
> we define this?), you could get a tainted warning from the
> checker.
This indeed would be a useful check, and it is something I would like
to have implemented one day as part of the static analyzer.
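As a rough sketch of what such a check would look for, consider the two lookups below. The names (`lookup_unchecked`, `lookup_checked`, `TABLE_LEN`) are made up for illustration, and the comments describe what a hypothetical taint check might report, not anything the analyzer does today.

```c
#include <stddef.h>

#define TABLE_LEN 8

static const int table[TABLE_LEN] = {10, 11, 12, 13, 14, 15, 16, 17};

/* 'idx' is assumed to come from an untrusted source (network, file, argv).
 * A taint check would want to warn here: the value reaches an array
 * subscript without any bounds test. Only safe if idx < TABLE_LEN. */
int lookup_unchecked(size_t idx)
{
    return table[idx];
}

/* Same lookup, but the upper-bound test "untaints" idx on the path that
 * reaches the subscript. idx is unsigned, so an upper bound is enough. */
int lookup_checked(size_t idx, int fallback)
{
    if (idx >= TABLE_LEN)
        return fallback;
    return table[idx];
}
```

The point of the second version is that the comparison itself acts as the sanitizer, much as the regexp does in Perl.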
There has been a variety of work on doing taint analysis on C
programs, and there are different kinds of taint properties to check.
The kind of checking you mentioned has been done before for C (in a
research tool) and was demonstrated to be very useful:
Using Programmer-Written Compiler Extensions to Catch Security Holes
http://www.stanford.edu/~engler/sp-ieee-02.pdf
Another kind of "taint property" is tracking the use of kernel/user
pointers in kernel space; this is more of an address-space qualifier
problem, but it can also be viewed as a form of taint propagation.
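For illustration, the user/kernel pointer case might look roughly like this. `USER_PTR` and `copy_in` are hypothetical names; the macro is modeled on the way the kernel's `__user` marker is visible only to a checker (sparse defines `__CHECKER__`) and expands to nothing in a normal build.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical marker for "this pointer refers to user-space memory".
 * Only a checker build sees the attribute; a normal build sees nothing. */
#ifdef __CHECKER__
# define USER_PTR __attribute__((noderef, address_space(1)))
#else
# define USER_PTR
#endif

/* Stand-in for a copy_from_user-style routine: the one place allowed to
 * read through a user pointer. In this sketch it is a bounded memcpy.
 * Returns the number of bytes actually copied. */
size_t copy_in(void *dst, size_t dst_len, const void USER_PTR *src, size_t n)
{
    if (n > dst_len)
        n = dst_len;
    /* A checker build would likely want an explicit forced cast here. */
    memcpy(dst, (const void *)src, n);
    return n;
}
```

Any direct dereference of a `USER_PTR` pointer outside such a routine is what the checker would flag.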
There have been a variety of proposals for how to define sources of
tainted data, and which sinks (functions) cannot take tainted data.
One standard approach is to use annotations on function prototypes,
which we could do in the form of attributes. This approach has
actually been used in the Linux kernel to annotate user vs. kernel
pointers. Of course simply having an external list of well-known
sources of tainted data that could be fed to the static analyzer would
also be useful.
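A minimal sketch of what prototype annotations could look like, assuming clang's generic `annotate` attribute as the carrier. The macro names, the annotation strings, and the `TAINT_CHECKER` switch are all invented for this example; no such attributes are defined by the analyzer.

```c
#include <stdlib.h>

/* Hypothetical markers. Only a checker build (TAINT_CHECKER defined)
 * sees the attributes; a normal build compiles them away. */
#ifdef TAINT_CHECKER
# define TAINT_SOURCE __attribute__((annotate("taint_source")))
# define TAINT_SINK   __attribute__((annotate("taint_sink")))
#else
# define TAINT_SOURCE
# define TAINT_SINK
#endif

#define MAX_PAYLOAD 4096

/* Return value is attacker-controlled: parsed from a header field. */
TAINT_SOURCE size_t read_length_field(const char *hdr);

/* The size argument must not be tainted when it reaches this function. */
TAINT_SINK void *alloc_buffer(size_t n);

size_t read_length_field(const char *hdr)
{
    return (size_t)strtoul(hdr, NULL, 10);
}

void *alloc_buffer(size_t n)
{
    return malloc(n);
}

/* The range check sits between source and sink; without it, a checker
 * that understood the markers would warn at the alloc_buffer() call. */
void *alloc_for_packet(const char *hdr)
{
    size_t n = read_length_field(hdr);
    if (n == 0 || n > MAX_PAYLOAD)
        return NULL;
    return alloc_buffer(n);
}
```

An external list of sources and sinks would carry the same information without touching the headers, which matters for library functions one cannot annotate.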
Eventually, once a framework for doing inter-procedural analysis is in
place in clang, we could potentially relax taint attributes across
procedure boundaries. A good example of this is MECA (another
research tool):
MECA: an Extensible, Expressive System and Language for Statically
Checking Security Properties
http://www.stanford.edu/~engler/ccs03-meca.pdf
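To show why the inter-procedural part matters, here is a small sketch in which only one function would carry a source annotation and the tainted value reaches a subscript through an unannotated helper. All names are hypothetical, and the comments describe what an analysis would need to infer, not what any existing tool does.

```c
#include <stdlib.h>

#define NSLOTS 16

static int slots[NSLOTS];

/* Assume this is the only function marked as a taint source. */
long parse_field(const char *text)
{
    return strtol(text, NULL, 10);
}

/* No annotation here. An inter-procedural analysis would have to infer
 * that the result is tainted because it is derived from parse_field(). */
long byte_offset_to_slot(const char *text)
{
    return parse_field(text) / 4;
}

/* The value arrives through two calls. It is signed, so both a lower
 * and an upper bound are needed before it is used as a subscript.
 * Returns 0 on success, -1 if the index is rejected. */
int store_at(const char *text, int value)
{
    long i = byte_offset_to_slot(text);
    if (i < 0 || i >= NSLOTS)
        return -1;
    slots[i] = value;
    return 0;
}
```

With purely local analysis, `byte_offset_to_slot` would need its own annotation; propagating the property across the call is what removes that burden.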
There are of course many examples of other systems that do taint
propagation (with potentially more analysis sophistication), but these
are a couple of good examples.