[cfe-dev] Clang GenericTaintChecker limitations
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Wed Aug 10 08:47:16 PDT 2016
The taint analysis we have here is not perfect, but it's pretty sane.
The analyzer assigns symbols to memory regions to represent their values
at a given moment in time, and passes those symbols around through
assignments.
Then, some symbols carry taint, and symbolic expressions composed with
them automatically inherit the taint.
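A minimal sketch of what that inheritance looks like in practice. The function names are made up for illustration, and I'm assuming getchar() is one of the taint sources the checker knows about:

```c
#include <stdio.h>

/* Hypothetical helper: builds a new value out of its argument.
 * If 'c' is represented by a tainted symbol, the symbolic expression
 * for (c * 2 + 1) is composed from that symbol, so it is expected to
 * be tainted as well, with no checker code involved. */
int scale_input(int c) {
    int doubled = c * 2;
    return doubled + 1;
}

/* Hypothetical caller: the value returned by getchar() is assumed to
 * be a taint source, so the symbol bound to 'c' carries taint. */
int read_and_scale(void) {
    int c = getchar();
    return scale_input(c);
}
```

Nothing special is needed in scale_input() itself; the taint is a property of the symbol, and the expression simply contains it.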
The GenericTaintChecker performs propagation of taint through functions.
For example, if you put pointers to tainted values into strcat(), then
the symbol that represents the value behind the returned pointer would
also be tainted, but unlike propagation from atomic symbols to
expressions, this is not automagical - a checker needs to do the work to
support each specific API.
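To illustrate the strcat() case, here is a small sketch. The helper names are hypothetical, and I'm assuming the checker treats data read by fgets() from stdin as tainted:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends 'src' to 'dst' if it fits in 'cap'
 * bytes (including the terminating NUL), otherwise returns NULL.
 * strcat() is a library call with no body to inline, so the taint on
 * the result only appears because the checker models this API. */
char *append_suffix(char *dst, size_t cap, const char *src) {
    if (strlen(dst) + strlen(src) + 1 > cap)
        return NULL;
    return strcat(dst, src);
}

/* Hypothetical caller: 'line' is filled from stdin (assumed taint
 * source), so the contents behind the returned pointer should end up
 * tainted too. */
char *tag_line(char *dst, size_t cap) {
    char line[64];
    if (fgets(line, sizeof line, stdin) == NULL)
        return NULL;
    return append_suffix(dst, cap, line);
}
```

If you replace strcat() with a library function the checker has no rule for, I'd expect the taint to be lost at that call.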
The memory model currently assumes that there is no aliasing between
unknown pointers, which is another limitation. However, inter-procedural
analysis works through inlining: when the function is inlined, any
aliasing between its actual arguments is correctly taken into account
(e.g. the call `f(&x, &x)` is modeled correctly, even though f assumes
that its arguments do not alias when analyzed separately).
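A sketch of the f(&x, &x) situation, with a made-up body for f so the difference is observable:

```c
/* When f is analyzed on its own (as a top-level function), 'a' and
 * 'b' are unknown pointers and are assumed to point to different
 * memory, so the analyzer would conclude that *a is still 1 here. */
int f(int *a, int *b) {
    *a = 1;
    *b = 2;
    return *a; /* 1 if a and b are distinct, 2 if they alias */
}

/* When f is inlined at this call site, both arguments are known to
 * refer to the region of 'x', so the second store overwrites the
 * first and the result is 2. */
int call_aliased(void) {
    int x = 0;
    return f(&x, &x);
}

/* Distinct regions: the result is 1. */
int call_distinct(void) {
    int x = 0, y = 0;
    return f(&x, &y);
}
```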
One of the limitations that might bite you is the lack of support for
floating-point values - the analyzer doesn't yet symbolicate them, so
they cannot be tainted.
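A hypothetical example of where this could hurt, assuming scanf() is treated as a taint source for the integer it writes:

```c
#include <stdio.h>

/* Hypothetical helper: converts an integer percentage to a fraction.
 * The result is a double, which the analyzer does not represent with
 * a symbol, so no taint can be attached to it. */
double to_fraction(int percent) {
    return percent / 100.0;
}

/* Hypothetical caller: 'percent' may be tainted after scanf(), but
 * the taint would not be expected to survive the conversion above. */
double read_fraction(void) {
    int percent = 0;
    if (scanf("%d", &percent) != 1)
        return 0.0;
    return to_fraction(percent);
}
```

So a sink that receives the returned double would probably not be reported, even though the value is attacker-controlled.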
One thing you'd probably need to understand is how structures are
modeled - e.g. there's a symbol for the structure or array and a symbol
for each of its fields or elements, and there are multiple methods used
for representing this relationship, depending on circumstances.
I'm not aware of any other powerful open-source static analysis tools
for C/C++, but you might have a look at KLEE, which is not exactly
static, but also implements symbolic execution.
You may want to check an earlier discussion:
On 8/10/16 4:27 PM, Divya Muthukumaran via cfe-dev wrote:
> Hi All,
> I am looking for an open source static taint analysis tool that I can
> run on some applications to reason about security properties -- just
> to check if a tainted value can flow to some function parameters etc.
> The programs I want to try this on are around 10-20K lines of C code.
> I was thinking of using Clang's GenericTaintChecker (and just
> modifying the taint sources) for this purpose. I'd like to know if
> there are any limitations to this analysis that I should be aware of.
> I know that the interprocedural analysis doesn't work across
> translation units, but I've managed to merge my source files using the
> cilly tool. I was mainly wondering about the precision of the taint
> analysis (what sort of pointer/alias analysis the IPA uses etc). If
> you could point me to any documentation that discusses the memory
> model, that would be great.
> Is the clang taint checker considered the state-of-the-art in
> open-source taint checking tools or is there something that is
> considered better (more precise)?
> Divya Muthukumaran
> Research Associate
> Department of Computing
> Imperial College London