[cfe-dev] General query : Alpha security checkers and taint analysis

Ashwin Ganesh via cfe-dev cfe-dev at lists.llvm.org
Tue Apr 5 07:08:25 PDT 2016


Great, thanks for the detailed explanation. I started out directly with
this tutorial, http://llvm.org/devmtg/2012-11/Zaks-Rose-Checker24Hours.pdf, and
also read the basics in the Clang Static Analyzer developer manual. But since
I don't have any prior knowledge of Clang, do I need to go through any
other tutorials to fully understand the code of the various experimental
checkers and to write one of my own?

Another specific question: suppose I have a statement
var = read_value(). Can I make read_value a taint source directly by
adding a line to the addSourcesPost function of GenericTaintChecker? And
after changing the file, do I need to run 'make clang' inside the build
directory, or is there a simpler way to pick up the changes? The former
takes far too much time.

Regards,
Ashwin

On Tue, Apr 5, 2016 at 3:26 PM, Artem Dergachev via cfe-dev <
cfe-dev at lists.llvm.org> wrote:

> > For example,
> >   x = getchar();
> >   char y = x + 1;
> > Which part of the code taints y?
>
> Propagation of taint through the symbol hierarchy is done by the core
> automatically. In fact, no propagation is done in any continuous manner -
> the core just looks at the symbol that you're interested in and finds
> tainted sub-symbols inside it. This mechanism is implemented in the
> ProgramState::isTainted() methods, and it relies on the assumption that
> symbols on which the taint originally appears are always atomic (of
> SymbolData class).
>
> In your particular example, the following happens:
> 1. getchar() returns a SymbolConjured - an atomic symbol that represents
> the return value. Technically, it returns an SVal of nonloc::SymbolVal
> class, but it is a simple wrapper around the symbol, so there isn't much
> difference. If you dump() the program state, you'd see it as something like
> "conj_$0<int>".
> 2. The conjured symbol is stored in the memory region (VarRegion) that
> represents AST variable 'x' in the analyzer's memory model. If you dump()
> the program state, you'd see a binding in the Store: "(x, 0, direct):
> conj_$0<int>".
> 3. In order to compute 'x + 1', the conjured symbol "conj_$0<int>" is
> loaded as the (r)value of the expression 'x'.
> 4. A SymIntExpr - a symbolic expression 'conj_$0<int> + 1' is stored in
> the memory region of variable 'y'.
> 5. Suppose you then ask whether the value of 'y' is tainted. The symbol
> 'conj_$0<int> + 1' is taken to represent the value of 'y'.
> 6. In order to see if the value is tainted,
> ProgramState::isTainted(SymbolRef Sym, ...) iterates over all sub-symbols
> of the symbolic expression.
> 7. It finds 'conj_$0<int>' as the only such sub-symbol ('1' is not a
> symbol).
> 8. Seeing that 'conj_$0<int>' was marked as tainted by the
> TaintPropagation checker, it decides that the whole symbol is tainted.
> Therefore it reports that the value of the expression 'y' is tainted.
>
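
The lookup described in the quoted steps can be illustrated with a small
standalone model. This is only a sketch of the idea, not the analyzer's
code: Sym, makeAtomic, makeSymInt, TaintSet and runExample are hypothetical
names invented for the illustration, and the real implementation lives in
the ProgramState::isTainted() methods mentioned above. The assumption carried
over from the explanation is that taint is recorded only on atomic symbols
and that a query walks the sub-symbols of an expression.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Toy model only: these types imitate the idea of the analyzer's symbol
// hierarchy and are NOT the real clang classes.
struct Sym {
  enum Kind { Atomic, SymInt };
  Kind kind = Atomic;              // Atomic ~ SymbolData, SymInt ~ SymIntExpr
  std::string name;                // set for atomic symbols, e.g. "conj_$0<int>"
  std::shared_ptr<const Sym> lhs;  // symbolic operand of a SymInt expression
  long rhs = 0;                    // integer operand; it is not a symbol
};
using SymRef = std::shared_ptr<const Sym>;

SymRef makeAtomic(const std::string &name) {
  auto s = std::make_shared<Sym>();
  s->kind = Sym::Atomic;
  s->name = name;
  return s;
}

SymRef makeSymInt(SymRef lhs, long rhs) {
  auto s = std::make_shared<Sym>();
  s->kind = Sym::SymInt;
  s->lhs = std::move(lhs);
  s->rhs = rhs;
  return s;
}

// Taint marks are kept only for atomic symbols (here: by name).
using TaintSet = std::set<std::string>;

// No taint is pushed forward when expressions are built. A query simply
// walks down to the atomic sub-symbols and checks whether one is marked.
bool isTainted(const SymRef &sym, const TaintSet &tainted) {
  if (!sym)
    return false;
  if (sym->kind == Sym::Atomic)
    return tainted.count(sym->name) > 0;
  return isTainted(sym->lhs, tainted);
}

// Mirrors the example: x = getchar(); char y = x + 1;
bool runExample() {
  TaintSet tainted;
  SymRef x = makeAtomic("conj_$0<int>");  // return value of getchar()
  tainted.insert(x->name);                // the checker marks it as tainted
  SymRef y = makeSymInt(x, 1);            // 'conj_$0<int> + 1' bound to y
  SymRef z = makeSymInt(makeAtomic("conj_$1<int>"), 1);  // unrelated value
  return isTainted(y, tainted) && !isTainted(z, tainted);
}
```

Calling runExample() from a main function should return true in this model:
y is reported as tainted only because the query finds the marked atomic
symbol inside it, while z contains no marked symbol.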