[cfe-dev] General query : Alpha security checkers and taint analysis
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Tue Apr 5 02:56:48 PDT 2016
> For example,
> x = getchar();
> char y = x + 1;
> Which part of the code taints y?
Propagation of taint through the symbol hierarchy is done by the core
automatically. In fact, no propagation is done in any continuous manner
- the core just looks at the symbol that you're interested in and finds
tainted sub-symbols inside it. This mechanism is implemented in the
ProgramState::isTainted() methods, and it relies on the assumption that
symbols on which the taint originally appears are always atomic (of
In your particular example, the following happens:
1. getchar() returns a SymbolConjured - an atomic symbol that represents
the return value. Technically, it returns an SVal of nonloc::SymbolVal
class, but it is a simple wrapper around the symbol, so there isn't much
difference. If you dump() the program state, you'd see it as something
2. The conjured symbol is stored in the memory region (VarRegion) that
represents AST variable 'x' in the analyzer's memory model. If you
dump() the program state, you'd see a binding in the Store: "(x, 0,
3. In order to compute 'x + 1', the conjured symbol "conj_$0<int>" is
loaded as (r)value of expression 'x'.
4. A SymIntExpr - a symbolic expression 'conj_$0<int> + 1' is stored in
the memory region of variable 'y'.
5. Suppose then you ask if value of 'y' is tainted. Then symbol
'conj_$0<int> + 1' is taken to represent the value of 'y'.
6. In order to see if the value is tainted,
ProgramState::isTainted(SymbolRef Sym, ...) iterates over all
sub-symbols of the symbolic expression.
7. It finds 'conj_$0<int>' as one of such sub-symbols (and only; '1' is
not a symbol).
8. Seeing that 'conj_$0<int>' was marked as tainted by the
TaintPropagation checker, it decides that the whole symbol is tainted.
Therefore it reports that value of the expression 'y' is tainted.
More information about the cfe-dev