[cfe-dev] Adding taint sources to GenericTaintChecker
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Mon Apr 11 05:32:50 PDT 2016
> int readval()
> {
> return 10;
> }
>
> int a,b;
> a = readval() // warning : tainted
> b = a+1 //warning : tainted
In your example, readval() returns 10. Our analysis is inter-procedural,
so it knows such things.
10 is a concrete value. A concrete value cannot be tainted - an attacker
cannot forge 10 to become 20, or something like that. It's just "the"
10, and all 10's are the same. Something is tainted if it's a user input
or is otherwise known to be able to take completely arbitrary values; 10
is not an input from the user, and it's fully under our control. So the
analyzer knows for sure that readval() returns a value that cannot be
tainted, and the checker's request to taint it gets ignored - this is
because the analyzer is unable to obtain a symbol from the value
provided by the checker, since the value is concrete.
In fact, only *symbols* may be "truly" tainted. To be exact, addTaint()
works with SymExpr's (SymbolRef's) and, additionally, SymbolicRegion's
(which are essentially regions pointed to by SymExpr pointers).
isTainted() works on SymExpr's, SymbolicRegion's and their sub-regions,
and additionally on SVal's of class nonloc::SymbolVal,
loc::MemRegionVal, nonloc::LocAsInteger whenever they contain a SymExpr
or a SymbolicRegion or its sub-region.
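To make the distinction concrete, here is a toy model in plain C. It is
only a sketch of the idea, not the analyzer's code: toy_value, toy_state,
toy_add_taint() and toy_is_tainted() are hypothetical names invented for
this illustration, and regions and compound symbolic expressions are left
out entirely. The point is just that a taint request on a concrete value
has nothing to attach to.

```c
#include <stdbool.h>

/* Toy stand-ins invented for illustration; NOT the analyzer's real classes. */
enum toy_kind { TOY_CONCRETE, TOY_SYMBOL };

struct toy_value {
    enum toy_kind kind;
    int concrete;     /* meaningful only when kind == TOY_CONCRETE */
    unsigned symbol;  /* meaningful only when kind == TOY_SYMBOL */
};

#define TOY_MAX_SYMBOLS 16

/* The "program state" here is just a per-symbol taint flag. */
struct toy_state {
    bool tainted[TOY_MAX_SYMBOLS];
};

/* Records taint only if the value carries a symbol.
 * Returns false when the request is ignored (concrete value). */
bool toy_add_taint(struct toy_state *st, struct toy_value v) {
    if (v.kind != TOY_SYMBOL || v.symbol >= TOY_MAX_SYMBOLS)
        return false;
    st->tainted[v.symbol] = true;
    return true;
}

/* A concrete value is never reported as tainted. */
bool toy_is_tainted(const struct toy_state *st, struct toy_value v) {
    return v.kind == TOY_SYMBOL
        && v.symbol < TOY_MAX_SYMBOLS
        && st->tainted[v.symbol];
}
```

In this model, a readval() that is known to return 10 yields a
TOY_CONCRETE value, so toy_add_taint() returns false and nothing changes;
an opaque readval() yields a TOY_SYMBOL value, and the taint sticks.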
If I replace your definition of readval() with an opaque forward
declaration, e.g.:
int readval();
void foo() {
  int a = readval(); // warning : tainted
}
then everything works as expected.
On the other hand, if the definition of readval() is truly available in
your translation unit, then you don't need to add *it* to
GenericTaintChecker - instead, add whatever readval() calls to obtain
the user input, and the analyzer would model readval() itself and pass
the symbol down to the caller.
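For instance, a readval() with a visible body might look like the sketch
below. This is a hypothetical version of your function, and it assumes
that scanf() is among the functions the checker already treats as a taint
source; if your real readval() obtains its input some other way, that
other function is the one to register.

```c
#include <stdio.h>

/* Hypothetical readval() whose body is visible to the analyzer. */
int readval(void) {
    int v = 0;
    /* Assumption: scanf() is already known to the checker as a source. */
    if (scanf("%d", &v) != 1)
        return -1;  /* concrete fallback value, nothing to taint here */
    return v;       /* value that came from user input */
}

void foo(void) {
    int a = readval();  /* on the scanf() path, 'a' should carry the taint */
    (void)a;
}
```

Note that on the early-return path the result is the concrete -1, so it
would not be tainted, for the same reason the 10 above is not.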