[cfe-dev] AddTaint failure with MemRegion
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Wed Nov 18 10:17:44 PST 2015
Hello,
In the current model of taint, only symbols (subclasses of SymExpr) can
"truly" carry taint: the whole taint thing is about mapping symbols to
taint kinds. It makes sense that only rvalues can be tainted, while
memregions represent lvalues (segments of memory).
A memory region is, by definition, tainted iff the address of the start
of the segment it represents (the numeric value of the pointer to the
start) is a tainted rvalue. Hence, a memory region can be tainted in the
following cases:
1. It is a SymbolicRegion, and its parent symbol is tainted.
2. It is an ElementRegion, and its symbolic index value is tainted.
3. It is a sub-region of another tainted region.
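To make the three cases concrete, here is a small self-contained toy model.
It is only a sketch: Symbol, Region, RegionKind, TaintMap and isTainted below
are hypothetical stand-ins invented for illustration, not the analyzer's real
classes or its actual implementation.

```cpp
#include <cassert>
#include <set>

// Toy stand-ins (hypothetical, NOT Clang's real classes).
struct Symbol { int id; };

enum class RegionKind { Var, Field, Symbolic, Element };

struct Region {
  RegionKind kind;
  const Region *super;   // enclosing region, or nullptr for a base region
  const Symbol *symbol;  // Symbolic: parent symbol; Element: symbolic index
                         // (nullptr if the index is concrete); else nullptr
};

// The "taint map" of this toy model: ids of explicitly tainted symbols.
using TaintMap = std::set<int>;

// A region is tainted if it (case 1) is symbolic with a tainted parent
// symbol, (case 2) is an element with a tainted symbolic index, or
// (case 3) is nested inside a region for which case 1 or 2 holds.
bool isTainted(const TaintMap &taint, const Region *r) {
  assert(r != nullptr);
  for (const Region *cur = r; cur != nullptr; cur = cur->super) {
    bool carriesSymbol = cur->kind == RegionKind::Symbolic ||
                         cur->kind == RegionKind::Element;
    if (carriesSymbol && cur->symbol != nullptr &&
        taint.count(cur->symbol->id) != 0)
      return true;  // cases 1 and 2
    // Case 3: keep walking up through the enclosing regions.
  }
  return false;
}
```

In this model a plain Var region with no tainted ancestor is never reported
as tainted, no matter what the taint map contains.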
The helper function ProgramState::addTaint(const MemRegion *) is a
simple wrapper that works only on symbolic regions directly: it adds
taint to the region's parent symbol. It does nothing, and in fact cannot
do anything sensible, if the supplied region is not a SymbolicRegion.
I believe the problem you are encountering is that you are trying
to add taint to a region that cannot actually be tainted (e.g. a
VarRegion). You can see what your region is by dumping 'loc'.
Note, however, that any symbol that is based on a tainted region (e.g.
the SymbolRegionValue of a tainted region) becomes tainted itself
automatically, even if it is not explicitly included in the taint map.
This does not mean that the value of a tainted region is necessarily
tainted, though; the value of a tainted region can easily be overwritten
with a trusted value.
So if your checker needs to add taint to a MemRegion that does not
fall into one of the three categories mentioned above, then there must
be something wrong with the approach. Say, a VarRegion itself cannot be
tainted: there's nothing wrong with a pointer to a well-defined variable
inside the program, because this pointer value is not something that the
attacker can alter to produce unwanted results. It's a concrete region
and we know everything about it, while taint is all about values obtained
from external, untrusted sources.
If, on the other hand, your MemRegion falls into category 2 or 3, and
tainting the region (or, in other words, the pointer to that region) is
what you truly want, then you'd need to manually unpack the region and
taint the particular symbols inside it, either the index symbol, or the
symbolic base (if your region is a sub-region of a symbolic region, see
MemRegion::getSymbolicBase()).
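The difference between the direct wrapper and manual unpacking can be
sketched with a self-contained toy model. Everything below (Symbol, Region,
TaintMap, addTaintDirect, symbolicBase, taintUnpacked) is a hypothetical
stand-in made up for illustration; it only imitates the behaviour described
above and is not the analyzer's API.

```cpp
#include <cassert>
#include <set>

// Toy stand-ins (hypothetical, NOT Clang's real classes).
struct Symbol { int id; };

enum class RegionKind { Var, Field, Symbolic, Element };

struct Region {
  RegionKind kind;
  const Region *super;   // enclosing region, or nullptr for a base region
  const Symbol *symbol;  // Symbolic: parent symbol; Element: symbolic index
                         // (nullptr if the index is concrete); else nullptr
};

using TaintMap = std::set<int>;  // ids of explicitly tainted symbols

// Imitates the wrapper described above: it acts only when the region itself
// is symbolic, and silently does nothing otherwise.
bool addTaintDirect(TaintMap &taint, const Region *r) {
  assert(r != nullptr);
  if (r->kind != RegionKind::Symbolic)
    return false;  // nothing sensible to taint
  assert(r->symbol != nullptr);
  taint.insert(r->symbol->id);
  return true;
}

// Walks up to the nearest enclosing symbolic region, if there is one.
const Region *symbolicBase(const Region *r) {
  for (; r != nullptr; r = r->super)
    if (r->kind == RegionKind::Symbolic)
      return r;
  return nullptr;
}

enum class Target { Index, Base };

// Manual unpacking: taint either the element's index symbol or the symbol
// of the symbolic base. Returns false if the requested symbol is absent.
bool taintUnpacked(TaintMap &taint, const Region *r, Target which) {
  assert(r != nullptr);
  if (which == Target::Index) {
    if (r->kind != RegionKind::Element || r->symbol == nullptr)
      return false;
    taint.insert(r->symbol->id);
    return true;
  }
  const Region *base = symbolicBase(r);
  if (base == nullptr)
    return false;
  assert(base->symbol != nullptr);
  taint.insert(base->symbol->id);
  return true;
}
```

For an element region nested inside a symbolic region, addTaintDirect leaves
the taint map untouched, while taintUnpacked can taint either the index
symbol or the base symbol, depending on which one the checker actually means.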