[cfe-dev] AddTaint failure with MemRegion

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Wed Nov 18 10:17:44 PST 2015


Hello,

In the current taint model, only symbols (subclasses of SymExpr) can 
"truly" carry taint: the taint machinery is a map from symbols to 
taint kinds. This makes sense, because only rvalues can be tainted, 
while MemRegions represent lvalues (segments of memory).

A memory region is, by definition, tainted iff the numeric value of the 
pointer to the start of the segment represented by the MemRegion object 
is a tainted rvalue. Hence, a memory region can be tainted in the 
following cases:

   1. It is a SymbolicRegion, and its parent symbol is tainted.

   2. It is an ElementRegion, and its symbolic index value is tainted.

   3. It is a sub-region of another tainted region.
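
To make the three cases concrete, here is a small self-contained toy 
model. It is only a sketch: Symbol, Region, TaintMap and isTainted below 
are hypothetical stand-ins written for this illustration, not the 
analyzer's SymExpr/MemRegion classes or its actual implementation.

```cpp
// Toy model only: these types imitate, but are not, Clang's classes.
#include <set>

struct Symbol { int id; };

struct Region {
    enum Kind { Var, Symbolic, Element, Field } kind;
    const Symbol *parentSymbol; // Symbolic: the symbol for the pointer value
    const Symbol *indexSymbol;  // Element: the symbolic index, if any
    const Region *superRegion;  // enclosing region, or nullptr
};

// The taint map: only symbols are ever recorded here.
using TaintMap = std::set<const Symbol *>;

bool isTainted(const TaintMap &taint, const Region *r) {
    if (!r)
        return false;
    // Case 1: symbolic region whose parent symbol is tainted.
    if (r->kind == Region::Symbolic && taint.count(r->parentSymbol))
        return true;
    // Case 2: element region whose symbolic index is tainted.
    if (r->kind == Region::Element && r->indexSymbol &&
        taint.count(r->indexSymbol))
        return true;
    // Case 3: sub-region of another tainted region.
    return isTainted(taint, r->superRegion);
}
```

In this model a plain Var region with no tainted super-region is never 
reported as tainted, no matter what is in the map.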

The helper function ProgramState::addTaint(const MemRegion *) is a 
simple wrapper that works only on symbolic regions: it adds taint to 
the region's parent symbol. It does nothing, and in fact cannot do 
anything sensible, if the region supplied is not a SymbolicRegion.

I believe the problem you are encountering is that you are trying to 
add taint to a region that cannot actually be tainted (e.g. a 
VarRegion). You can see what your region is by dumping 'loc'.

Note, however, that any symbol that is based on a tainted region (e.g. 
the SymbolRegionValue of a tainted region) automatically becomes 
tainted itself, even if it's not explicitly included in the taint map. 
But this doesn't mean that the value stored in a tainted region is 
necessarily tainted; that value can easily be overwritten with a 
trusted value.
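
The distinction between "the region is tainted" and "the value in the 
region is tainted" can be sketched with another toy model. Again, all 
names here (Symbol, Region, Store, isValueTainted) are hypothetical and 
only approximate the idea; a null entry in the store stands in for a 
concrete, trusted value.

```cpp
// Toy model only: not the analyzer's actual classes or store.
#include <map>
#include <set>

struct Region;

// A symbol may stand for the initial value of a region (like
// SymbolRegionValue); otherwise valueOf is nullptr.
struct Symbol {
    const Region *valueOf;
};

// A region whose address is denoted by pointerSymbol (a symbolic region),
// or nullptr for a concrete region.
struct Region {
    const Symbol *pointerSymbol;
};

using TaintMap = std::set<const Symbol *>;

bool isTainted(const TaintMap &taint, const Symbol *s);

bool isTainted(const TaintMap &taint, const Region *r) {
    return r && r->pointerSymbol && isTainted(taint, r->pointerSymbol);
}

bool isTainted(const TaintMap &taint, const Symbol *s) {
    if (!s)
        return false;
    if (taint.count(s))
        return true; // explicitly in the taint map
    // Derived taint: a symbol based on a tainted region is tainted too.
    return isTainted(taint, s->valueOf);
}

// Store: region -> value currently bound to it (nullptr = trusted constant).
using Store = std::map<const Region *, const Symbol *>;

bool isValueTainted(const TaintMap &taint, const Store &store,
                    const Region *r) {
    auto it = store.find(r);
    return it != store.end() && isTainted(taint, it->second);
}
```

Here, rebinding the region to a trusted value makes isValueTainted 
return false, while isTainted on the region itself is unaffected.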

So if your checker needs to add taint to a MemRegion that does not 
fall into one of the three categories mentioned above, then there must 
be something wrong with the approach. Say, a VarRegion itself cannot be 
tainted: there's nothing wrong with a pointer to a well-defined variable 
inside the program, and this pointer value is not something that the 
attacker can alter to produce unwanted results. It's a concrete region 
and we know everything about it, while taint is all about values 
obtained from external, untrusted sources.

If, on the other hand, your MemRegion falls into category 2 or 3, and 
tainting the region (or, in other words, the pointer to that region) is 
what you truly want, then you'd need to manually unpack the region and 
taint the particular symbols inside it, either the index symbol, or the 
symbolic base (if your region is a sub-region of a symbolic region, see 
MemRegion::getSymbolicBase()).
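
As a rough illustration of the "manual unpacking", here is a toy sketch. 
It does not use the real analyzer API: Symbol, Region, addTaintDirect, 
symbolicBase and addTaintUnpacked are hypothetical names, with 
symbolicBase only analogous in spirit to MemRegion::getSymbolicBase(). 
Preferring the index symbol over the symbolic base is an arbitrary 
choice made for the example; a real checker has to decide which symbol 
it actually means to taint.

```cpp
// Toy model only: these types imitate, but are not, Clang's classes.
#include <cassert>
#include <set>

struct Symbol { int id; };

struct Region {
    enum Kind { Var, Symbolic, Element, Field } kind;
    const Symbol *parentSymbol; // Symbolic: the symbol for the pointer value
    const Symbol *indexSymbol;  // Element: the symbolic index, if any
    const Region *superRegion;  // enclosing region, or nullptr
};

using TaintMap = std::set<const Symbol *>;

// Models the simple wrapper: a no-op unless the region itself is symbolic.
bool addTaintDirect(TaintMap &taint, const Region *r) {
    assert(r && "region must not be null");
    if (r->kind != Region::Symbolic)
        return false;
    taint.insert(r->parentSymbol);
    return true;
}

// Walks up the super-region chain to the nearest symbolic region, if any.
const Region *symbolicBase(const Region *r) {
    while (r && r->kind != Region::Symbolic)
        r = r->superRegion;
    return r;
}

// Unpacks the region and taints a symbol inside it; returns false when
// there is no symbol to taint (e.g. a plain variable region).
bool addTaintUnpacked(TaintMap &taint, const Region *r) {
    assert(r && "region must not be null");
    if (r->kind == Region::Element && r->indexSymbol) {
        taint.insert(r->indexSymbol); // category 2: taint the index symbol
        return true;
    }
    if (const Region *base = symbolicBase(r)) {
        taint.insert(base->parentSymbol); // category 3: taint the base
        return true;
    }
    return false;
}
```

With this model, addTaintDirect on a field of a symbolic region changes 
nothing, whereas addTaintUnpacked taints the base region's parent 
symbol; for a plain variable region both leave the map untouched.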


