[cfe-dev] MemRegions - how to (re)use them right?

Tue Oct 27 07:00:40 PDT 2009

Hi Ted,

Ted Kremenek wrote:
> Sorry for the delay in my response.
It was just in time to put me (hopefully) on the right track.

> In your example above, I'm not certain if ID* symbols are the values 
> bound to locations or the locations themselves.  
They were meant as the locations themselves. These locations then get 
bound to values, similar to what GRExprEngine does (as far as I have 
seen, RegionBindings stores these bindings there).

> If they are the locations, then you'd have:
>
>   ID1 => VarRegion(ptr)
>   ID2 => FieldRegion(VarRegion(temp), a)
>   ID3 => FieldRegion(VarRegion(temp), b)
>
> Now if ID* represent values, you're going to need something else 
> besides MemRegions to keep track of location -> value bindings.  For 
> example:
>
>> *sink = ptr->b;   // function detects ID1 for ptr: expand to *sink = 
>> (&temp)->b:
>
> To get this reasoning, you need to keep track of the binding:
>
>    VarRegion(ptr)  =>  VarRegion(temp)
That is, the value of a pointer itself represents another memory region. 
However, I have the strong feeling that MemRegionVal actually represents 
a different concept and that value flow logic isn't implemented at all 
(that's how I interpret the FIXME comment in GRExprEngine::EvalLoad). 
OTOH, at least the "pointer != 0" constraint is stored somewhere.
> which would cause the l-value of 'ptr->b' to evaluate to:
>
>    FieldRegion(VarRegion(temp), b)
>
> Then the r-value of 'ptr->b' expands to whatever value you track for 
> the field 'b' of 'temp'.
>
> In general, I'm not certain how much value flow logic you plan on 
> implementing.  
Just some basic alias analysis. That is, the value of a pointer is 
either the result of a simple address-of operator (in that case the 
MemRegion of the subexpression of & is bound to the pointer) or another 
simple pointer. Of course I try to make this as extensible as possible. 
For the value flow of non-pointer types, the means provided by the 
MemRegion concept are sufficient.
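
To make the two cases above concrete (a pointer bound to the region of 
an address-of subexpression, or copied from another simple pointer), 
here is a minimal sketch in C++. It does not use clang's classes: 
Region, AliasMap, bindAddressOf, bindCopy and fieldLValue are all 
hypothetical names invented for illustration, and the toy only handles 
pointees that are plain variables.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy stand-in for a memory location. NOT clang's MemRegion class; it only
// mimics the VarRegion / FieldRegion(VarRegion, field) nesting from the thread.
struct Region {
    std::string var;    // base variable, e.g. "temp"
    std::string field;  // empty for a plain variable, else the field name
};

// Hypothetical alias table: pointer variable name -> region it points to.
// A pointer without an entry has an unknown value.
class AliasMap {
public:
    // Models `ptr = &target;`.
    void bindAddressOf(const std::string& ptr, const Region& target) {
        pointsTo_[ptr] = target;
    }

    // Models `dst = src;` for two simple pointers.
    void bindCopy(const std::string& dst, const std::string& src) {
        auto it = pointsTo_.find(src);
        if (it == pointsTo_.end()) {
            pointsTo_.erase(dst);  // src unknown, so dst becomes unknown too
        } else {
            Region target = it->second;  // copy first; dst may equal src
            pointsTo_[dst] = target;
        }
    }

    // L-value of `ptr->field`, or nullopt when ptr's value is unknown.
    std::optional<Region> fieldLValue(const std::string& ptr,
                                      const std::string& field) const {
        auto it = pointsTo_.find(ptr);
        if (it == pointsTo_.end()) return std::nullopt;
        // Toy restriction: the pointee must be a plain variable.
        assert(it->second.field.empty());
        return Region{it->second.var, field};
    }

private:
    std::map<std::string, Region> pointsTo_;
};
```

With the binding ptr => temp recorded via bindAddressOf, 
fieldLValue("ptr", "b") yields the toy equivalent of 
FieldRegion(VarRegion(temp), b).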
> MemRegions themselves don't represent the values of expressions, but 
> rather memory locations (which some expressions may evaluate to).  The 
> path-sensitive engine uses SVals (short for "symbolic values") to 
> represent the symbolic result of evaluating an expression.  As 
> you'll see, SVals can represent "locations" and "non-locations", and 
> "locations" include MemRegions.
But as I said above I don't think they are used in GRExprEngine in the 
way I want to use them.

>
> If your analysis isn't path-sensitive, how do you plan on handling 
> confluence points in the CFG (and loops)?
Rather pragmatically. For the moment I will analyse basic blocks only. 
Later on this can be refined. E.g. at a confluence point you can check 
whether a pointer has the same value on both incoming paths. If so, you 
can keep using that value; otherwise the pointer gets marked as unknown.
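
The merge rule for confluence points described above could be sketched 
as follows. This is only an illustration under simplifying assumptions: 
PointsTo and mergeAtJoin are hypothetical names, pointees are identified 
by plain strings, and a pointer that is absent from the map counts as 
unknown.

```cpp
#include <map>
#include <string>

// Hypothetical per-path state: pointer name -> name of the pointee.
// A pointer that is missing from the map has an unknown value.
using PointsTo = std::map<std::string, std::string>;

// Join two incoming paths: keep a binding only if both paths agree on it.
PointsTo mergeAtJoin(const PointsTo& left, const PointsTo& right) {
    PointsTo merged;
    for (const auto& [ptr, target] : left) {
        auto it = right.find(ptr);
        if (it != right.end() && it->second == target) {
            merged.emplace(ptr, target);  // same value on both paths
        }
        // Otherwise the pointer is dropped, i.e. marked as unknown.
    }
    return merged;
}
```

Loops would need this join to be repeated until the state stops 
changing; since bindings can only be removed here, that iteration 
terminates.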

Overall, I think I can't use the xxxEngine framework and will have to 
write new components. However, I think it's possible to factor out the 
part that creates MemRegions into a common Stmt visitor. Apparently 
RegionStore and BasicStore don't differ at this point, and maybe even 
the NOTE comment and the following if clause in 
RegionStoreManager::getLValueFieldOrIvar belong in 
BasicStoreManager::getLValueField too. With this tool I should be able 
to resolve the first write to sink in my example. Then, in a second 
step, I reason about pointer values. For this second step, however, a 
class like UnknownTypedRegion is IMHO still missing. Example:

void foo(int* x)
{
  int* y = x;
  //...
}

Even if I don't know the particular MemRegion x points to, I know that y 
points to the same region.
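
One way to picture such an unknown-but-identifiable region is a fresh 
placeholder handed out per pointer parameter. The sketch below is a toy 
and not a proposal for the actual class: Pointee, PointerState, 
bindParameter, bindCopy and mustAlias are hypothetical names, and the 
placeholder is just a generated string.

```cpp
#include <map>
#include <string>

// Hypothetical pointee description. A symbolic pointee is an opaque
// placeholder for "whatever the pointer pointed to on function entry".
struct Pointee {
    bool symbolic;
    std::string name;  // variable name, or a generated placeholder id
    bool operator==(const Pointee& o) const {
        return symbolic == o.symbolic && name == o.name;
    }
};

class PointerState {
public:
    // A pointer parameter gets a fresh placeholder region.
    void bindParameter(const std::string& param) {
        state_[param] = Pointee{true, "sym" + std::to_string(nextId_++)};
    }

    // Models `dst = src;`. An unbound src makes dst unbound as well.
    void bindCopy(const std::string& dst, const std::string& src) {
        auto it = state_.find(src);
        if (it == state_.end()) {
            state_.erase(dst);
        } else {
            Pointee copy = it->second;  // copy first; dst may equal src
            state_[dst] = copy;
        }
    }

    // True only if both pointers are bound to the very same pointee.
    bool mustAlias(const std::string& a, const std::string& b) const {
        auto ia = state_.find(a);
        auto ib = state_.find(b);
        return ia != state_.end() && ib != state_.end() &&
               ia->second == ib->second;
    }

private:
    std::map<std::string, Pointee> state_;
    unsigned nextId_ = 0;
};
```

For foo above, bindParameter("x") followed by bindCopy("y", "x") makes 
mustAlias("x", "y") hold, although nothing is known about the region 
itself.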
I hope to get some first results by the end of the week. Meanwhile, I'm 
still eager to get your comments.

Ciao Olaf