[cfe-dev] MemRegions - how to (re)use them right?

Ted Kremenek kremenek at apple.com
Mon Nov 9 14:06:55 PST 2009


On Oct 27, 2009, at 7:00 AM, Olaf Krzikalla wrote:

> Hi Ted,
>
> Ted Kremenek schrieb:
>> Sorry for the delay in my response.
> It was just in time to put me (hopefully) on the right track.
>
>> In your example above, I'm not certain if ID* symbols are the  
>> values bound to locations or the locations themselves.
> They were meant as the locations themselves. These locations then  
> get bound to values similiar to what GRExprEngine does (as I have  
> seen, RegionBindings stores these bindings there).

That's correct.  Anything that subclasses StoreManager manages the  
"symbolic store" component of GRState, where the symbolic state  
manages bindings from locations to values.

>
>> If they are the locations, then you'd have:
>>
>>  ID1 => VarRegion(ptr)
>>  ID2 => FieldRegion(VarRegion(temp), a)
>>  ID3 => FieldRegion(VarRegion(temp), b)
>>
>> Now if ID* represent values, you're going to need something else  
>> besides MemRegions to keep track of location -> value bindings.   
>> For example:
>>
>>> *sink = ptr->b;   // function detects ID1 for ptr: expand to *sink  
>>> = (&temp)->b:
>>
>> To get this reasoning, you need to keep track of the binding:
>>
>>   VarRegion(ptr)  =>  VarRegion(temp)
> That is, the value of a pointer itself represents another memory  
> region. However I have the strong feeling that MemRegionVal actually  
> represents a different concept and value flow logic isn't  
> implemented at all (thats how I interpret the FIXME comment in  
> GRExprEngine::EvalLoad). OTOH at least the "pointer!=0"-constraint  
> is stored somewhere.

The FIXME in GRExprEngine::EvalLoad was stale and needed to be removed.

loc::MemRegionVal is simply allows regions (which are memory  
locations) to be used as values during symbolic execution.   
GRExprEngine reasons mostly about SVals, and when necessary specially  
handles the cases where those values are memory regions.  Most of the  
logic to handle MemRegions, however, are intentionally left as a  
detail in the StoreManagers.

Constraints on pointers, such as whether or not a pointer is non-null,  
is handled by the ConstraintManager.  The ConstraintManager only  
reasons about symbols (provided by SymbolManager).  SymbolicRegions  
represent abstract regions that can either be null or actual locations  
in memory.  The ConstraintManager keeps track of whether or not symbol  
associated with a SymbolicRegions is null.  ConstraintManagers also  
reason about constraints on other symbols, such as integers.

>> which would cause the l-value of 'ptr->b' to evaluate to:
>>
>>   FieldRegion(VarRegion(temp), b)
>>
>> Then the r-value of 'ptr->b' expands to whatever value you track  
>> for the field 'b' of 'temp'.
>>
>> In general, I'm not certain how much value flow logic you plan on  
>> implementing.
> Just some basic alias analysis. That is, the value of a pointer is  
> either a simple adress-of-operator result (in that case the  
> MemRegion of the &-op-subexpression is bound to the pointer) or  
> another simple pointer. Of course I try to make this as extensible  
> as possible. For the value flow of non-pointer types the means  
> provided by the MemRegion concept are sufficient.

I think MemRegions should give you almost all the flexibility you need  
to reason about l-values.  The result of an address-of expression is  
always a region wrapped in a loc::MemRegionVal or UnknownVal.

>> If your analysis isn't path-sensitive, how do plan on handling  
>> confluence points in the CFG (and loops)?
> Rather pragmatic. For the moment I will analyse basic blocks only.  
> Later on this can be refined. E.g. at confluence points you may  
> check if a pointer value remains or becomes the same on both paths.  
> Then you can use this value further. Otherwise the pointer gets  
> marked as unknown.

Makes sense.

>
> Overall I think that I can't use the xxxEngine framework but I have  
> to write new components. However I think its possible to factor out  
> the part creating MemRegions in a common Stmt visitor (apparently  
> RegionStore and BasicStore don't differ at this point and maybe even  
> the NOTE comment and the following if clause in  
> RegionStoreManager::getLValueFieldOrIvar belongs to  
> BasicStoreManager::getLValueField too). With this tool I should be  
> able to resolve the first write to sink in my example. Then, in a  
> second step I reason about pointer values. However for this second  
> step IMHO a class like UnknownTypedRegion is still missing. Example:
>
> void foo(int* x)
> {
> int* y = x;
> //...
> }
>
> Even if I don't know the particular MemRegion x points to, I know  
> that y points to the same region.
> I hope to get some first results by the end of week. Meanwhile I'm  
> still eager to get your comments.

The way this is handle in GRExprEngine is that the VarRegion for 'x'  
binds to a symbolic region.  After the assignment to y, both the  
VarRegion for y and x bind to the same region.  RegionStoreManager  
handles some bindings like this lazily, but it it essentially reasons  
about them in this way.

Just to be clear, we have four components at play in GRExprEngine:

- MemRegions, which represent locations in memory
- SVals, which represents values that expressions can evaluate to  
(including in the case of lvalues, regions)
- StoreManager: which manages bindings from MemRegions to SVals
- ConstraintManager: which manages constraints on symbols (where  
SymbolicRegions have an associated symbol, and integers and other  
values can be symbolic as well)

Each of these components represents a separate set of functionality  
that when composed together provide the value tracking reasoning used  
by the path-sensitive engine.



More information about the cfe-dev mailing list