[cfe-commits] [patch] cast SymbolVal

Zhongxing Xu xuzhongxing at gmail.com
Sun Feb 22 00:25:36 PST 2009


On Sun, Feb 22, 2009 at 2:43 AM, Ted Kremenek <kremenek at apple.com> wrote:

>
> On Feb 21, 2009, at 2:42 AM, Zhongxing Xu wrote:
>
>  That's understandable, but I still think its not quite the right design
>> for the broader problem.  I think we want to do is update the type
>> information associated with the symbol, not change the symbol itself.  By
>> conjuring up new symbols like this we are going to:
>>
>> I completely agree with that we should update the type information instead
>> of creating a new symbol. But as symbols are immutable, how can we do this?
>> A conjured symbol has three parts of information: Stmt*, QualType, and a
>> Count. Should I take the same Stmt* and Count but with the new QualType to
>> create the new symbol?
>>
>>
>
> Hi Zhongxing,
>
> We have a variety of options, each one which we should probably discuss in
> some detail as I think we're going to have to make some important changes.
>
> 1) Use the GDM to track symbol information.
>
> Pros: Hooks in with our existing mechanism for throwing extra information
> into the GRState.  It also makes symbol information path-sensitive.
>
> Cons: We need access to the GRState object to get a symbol's state, so
> clients would need to be changed/updated.  This is probably okay and
> consistent with many other kinds of meta-data we need to query.
>
> 2) Use AnonTypedRegions to handle the layering of type information.
>
> This whole discussion seems to really only concern symbols that represent
> locations.  When symbols are scalar values, casts will require us reasoning
> about conversions anyway, so I think the type of those symbols will stay
> fixed.
>
> In your example:
>
> void* myfunc();
>
> void f13() {
>  char** array;
>  array = myfunc();
>  char *p = array[0];
>  char c = *p
> }
>
> We can have 'myfunc()' return a loc::SymbolVal with type void* for the
> symbol.  The cast from void* to char** gets the SymbolRegion for the symbol
> and then layers on top of it an AnonTypedRegion.  This is what
> AnonTypedRegion was designed for.  Then 'array[0]' does a lookup on the
> bindings to the ElementRegions of the AnonTypedRegion.
>
> A bonus of this approach is that we can nicely catch path-sensitive type
> errors.  For example:
>
>  char** array;
>  array = myfunc();
>  int** array2 = (void*) array;
>  array[0] = ...
>  int x = array2[0];
>
> Here the load from array2[0] is suspicious because we had a binding to an
> ElementRegion of the AnonTypedRegion(char**) and we are treating it as an
> ElementRegion of the AnonTypedRegion(int**).  This kind of reasoning, while
> potentially complicated, can only be done in the StoreManager since it
> understands how it structures regions.
>
> Pros: Works with our existing infrastructure and APIs.  Provides the
> flexibility to do a whole-bunch of path-sensitive type checking.
>
> Cons: We're still reasoning about loc::SymbolVal(sym) and
> loc::MemoryRegionVal(SymbolicRegion(sym)), two concepts which mean the same
> thing.  This will complicate the logic in EvalCast, the ConstraintManagers,
> etc.


I prefer the AnonTypedRegion approach. The AnonTypedRegion appears to be the
right tool to handle various abuse of the C type system. The first approach
is an option when there are more and more information about a symbol to
track.

>
>
> To address the "con" I suggest we remove loc::SymbolVal and just have
> loc::MemoryRegionVal(SymbolicRegion(sym)).  This is a cleanup I think we
> should do anyway, as it consolidates our reasoning about symbolic memory
> into one place.


Yeah, this cleanup should be done first. Then we won't have symbolic
locations. We will only have symbolic regions.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20090222/2272948c/attachment.html>


More information about the cfe-commits mailing list