[cfe-dev] [Patch] add KillStruct() to RegionStore

Fri Jan 9 23:46:16 PST 2009

On Jan 8, 2009, at 5:36 PM, Zhongxing Xu wrote:

> Index: lib/Analysis/RegionStore.cpp
> ===================================================================
> --- lib/Analysis/RegionStore.cpp        （版本 61900）
>
>
> +  if (V.isUnknown())
> +    return KillStruct(St, R);
> +
>
> When precisely can V.isUnknown() be true when the value is a  
> struct?  (test case please).
>
> We already have that test case. It is the f5() in test/Analysis/ 
> array-struct.c.

Great!  We should add a comment to that test case the says what part  
of the analyzer/region store is getting tested.

> +const GRState* RegionStoreManager::KillStruct(const GRState* St,
> +                                              const TypedRegion* R){
> +  GRStateRef state(St, StateMgr);
> +  // Set the default value of the struct region to UnknownVal.
> +  St = state.set<RegionDefaultValue>(R, UnknownVal());
>
> Interesting.  Do we need the killset anymore, or are we going to  
> model "killed values" in this way?
>
> I think we still need killset. The regions in killset are ones we  
> query its binding directly. The regions in RegionDefaultValue are  
> ones whose subregions we query bindings for. I haven't had a way to  
> combine them.

I believe I might be confused about some basic concepts.  Here is how  
I interpreted things.

The "killset" is part of the state, and represents those regions whose  
values have been changed (via direct assignment) after the entry to  
the analyzed function.  This way we don't need to explicitly bind  
initial values to regions and instead lazily infer their bindings.

According to comments in RegionStore, the "default value" map  
(RegionDefaultValue) is used to track "what regions have a default  
value of 0 if they have no bound value and have not been killed."

 From the patch I see 'KillStruct' being used in the following way:

const GRState*
RegionStoreManager::BindStruct(const GRState* St, const TypedRegion*  
R, SVal V){
   // FIXME: Verify that we should use getRValueType or getLValueType.
   QualType T = R->getRValueType(getContext());
   assert(T->isStructureType());

   RecordType* RT = cast<RecordType>(T.getTypePtr());
   RecordDecl* RD = RT->getDecl();
   assert(RD->isDefinition());

   if (V.isUnknown())
     return KillStruct(St, R)
   ...

Two things occur to me:

(1) We should always be adding any region directly assigned to the  
killset.  Doesn't that include structs?

(2) After assigning the value "unknown" to the struct, the values of  
its fields should not be 0 (as indicated by the comments for  
RegionDefaultValue).  That doesn't make sense to me.  Shouldn't they  
just be "unknown"?

In other words, regardless of whether V.isUnknown() == true (in the  
above code snippet) we should always do (1) and then just remove any  
bindings to struct and it's fields.  Adding entries to  
RegionDefaultValue (i.e., (2)) doesn't make sense to me, so I think  
I'm just missing something and don't really understand the overall  
design.

> Ah, I forgot that point. Is this also known as persistent data  
> structure?

Yes it is.  The term "purely functional data structures" comes from  
the functional programming mindset.  Okasaki has an excellent book on  
that topic:

http://books.google.com/books?id=SxPzSTcTalAC&dq=purely+functional+data+structures&printsec=frontcover&source=bn&hl=en&sa=X&oi=book_result&resnum=4&ct=result

>
> Do we also need to remove region views for the regions that are  
> killed, or should we dispense with that GDM entry entirely?
>
> I think region views are orthogonal to this one. Region views are  
> used to cast region back and forth between typed and untyped ones.  
> Here we only handle the region bindings.

"Views" are just a specific type of subregions (whose bindings are all  
getting nuked by KillStruct).  The GDM entry "RegionViews" is just a  
back-mapping from a region to its subregions.  If we remove all of  
those bindings why do we need to keep that backmapping around since  
none of them bind to any value?  Removing stale data from the state  
allows better caching.

As a meta comment, it would probably be good to add comments to the  
top of RegionStore that documents the overall design. More  
specifically, we should mention how the different GDM pieces are used,  
what's the overall abstraction of how regions bind to values, how we  
model "views" of regions (i.e., what they are and what purpose they  
serve) and so forth.  I feel that some of my confusion is stemming  
from not completely understanding the design as well as I thought I did.