[cfe-dev] [analyzer] Random SVal hierarchy questions.

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Mon Nov 2 07:06:51 PST 2015


I've got a few quick random questions about SVal/MemRegion/SymExpr 
heirarchy in the Static Analyzer. In most cases, i'm looking for design 
ideas behind particular values, so it's sort of theoretical.


== SymbolConjured of a record type ==

Usually symbols have integral types. However, if a function returns, 
say, "struct S" by value, and its definition is not available, then its 
return value is described as conj_$N{S}. I'm surprised to see a symbol 
of such type, because it doesn't behave like other symbols - eg. you 
cannot compose a SymIntExpr from it, or assume range constraints on it. 
Is this behavior intentional, or it's rather worth it try replacing it 
with a more structure-like value? For example, a LazyCompoundValue with 
a NULL Store could describe an unknown structure pretty well. Is it 
worth it to *assert* that all symbols have numeric or pointer types only?


== SymbolicRegion in different memory spaces ==

If two SubRegions have different super regions, in particular if they 
have different memory spaces, they are considered to be different (i.e., 
have different hash). This makes sense for, say, discriminating between 
VarRegions of ParmVarDecls in different stack frame contexts (different 
calls to the same function). However, for SymbolicRegion based on a 
pointer symbol, it does not make sense to discriminate between the 
symbolic region in Unknown memory space and the symbolic region in Heap 
memory space (eg. constructed by MallocChecker). The analyzer currently 
considers these regions to be two different regions, even if they are 
based on the same pointer. Is this problem worth fixing by removing the 
memory space from the hash function of SymbolicRegion?


== CXXTempObjectRegion and block counts ==

CXXTempObjectRegion is constructed with expression that caused it to 
appear, a memory space, and that's it. I wonder if, during lifetime of a 
CXXTempObjectRegion we can accidentally reach the same expression again 
and produce the same region twice and, say, overwrite its bindings. In 
other words, does CXXTempObjectRegion require some sort of block-count, 
like SymbolConjured? This time, i'd rather believe that it doesn't, and 
that the current behavior is correct, so i wonder if there's an easy way 
to explain why is it correct.


== SymbolRegionValue after a deleted binding ==

As far as i understand, SymbolRegionValue can be described as a symbol 
that denotes the unknown value of a region *in the beginning* of the 
analysis. It is produced by the RegionStore whenever there is no other 
binding that has overwritten this value *during* the analysis. However, 
when we do invalidation, we can sometimes remove bindings from regions. 
Is it intended that the same SymbolRegionValue may possibly re-appear 
after invalidation, simply because the binding was removed, even if the 
original SymbolRegionValue is considered dead, and thus all constraints 
and checker data was erased, and hence there's no harm in reusing the 
symbol? Or we'd better try our best to ensure that some sort of 
SymbolDerived would always be created in this scenario, and the original 
SymbolRegionValue never gets resurrected?



More information about the cfe-dev mailing list