[cfe-dev] [analyzer] Random SVal hierarchy questions.
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Mon Nov 2 07:06:51 PST 2015
I've got a few quick random questions about SVal/MemRegion/SymExpr
heirarchy in the Static Analyzer. In most cases, i'm looking for design
ideas behind particular values, so it's sort of theoretical.
== SymbolConjured of a record type ==
Usually symbols have integral types. However, if a function returns,
say, "struct S" by value, and its definition is not available, then its
return value is described as conj_$N{S}. I'm surprised to see a symbol
of such type, because it doesn't behave like other symbols - eg. you
cannot compose a SymIntExpr from it, or assume range constraints on it.
Is this behavior intentional, or it's rather worth it try replacing it
with a more structure-like value? For example, a LazyCompoundValue with
a NULL Store could describe an unknown structure pretty well. Is it
worth it to *assert* that all symbols have numeric or pointer types only?
== SymbolicRegion in different memory spaces ==
If two SubRegions have different super regions, in particular if they
have different memory spaces, they are considered to be different (i.e.,
have different hash). This makes sense for, say, discriminating between
VarRegions of ParmVarDecls in different stack frame contexts (different
calls to the same function). However, for SymbolicRegion based on a
pointer symbol, it does not make sense to discriminate between the
symbolic region in Unknown memory space and the symbolic region in Heap
memory space (eg. constructed by MallocChecker). The analyzer currently
considers these regions to be two different regions, even if they are
based on the same pointer. Is this problem worth fixing by removing the
memory space from the hash function of SymbolicRegion?
== CXXTempObjectRegion and block counts ==
CXXTempObjectRegion is constructed with expression that caused it to
appear, a memory space, and that's it. I wonder if, during lifetime of a
CXXTempObjectRegion we can accidentally reach the same expression again
and produce the same region twice and, say, overwrite its bindings. In
other words, does CXXTempObjectRegion require some sort of block-count,
like SymbolConjured? This time, i'd rather believe that it doesn't, and
that the current behavior is correct, so i wonder if there's an easy way
to explain why is it correct.
== SymbolRegionValue after a deleted binding ==
As far as i understand, SymbolRegionValue can be described as a symbol
that denotes the unknown value of a region *in the beginning* of the
analysis. It is produced by the RegionStore whenever there is no other
binding that has overwritten this value *during* the analysis. However,
when we do invalidation, we can sometimes remove bindings from regions.
Is it intended that the same SymbolRegionValue may possibly re-appear
after invalidation, simply because the binding was removed, even if the
original SymbolRegionValue is considered dead, and thus all constraints
and checker data was erased, and hence there's no harm in reusing the
symbol? Or we'd better try our best to ensure that some sort of
SymbolDerived would always be created in this scenario, and the original
SymbolRegionValue never gets resurrected?
More information about the cfe-dev
mailing list