[cfe-dev] [analyzer] Random SVal hierarchy questions.
Artem Dergachev via cfe-dev
cfe-dev at lists.llvm.org
Tue Nov 10 08:50:16 PST 2015
Hello Devin,
Thanks a lot for the reply! Questions about CXXTempObjectRegion block
counts and SymbolRegionValue lifetime are now perfectly clear. I have a
few more confusions to sort out with other questions.
> the only legal way to execute the same expression twice while the
result of the previous one is still around is recursion
This argument makes sense to me :) Probably some day i'd have a look if
MemRegionManager::getCXXStaticTempObjectRegion() can have similar issues.
> Are you seeing cases where the analyzer is removing bindings but not
adding a default binding to a conjured symbol?
Not really, i just wanted to ensure that it's intended to stay that way
:) I was a bit confused with the phabricator comment
http://reviews.llvm.org/D5104#70661 .
> > == SymbolConjured of a record type ==
> One thing conjured symbols of record type let you do is reason about
equality of MemRegions and of values in the fields of those regions
independently of range constraints.
This confirms my understanding, and, additionally, i realized that the
idea of a LazyCompoundValue with NULL store is completely incorrect :)
So i can correct this question a bit:
Suppose that instead of the SymbolConjured of type "struct S", we return
a LazyCompoundValue with a store that contains only one binding - "(R,
0, default): conj_$N<int>", where R is... dunno, some temporary region
that represents this structure, or probably the whole memory space,
doesn't really matter. Then we obtain a similar SymbolDerived from this
"conjured store", which allows reasoning about equality of its field
values equally well, however the behavior of the engine confirms to the
following additional invariants, which i find convenient:
(1) All SymExpr's have numeric or pointer types; any SymExpr can serve
as a constraint manager key.
(2) Functions that return structs by values always return compound
values (or unknown/undefined), regardless of how conservatively they
were evaluated.
I initially run into this problem when coding alpha-renaming, which
caused additional special case of renaming a compound value into a
symbolic expression and vice versa. So i'm wondering about the design
idea behind using symbolic expressions instead of compound values in
this case. Even though in the current code, it causes no problems. So
i'm simply wondering if such change would be accepted as a part of some
bigger patch (or i work around), or if it accidentally rings any bells
that invariants (1) and (2) were actually always desired and missed only
accidentally. I think i can prepare a fix if it is desired, which
asserts invariants (1) and (2) and adjusts the code accordingly.
> > == SymbolicRegion in different memory spaces ==
Thanks, your explanation had quite some revalations for me regarding the
meaning of SymbolicRegion memory spaces for invalidation! What makes me
still confused a bit is whether it is intended to support situations like:
Store:
(SymRegion{s},0,direct) : 1 S32b
(SymRegion{s},0,direct) : 2 S32b
where the first SymRegion is in Heap space, and the second SymRegion is
in Unknown space? I guess it shouldn't happen, because these two
bindings refer to the exact same memory address (s + 0). But i can
easily imagine an incorrectly coded checker (when the author wasn't
aware of this problem) accidentally cause such store to appear.
So it makes sense to me to try to keep SymbolicRegions consistent: one
(symbolic) memory address refers to only one region.
I think i can come up with a different solution from what i mentioned
originally: create a new kind of MemRegion for regions returned by
malloc()/new, some sort of HeapSegmentRegion. I'm not really seeing any
benefit of keeping the address symbol at all in such region (similarly
to how we do not keep address symbols for VarRegion - it's a "concrete"
region, with all possible information clearly available). Instead, we
can keep SVal for extent, and override getExtent() accordingly - this
approach is more convenient for the users than constraints on
SymbolExtent, because checker authors do not need to perform extra
actions to see if SymbolExtent folds into a constant; and also symbolic
malloc arguments would be handled better. Again, i can prepare a patch
if such fix is desired - i think i even know how to write a test :)
Hmm. We can also use alpha-renaming on SymbolExtent, to replace it with
the concrete SVal in MallocChecker, when we have alpha-renaming, that it.
More information about the cfe-dev
mailing list