[cfe-dev] How to extract a symbol stored in LazyCompoundVal?

Artem Dergachev via cfe-dev cfe-dev at lists.llvm.org
Wed Jun 26 12:15:35 PDT 2019


Hmm, weird.

I suspect that the assignment was handled with the "small struct 
optimization", i.e. field-by-field rather than lazily (cf. 
RegionStoreManager::tryBindSmallStruct).

Could you do a State->dump() to verify that? If it shows that there's no 
default binding but instead there are two derived symbols bound to two 
different offsets, then the information about the "whole struct symbol" 
is already more or less lost. The static analyzer no longer remembers 
that this whole structure is the same as pos1, but it does remember that 
each of its fields is exactly the same as it was in pos1, which is what 
you see by looking at the fields separately.

Generally we don't have many checkers that track structures as a whole 
and we don't really know what the checker API *should* look like in 
order to make such checkers easy to implement. The only such checker 
that we have is IteratorChecker and it kinda tries to do something but 
it's not very convenient. For C++ objects I'm thinking of tracking a 
"whole structure symbol" artificially, so that it wouldn't have anything 
to do with the actual contents of the structure but rather with its 
semantic meaning. It would be preserved by const operations (even if 
they mutate memory contents of mutable fields) and through copies/moves, 
and you would additionally be able to attach state traits to it without 
thinking about manually modeling copies/moves.

I guess in your case, which seems to be more like a C world, the ad-hoc 
solution would be to do something like

     let's see...
     pos2.x comes from pos1...
     pos2.y also comes from pos1...
     aha, got it!
     the whole pos2 comes from pos1!

You will *anyway* have to do this because the programmer is free to copy 
the structure field-by-field manually instead of just assigning the 
structure. This would also happen in C++ if the structure has a 
non-trivial constructor. For the same reason it's not enough to check 
only 'x' but skip 'y': the programmer can easily overwrite one field but 
not the other field.
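
As a rough illustration of that field-by-field reasoning, here is a toy 
model in plain C++. It does not use the analyzer's classes: 
ToyDerivedSymbol, ToyFieldBindings and wholeStructParent are made-up 
names that only mimic the idea of derived symbols that remember their 
parent symbol. A real checker would presumably gather the same 
information via getBinding() on FieldRegions and getParentSymbol(), so 
treat this as a sketch of the logic, not of the API.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy stand-in for a derived symbol (hypothetical, not a Clang class).
struct ToyDerivedSymbol {
  int ParentSymbolId;    // id of the conjured "whole struct" symbol
  std::string FieldName; // field of the source struct it was derived for
};

// Field name -> value bound to that field. std::nullopt models a field
// whose value is not a derived symbol (e.g. overwritten with a constant).
using ToyFieldBindings =
    std::map<std::string, std::optional<ToyDerivedSymbol>>;

// Returns the parent symbol id if every field holds a symbol derived from
// the same parent for the same-named field; otherwise std::nullopt.
std::optional<int> wholeStructParent(const ToyFieldBindings &Fields) {
  assert(!Fields.empty() && "expects at least one field");
  std::optional<int> Parent;
  for (const auto &[Name, Value] : Fields) {
    // The field was overwritten, or copied from a different field.
    if (!Value || Value->FieldName != Name)
      return std::nullopt;
    // The fields come from different source structures.
    if (Parent && *Parent != Value->ParentSymbolId)
      return std::nullopt;
    Parent = Value->ParentSymbolId;
  }
  return Parent;
}
```

The function only answers "the whole struct comes from that parent" when 
all fields agree, which is why checking 'x' alone is not sufficient.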

Finally, I'm surprised that it returns an UndefinedVal (i.e., in 
particular, it allows you to unwrap the Optional) instead of None. This 
sounds like a bug. But it might be because the structure does indeed 
have an undefined default binding (e.g., this happens when it's 
allocated by malloc() or operator new). It'd make sense because 
assigning every field wouldn't overwrite the default binding. This, in 
turn, should remind you that relying on the "structure symbol" to figure 
out what the contents of the structure are is not a good idea unless 
your structure is immutable and completely opaque, or you somehow know 
that it's freshly created. Direct bindings to fields, on the other hand, 
are always trustworthy. That's how our memory model works.
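
To make the default-vs-direct binding point a bit more concrete, here is 
a deliberately simplified model in plain C++. ToyStore and lookupField 
are hypothetical names, and values are plain strings. The real store is 
keyed by region and offset and holds SVals, so this only sketches the 
lookup order, under the assumption that a direct binding takes priority 
over the default one.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Toy model only. ToyStore is a hypothetical stand-in for the analyzer's
// store; it is not how RegionStore is actually implemented.
struct ToyStore {
  // Base region name -> default binding ("Undefined" or a symbol name).
  std::map<std::string, std::string> Default;
  // (base region name, field name) -> value bound directly to that field.
  std::map<std::pair<std::string, std::string>, std::string> Direct;

  // A direct binding always wins; the default binding is only a fallback
  // for fields that were never bound individually.
  std::string lookupField(const std::string &Base,
                          const std::string &Field) const {
    assert(!Field.empty() && "field name must not be empty");
    auto DirectIt = Direct.find({Base, Field});
    if (DirectIt != Direct.end())
      return DirectIt->second;
    auto DefaultIt = Default.find(Base);
    if (DefaultIt == Default.end())
      return "Unknown";
    if (DefaultIt->second == "Undefined")
      return "Undefined";
    // Model a symbol derived from the default-bound "whole struct" symbol.
    return "derived{" + DefaultIt->second + "," + Field + "}";
  }
};
```

In this model, a malloc()-like region with Default set to "Undefined" 
and both fields bound directly still reports "Undefined" as its default 
binding, while lookupField() returns the directly bound field values.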


On 6/25/19 9:10 PM, Torry Chen wrote:
> Thank you Artem! It seems StoreManager::getDefaultBinding() won't work 
> if the struct variable is copied. As shown below, getDefaultBinding() 
> returns an undefined SVal.
>
> I could go down into fields to get the derived symbols for X and Y 
> respectively, and then use getParentSymbol() to get the symbol for the 
> whole struct. This looks cumbersome though. Is there a more convenient 
> way to get the symbol for the whole struct in this case?
>
> // checkBind: pos1 -> conj_$3{struct XY, LC1, S45418, #1}
> struct XY pos1 = next_pos(10, 20);
>
> // checkBind: pos2 -> lazyCompoundVal{0x5d4bb38,pos1}
> struct XY pos2 = pos1;
>
> move_to_pos(pos2);
>
> /** evalCall for move_to_pos():
>   SVal Pos = C.getSVal(CE->getArg(0));
>   ProgramStateRef State = C.getState();
>   StoreManager &StoreMgr = State->getStateManager().getStoreManager();
>   auto LCV = Pos.getAs<nonloc::LazyCompoundVal>();
>   SVal LCSVal = *StoreMgr.getDefaultBinding(*LCV);
>   LCSVal.dump(); // <- Undefined
>   ...
>   const Store St = LCV->getCVData()->getStore();
>   const SVal FieldSVal =
>       StoreMgr.getBinding(St, loc::MemRegionVal(FieldReg));
>   FieldSVal.dump();
>   // <- derived_$4{conj_$3{struct XY, LC1, S45418, #1},pos1->X}
>
>   const auto *SD = dyn_cast<SymbolDerived>(FieldSVal.getAsSymbol());
>   const auto ParentSym = SD->getParentSymbol();
>   ParentSym->dump(); // <- conj_$3{struct XY, LC1, S45418, #1}
> **/
>
> On Tue, 25 Jun 2019 at 14:06, Artem Dergachev <noqnoqneo at gmail.com>
> wrote:
>
>     The "0x4aa1c58" part of "lazyCompoundVal{0x4aa1c58,pos1}" is a
>     Store object. You can access it with getStore() and then read it
>     with the help of a StoreManager.
>
>     Hmm, we seem to already have a convenient API for that, you can do
>     StoreManager::getDefaultBinding(nonloc::LazyCompoundVal) directly
>     if all you need is a default-bound conjured symbol. But if you
>     want to lookup, say, specific fields in the structure (X and Y
>     separately), you'll need to do getBinding() on manually
>     constructed FieldRegions (in your case it doesn't look very useful
>     because the whole structure is conjured anyway).
>
>     I guess at this point you might like chapter 5 of my old
>     workbook
>     (https://github.com/haoNoQ/clang-analyzer-guide/releases/download/v0.1/clang-analyzer-guide-v0.1.pdf),
>     as for now it seems to be the only place where the different
>     kinds of values are explained.
>
>
>     On 6/25/19 2:35 AM, Torry Chen via cfe-dev wrote:
>>     My project has a struct type as follows and I'm writing a checker
>>     for some functions that take the struct value as an argument. In
>>     the checkPreCall function I see the argument is a
>>     LazyCompoundVal, not a symbol as it would be for a primitive
>>     type. I tried a few ways to extract the symbol from the
>>     LazyCompoundVal with no luck. Hope to get some help here.
>>
>>     struct XY {
>>       uint64_t X;
>>       uint64_t Y;
>>     };
>>
>>     ...
>>     // checkBind: pos1 -> conj_$3{struct XY, LC1, S63346, #1}
>>     struct XY pos1 = next_pos(...);
>>
>>     // checkPreCall: Arg0: lazyCompoundVal{0x4aa1c58,pos1}
>>     move_to_pos(pos1);
