[cfe-commits] r63347 - in /cfe/trunk: include/clang/Analysis/PathSensitive/BasicValueFactory.h include/clang/Analysis/PathSensitive/MemRegion.h include/clang/Analysis/PathSensitive/SVals.h lib/Analysis/MemRegion.cpp lib/Analysis/RegionStore.cpp l
Ted Kremenek
kremenek at apple.com
Mon Feb 2 13:21:51 PST 2009
Zhongxing,
I'm going to respond to your comments out-of-order.
On Jan 30, 2009, at 1:36 AM, Zhongxing Xu wrote:
<SNIP>
> It saves much trouble later when p is used as array base.
>
> If we do not make that ElementRegion when doing cast, it is hard to
> distinguish the case for code:
>
> struct s* p = (struct s*) alloc(sizeof struct s);
> struct s v = p[1];
>
> For this case the region for p should not be used as the base region
> for element region.
>
> My point is that we make an ElementRegion in the first case and an
> AnonTypedRegion for the second case, and we can only do the right
> thing when doing cast, but not later when we getLValueElement for
> p[1].
>
> Then when we getLValueElement, we abandon (emit warning or retun
> unknown) when the base region is not an ElementRegion.
I honestly don't think we can have the invariant that getLValueElement
is always invoked on a MemRegion that is an ElementRegion. Consider
the following examples:
Example 1:
int foo(int *x) {
return x[0];
}
In Example 1, 'x' binds to a loc::SymbolVal. The region associated
with that symbol is a SymbolicRegion (let's call it SymR). How do we
ensure the an ElementRegion whose base is SymR and index 0 is passed
to getLValueElement? Honestly, what does this buy us? If we pass a
non-ElementRegion to getLvalueElement, it's a synonym for
ElementRegion(SymR, 0).
Example 2:
int bar() {
char x;
char *p = &x;
return p[0];
}
In this case we have a VarRegion for x (let's denote that
VarRegion(x)). After the second line 'p' binds to
loc::MemRegionVal(VarRegion(x)). This value is then passed to
getLValueElement. Essentially this is the same as passing
ElementRegion(VarRegion(x), 0).
I'm sure we can come up with many more cases like this. What does
creating an ElementRegion (with base 0) before calling
getLValueElement buy us? This logic could certainly be done in
GRStateManager, just before it calls StoreManager::getLValueElement.
Maybe that would simplify some things.
Creating an ElementRegion when we do the cast from the return value of
alloca() to another type doesn't make much sense to me at all. When
we see code like the following, 'p' binds to the *entire*
AnonTypedRegion, not just a specific element:
char *p = (char*) alloca(1000);
I just semantically adds an unnecessary veneer to have 'p' bind to
ElementRegion(AnonTypedRegion(AllocaRegion(...)), 0) instead of just
AnonTypedRegion(...). I argue that this veneer buys us nothing since
we have to handle all the other cases I mentioned above anyway in some
uniform fashion.
>
> As the rabbit hole goes arbitrary deep here, I suggest we abandon
> early when we getLValueElement for wonky[1]. For this example, I
> think the region of _wonky_gesticulate_cheese should be the brother
> but not the parent of the region for wonky[1].
I'm not certain what you mean by the "brother." We don't have that
notion (yet) in MemRegion. Can you elucidate? What would this buy
us? What would be the parent of
ElementRegion(_wonky_gesticulate_cheese, 1) (i.e., "wonky[1])?
> But when we getLValueElement for wonky[1], we actually use the
> region of _wonky_gesticulate_cheese as the super region for the
> element region.
I'm not certain what this dual view buys us. The variable 'wonky' has
the region 'VarRegion(wonky)'. That region binds to the location of
_wonky_gesticulate_cheese. The lvalue of 'wonky[1]' evaluates to
ElementRegion(_wonky_gesticulate_cheese, 1). Wouldn't the parent of
the later just be _wonky_gesticulate_cheese in all cases?
> Then it goes to my original argument for guaranteeing all base
> regions passed to getLValueElement() to be ElementRegions, since we
> do not have enough information in getLValueElement() for us to make
> the right decision.
I feel like I'm missing something here. What information are we
missing to make the right decision?
Put another way, what problem are you trying to solve by imposing the
precondition to getLValueElement that it is always passed an
ElementRegion? As I said earlier, we can enforce this precondition by
adding some wrapper logic in GRStateManager, but all that would be is
code refactoring. From my understanding of things now, all of the
information necessary to make the "right" decision can be done in
getLValueElement.
> For code
>
> char * p = (char*) alloc(4);
> char c = p[1];
>
> we make p points to an element region when we do the cast.
I honestly feel that special casing this example buys us nothing
(given the examples I provided above), and only adds complexity and
extra space overhead.
Don't get me wrong, I think we need some nothing of an "extended" type
system when doing analysis. Some casts could be deemed "okay" and
others not so much. Unions represent a good example where memory can
be reused for different data types, so we're just going to have to
make some choices at some point of how to handle pieces of memory that
are used in a conflated manner.
More information about the cfe-commits
mailing list