[cfe-dev] [PATCH] expand a symbolic pointer
Ted Kremenek
kremenek at apple.com
Wed Dec 10 21:54:36 PST 2008
On Dec 10, 2008, at 8:06 PM, Zhongxing Xu wrote:
> Thanks for your comments. They were helpful. Now the interfaces get
> simplified a lot. Variables and functions are commented well.
>
> Here I explain the main idea.
>
> Some struct base is not mutable, e.g., struct s a; a.f = 3; Here the
> struct base 'a' is not mutable. But some can, e.g.,
>
> void foo(struct s* p) {
> p->f = 3
> }
>
> When we getInitialStore(), p is initialized to have value
> loc::SymbolVal, which I call a symbolic pointer. (Note: here no
> symbolic region is involved).
>
> In this patch we try to concretize the struct pointer p. We have
> several preconditions for this concretization:
> - the struct base is a pointer
> - the pointer is symbolic
> - the pointer is an lvalue, so that it can be modified.
>
> When concretizing the struct base pointer, we pass the pointer's
> location to StoreManager. Therefore the pointer should be an lvalue
> and has a location. We get this location by calling
> GRExprEngine::getBasePtrLoc(). This method only handles a few cases.
> In other cases, BasePtrLoc is left Undefined intentionally.
Hi Zhongxing,
The patch is looking great, but I still have some concerns about the
overall approach. Consider your example with one modification:
void foo(struct s* p) {
struct s* q = p;
p->f = 3
}
After evaluating 'p->f' p will bind to an AnonPointeeRegion, but 'q'
will bind to the original symbol. They are aliases. It gets
arbitrarily worse if the value of 'p' is stored in other places,
passed to functions, etc.
This overall approach feels a little "destructive." We're losing
information here. After evaluating 'p->f' we've lost the fact that
'p' *still* binds to the value at the entry to the function.
I see this as being fundamental problematic for four reasons:
(1) Different analyses may attach information (perhaps using the GDM)
to a symbolic value. Actually replacing a symbolic value with
something "concrete" loses that information.
(2) Consider function summaries. We don't have them yet, but
basically we would build up a set of constraints and effects in terms
of the input to the function. That input is the symbolic values.
Even if we "concretize" them we need to keep them around.
(3) Aliasing (i.e., the example I gave). Concretization via replacing
loses this critical information.
(4) The BugReporter "module" needs to reconstruct information from the
ExplodedGraph, including when aliasing relationships come into effect,
etc. Blowing away the original value of 'p' makes this difficult at
best.
The only way to replace a symbolic value with a concrete value is to
alpha-rename it throughout the entire state. This isn't trivial,
especially when one considers the GDM.
I honestly think we need to take a "layered" approach. Keep a map on
the side from "symbols" to concrete values. That's basically what we
do in ConstraintManager when it comes to equality relationships with
integers. Equality relationships with integers is a form of
concretizations; that's all were basically doing here, except we're
reasoning about the concretization of the heap.
What do you think? I have specific comments about your patch (which
is looking great!) but I will defer them until we iterate more on this
design point.
Ted
More information about the cfe-dev
mailing list