[cfe-dev] Allowing checkers to mark symbols as live

Mon Aug 9 12:31:04 PDT 2010

Right, here's the use case in CStringChecker. I guess I should have
explained this first.

1. Look up the strlen of some string.
2. Conjure a symbol to represent it if we don't have one.
3. At the end of the statement, symbols are cleaned. Any information about
the strlen is now lost.
4. Look up the strlen of the same string. (!)

I'll admit that marking these symbols live is a little dangerous,
especially without a callback for dead regions being sweeped. But otherwise
we can get incorrect results from cases like this:

void strlen_liveness(const char *x) {
  if (strlen(x) < 5)
    return;
  if (strlen(x) < 5)
    (void)*(char*)0; // expected-warning{{never executed}}
}

This isn't entirely contrived; if the second call to strlen is replaced
with a strcpy, we could warn about buffer overflows.

There is no reference to the symbol besides what's in the GDM. A
SymbolRegionValue is for region contents, not metadata, and a SymbolDerived
requires a symbolic parent. SymbolExtent, besides currently having a
dedicated use, has the wrong semantics in that it isn't dependent on being
at a certain place in the code.

This is intentionally before RemoveDeadBindings, because
RemoveDeadBindings wipes the constraints for a symbol, and that would
defeat the purpose here.

I guess it wasn't as simple as I thought! One possibility would be to
either change SymbolExtent or add a new symbol with path-sensitive fields
like SymbolConjured. Would that be a better solution here?

On Mon, 9 Aug 2010 10:15:51 -0700, Ted Kremenek <kremenek at apple.com>
wrote:
> Hi Jordy,
> 
> I have mixed feelings about adding this, and I think this needs strong
> justification.  Why is it needed?  What symbols do we need to keep
around
> that aren't related to live regions, or the values bound to live
regions? 
> Right now symbols can indicate that they are derived from other symbols
(or
> are related to regions), and their liveness is extended if they thing
they
> derive from is also live.  If a symbol is no longer live, why do we need
to
> continue tracking state?
> 
> My concern about adding this is that it artificially will extend the
> lifetime of symbols, causing the GRState to accumulate extra state that
> defeats state caching.  The current set of checkers already do a bad job
of
> removing stale data from the GDM for dead symbols.  This would
exacerbate
> the problem.
> 
> Also, from the way you've implemented the callback, the checkers will be
> called before the state is crawled by RemoveDeadBindings.  This means
that
> a Checker basically gets to unconditionally mark a symbol as live
> regardless of whether or not the rest of the analysis would mark the
symbol
> live.  That seems too unconditional to me.  It also extends the lifetime
of
> other derived symbols that would otherwise get pruned from the state.
> 
> On Aug 8, 2010, at 1:20 PM, Jordy Rose wrote:
> 
>> The last bit of infrastructure I'd like to add for CStringChecker: the
>> ability for checkers to mark symbols live. This lets checkers use
>> conjured
>> symbols for data that lives in the GDM, rather than in the store. This
>> doesn't affect EvalDeadSymbols.
>> 
>> I think it's pretty straightforward, but since it's another new
callback
>> I
>> figured it was worth waiting for comments before committing. (When we
>> work
>> out a general mechanism for checker caching we can make this a cached
>> callback.)