[cfe-commits] [scan-build][PATCH 1/1] Unreachable code checker : false positive in switch statements ?

Wed Feb 29 09:01:20 PST 2012

On Tue, Feb 28, 2012 at 10:53 PM, Ted Kremenek <kremenek at apple.com> wrote:
> On Feb 28, 2012, at 10:29 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
> Should we just change the way the CFG is built to include edges for
> default labels on covered switches all the time?
>
>
> Hi David,
>
> In short, no.  This wouldn't be generally the best solution for all clients.
>
> I think have a better solution (see below).
>
> That would fix the
> same bug (I'm on the fence about whether it is a bug, myself) in
> Clang's -Wunreachable-code.
>
> Presumably we're actually doing work/have code designed to check
> covered switches & skip the default edge - we could just remove that
> code I would imagine.
>
> (one possible problem: that would then mean the default case is
> considered reachable - for reachable-based warnings in
> AnalysisBasedWarnings. Do these default cases fall into the special
> "neither provably reachable nor unreachable" middle ground (like the
> unreachable code in template instantiations that isn't unreachable in
> the pattern, or code that's unreachable on the basis of sizeof
> expressions, macros, etc))
>
>
> I think I've mentioned before my idea of leaving in "ghost edges" in the CFG
> that are dead but want to be retained by some clients.  For example,
> analyses like -Wuninitialized or the static analyzer would largely want to
> treat ghost edges as being not there for maximum dataflow precision.  Other
> analyses, such as -Wunreachable-code, may want to include them in their
> analysis.  Having ghost edges allows us to construct one CFG that can serve
> multiple clients, and at one extreme allows us to represent both the
> unoptimized and optimized CFG simultaneously in the same data structure.

Yep - that's exactly what I've implemented (first pass, a bit hackish,
but the basic idea) in the code I sent out to cfe-dev (& you) for
fixing sizeof-based conditional unreachable code false positives.

See the thread entitled "sizeof conditions and -Wunreachable-code"

> Ghost edges would be easy to implement.  Currently CFGBlocks have a vector
> of successors and predecessors CFGBlock*'s.  Pruned out edges are explicitly
> NULL CFGBlock pointers.  A ghost edge could be a bitmasked pointer, which
> could indicate that the edge is dead, but provides extra breadcrumbs for
> analyses.  We could modify the iterators to handle ghost edges for most
> clients using the currently observed behavior (i.e., treat all ghost edges
> as dead), but provide a more general iterator that allows one to interrogate
> the individual ghost edges when desired.

& exactly that - a PointerIntPair<CFGBlock*, 1> - works pretty well,
only one minor issue since the pointers (& thus iterator dereferences
over pred/succ) aren't assignable (& even if they were, existing
clients using the 'old' bit-flag-unaware APIs wouldn't know how to
persist/move the flag along with the pointer) so I had to implement an
explicit 'reverse' op (rather than just reverse(pred_begin(),
pred_end()) over the CFGBlock* iterators). I'm also not sure what to
name that boolean flag - currently for the sizeof case I'd named it
"MayVaryByBuild" - but perhaps there's some more generic term that can
describe edges with such flags as being "optimistically navigable" or
somesuch. Open to suggestions as I mentioned in the email/review.

> Getting back to -Wunreachable-code and the static analyzer checker,
> unfortunately I think just hacking the CFG is not a general solution to
> solve both of their similar concerns.  While -Wunreachable-code could
> benefit from ghost edges, the static analyzer path-sensitive engine would
> first prove that blocks are unreachable by doing its own path-sensitive
> dataflow analysis on the CFG abstraction that *it* wants to work with.

Sorry, I'm a little unclear here - you used the word "would" in
regards to the SA's behavior. Are you saying that currently
-Wunreachable-code and the SA checker use the same logic, but,
ideally, will need different implementations as they are improved?

Presumably the SA should make judgments on values - is this enum value
in the range of valid enums (or are we going to guess that it is, for
diagnostic benefit) or only in the range of the actual enumeration
constants? Actually, perhaps it'll need the same concept as
-Wunreachable-code, except ghost /values/ rather than ghost /edges/.
So that you can assume an enum function parameter (in the absence of
interprocedural analysis) is in the smaller set for uninitialized
warnings and the larger set for reachability warnings.

>  It
> is then up the checker to decide whether or not to prune specific warnings.
>  For those, it could use ghost edges as well, just like -Wunreachable-code,
> to do its post-process analysis to prune false positives (or rather warnings
> people don't care about), but the analysis won't be directly the same.  Make
> sense?

I'm a little confused as to why this would be a post-processing
"Remove certain diagnostics" rather than a function of how the CFG
itself is built or, more likely, how the unreachable code is
discovered (eg: path-sensitive data analysis used to find unreachable
code rather than just walking the graph & only evaluating constant
expressions - the algorithm finds the right bits for the right
clients, rather than finding a bunch & then dropping them by using
another analysis on top of that). But I feel like I'm perhaps missing
something in my understanding there.

- David