[cfe-dev] Fwd: -Wunreachable-code and templates

Wed Nov 30 08:13:54 PST 2011

Hi Ted - thanks for the reply (given that you mentioned this
previously, I thought you were likely to have an informed opinion on
the subject & wanted to get that before wading out too deep into the
code)

> My concern about approach #1 is that not all analyses are savvy to analyze templates directly, and the CFG itself will be incomplete.  The CFG contains destructor calls and possibly other entities whose specifics are only known when the template is actually instantiated (e.g., a template parameter 'T' may expand to something with a destructor, or it may expand to something like 'int' where no destructor is involved).  The result is that two instantiations may have very different CFGs.  Further, I don't think most analyses would be savvy to reason about such "abstract definitions" of functions.  Uninstantiated templates are really more syntax than semantics, and analyses rely on precise semantic information.

Makes sense - I had that concern at a very vague level, not knowing
all the particulars about how Clang represents template patterns, so
thanks for clarifying/making that more concrete.

> My concern about #2 is twofold.  First, it is potentially very costly.  If you analyze instantiated templates as they stream by, you potentially have to retain all the analysis information in order to fuse them up later to reach the final conclusions.  With a naive implementation, the space overhead of the analysis could grow linearly with the number of instantiations, while a good implementation would incrementally merge the new analysis results for each instantiation into the old one, and then once we finished parsing the translation unit reach the final decisions.  This sound potentially really complex, but I suppose it could be tractable since only few analyses would possibly care about taking this extra step.

For what it's worth, the idea I had was to do it incrementally -
essentially having to walk the template pattern side-by-side with the
instantiation (which, given (1) would be more work than I anticipated,
certainly - having to account for tolerable differences in the CFG
between pattern and instantiation) & only keep the template pattern's
CFG across multiple function definitions. So memory cost linear on the
number of templates (because you'd never know if there might be
another instantiation coming up later)

> My second concern about #2 is that I don't think it is technically correct.  Even if you analyze all N instantiations of a function template, do the dataflow analysis, and conclude that from those N instantiations that a given piece of code was dead, that doesn't mean that if you did a N+1'th instantiation that was different from the others that the block would be guaranteed to be dead.  Basically, proving that a block is dead for the N instantiations in the way you propose doesn't prove that the block is dead in general.

My naive thinking was that if we've seen all the instantiations &
something still isn't covered, then it is dead code - but you're
right; Perhaps it's a library that a given translation unit only uses
one or two instantiations of and another TU reaches the apparently
'dead' code. Scratch that, then.

> The idea I had to tackle this was a bit simpler, although it could certainly be flawed.  Looking at the results of -Wunreachable-code that you mentioned, the reason we report this code as unreachable is because the edges for the branch conditions in the CFG are intentionally pruned as being trivially satisfiable/unsatisfiable.

Right, this hinges on the
(anon)::CFGBuilder::tryEvaluate/tryEvaluateBool functions (in turn on
Expr::EvaluateAsRValue/EvaluateAsBooleanCondition).

> Instead of just chopping off the edges completely, we can retain all edges, but mark them as being unreachable for various reasons, e.g. value dependent because of template instantiations or expanded from a macro.

So you're suggesting that those try/Evaluate functions would return
this extra information as part of their tree walk evaluation? "true,
but it's also value dependent", "true, but depends on a macro
expansion"

This was the first puzzle I hit upon when I was looking through this
code - the type/value dependence checks in tryEvaluate/tryEvaluateBool
weren't 'working' (they never fired, even with an assert) because
templates weren't visited, only instantiations - so I was wondering if
I could just simply check the Expr to see if it was an instantiation
of a type/value dependent expression, but that information wasn't
available (which is why I got talking to Chandler & Richard on IRC
about whether it could be retrieved, etc). But if we're going to
evaluate the expression anyway (for other clients) then it shouldn't
be hard to record the extra fact that while doing so we had to rely on
template or macro expansions (at least the template case has a
specific expression node type - I haven't looked at how I'd detect the
macro case yet).

> If we retain this information, what you are left with is a CFG that contains both pruned edges (the default view for most clients) but also the "hidden" edges for those clients interested in the complete control-flow as implied by the AST.  There are already some clients that care about getting both the "unoptimized" and "optimized" CFG; this enhancement would collapse the two into the same data structure.  -Wunreachable-code for example could enhance its walk of the CFG to take special attention to these hidden edges and handle them accordingly.  The nice thing about this approach is that it helps marginalize out overly optimistic conclusions on reachable code that were made based on the specific template instantiations that you saw.
>
> The other nice thing about this approach is that it is very simple.  All CFGBlocks currently record a vector of predecessors and successors (each represented as a CFGBlock*), with iterators over them.  We could change those vectors to contain bitmasked pointers representing real edges and pruned edges, with special bits to indicate why an edge was pruned.

What kind of granularity did you have in mind for this bitmask? Are
there reasons we would need to store the reason for a particular
branch being trivially unsatisfiable (macro, type dependence, value
dependence)? Or just that it is. (& then another flag for the
optimized/unoptimized case you mentioned already exists)

> We could then add a new successor/predecessor iterator that just skips the hidden edges (the default view for most clients) but then provide an alternate iterator that viewed the hidden edges as well (and means to query them).

Would it make sense, possibly, to just have the iteration itself
accept a query in the form a bitmask of the type of edges to visit, or
would that be too heavyweight? (it could be a template parameter to
the iterator rather than a runtime one, I suppose - the default value
of the argument being the "NoHidden" flag to get the old/current use
case)

> What do you think?

Makes sense so far (we'll see what happens when I try to implement it
- I might have some more questions).

- David