[cfe-dev] Fwd: -Wunreachable-code and templates

Wed Nov 30 00:45:22 PST 2011

Hi David,

Thanks for bringing up this topic.

My concern about approach #1 is that not all analyses are savvy to analyze templates directly, and the CFG itself will be incomplete.  The CFG contains destructor calls and possibly other entities whose specifics are only known when the template is actually instantiated (e.g., a template parameter 'T' may expand to something with a destructor, or it may expand to something like 'int' where no destructor is involved).  The result is that two instantiations may have very different CFGs.  Further, I don't think most analyses would be savvy to reason about such "abstract definitions" of functions.  Uninstantiated templates are really more syntax than semantics, and analyses rely on precise semantic information.

My concern about #2 is twofold.  First, it is potentially very costly.  If you analyze instantiated templates as they stream by, you potentially have to retain all the analysis information in order to fuse them up later to reach the final conclusions.  With a naive implementation, the space overhead of the analysis could grow linearly with the number of instantiations, while a good implementation would incrementally merge the new analysis results for each instantiation into the old one, and then once we finished parsing the translation unit reach the final decisions.  This sound potentially really complex, but I suppose it could be tractable since only few analyses would possibly care about taking this extra step.

My second concern about #2 is that I don't think it is technically correct.  Even if you analyze all N instantiations of a function template, do the dataflow analysis, and conclude that from those N instantiations that a given piece of code was dead, that doesn't mean that if you did a N+1'th instantiation that was different from the others that the block would be guaranteed to be dead.  Basically, proving that a block is dead for the N instantiations in the way you propose doesn't prove that the block is dead in general.

The idea I had to tackle this was a bit simpler, although it could certainly be flawed.  Looking at the results of -Wunreachable-code that you mentioned, the reason we report this code as unreachable is because the edges for the branch conditions in the CFG are intentionally pruned as being trivially satisfiable/unsatisfiable.  Instead of just chopping off the edges completely, we can retain all edges, but mark them as being unreachable for various reasons, e.g. value dependent because of template instantiations or expanded from a macro.

If we retain this information, what you are left with is a CFG that contains both pruned edges (the default view for most clients) but also the "hidden" edges for those clients interested in the complete control-flow as implied by the AST.  There are already some clients that care about getting both the "unoptimized" and "optimized" CFG; this enhancement would collapse the two into the same data structure.  -Wunreachable-code for example could enhance its walk of the CFG to take special attention to these hidden edges and handle them accordingly.  The nice thing about this approach is that it helps marginalize out overly optimistic conclusions on reachable code that were made based on the specific template instantiations that you saw.

The other nice thing about this approach is that it is very simple.  All CFGBlocks currently record a vector of predecessors and successors (each represented as a CFGBlock*), with iterators over them.  We could change those vectors to contain bitmasked pointers representing real edges and pruned edges, with special bits to indicate why an edge was pruned.  We could then add a new successor/predecessor iterator that just skips the hidden edges (the default view for most clients) but then provide an alternate iterator that viewed the hidden edges as well (and means to query them).

What do you think?

Ted

On Nov 30, 2011, at 12:09 AM, David Blaikie wrote:

> [sorry, wrong list - moving to cfe-dev]
> 
> I'm just wondering if anyone's already working on addressing the
> current shortcomings around -Wunreachable-code in regards to
> templates.
> 
> Take the following simple example:
> 
> 
> $ cat unreachable.cpp
> int func1();
> int func2();
> template<bool b>
> int func() {
>  return !b ? func1() : func2();
> }
> 
> int main() {
>  return func<true>() + func<false>();
> }
> $ clang++ -Wunreachable-code unreachable.cpp
> unreachable.cpp:5:15: warning: will never be executed [-Wunreachable-code]
>  return !b ? func1() : func2();
>              ^~~~~
> unreachable.cpp:9:10: note: in instantiation of function template
> specialization 'func<true>' requested here
>  return func<true>() + func<false>();
>         ^
> unreachable.cpp:5:25: warning: will never be executed [-Wunreachable-code]
>  return !b ? func1() : func2();
>                        ^~~~~
> unreachable.cpp:9:25: note: in instantiation of function template
> specialization 'func<false>' requested here
>  return func<true>() + func<false>();
>                        ^
> 2 warnings generated.
> 
> 
> I have at least two (possibly naive) ideas about how to handle this:
> 
> 1) Visit templates & skip instantiations. This is even consistent with
> some existing code (see CFG.cpp:441-ish, where the Expr *S is checked
> for type/value dependence & the condition is not evaluated if that is
> the case - in fact that's never the case because
> AnalysisBasedWarnings.cpp:821 skips analyzing any dependent context.
> Instead we could remove the check in AnalysisBasedWarnings.cpp and use
> the check in CFG.cpp (& add another check to ignore the actual
> template instantiations)). This, I believe, should remove the false
> positives with a minimum of fuss. It will also remove a bunch of
> correct positive reports from specific template instantiations (for
> example if we only instantiated func<true> above, the false case is
> unreachable but would not produce a warning)
> 
> 2) A more interesting option would be to visit the instantiations, but
> instead of reporting on the instantiation, build a CFG of the original
> template (this would require effectively inter-function analysis -
> across all the instantiations of a function template). Once all the
> instantiations have been visited we could look at any blocks still
> marked unreachable in the original template & produce warnings for
> those. My only query here would be where to store the CFG for these
> templates & when to visit them to ensure all the instantiations had
> already been visited (probably a map as a member of Sema and visit
> them at the end of sema - though these may have significant perf
> (mostly space - the time is already being spent walking each
> instantiation) impact)
> 
> So - anyone else already working on this? Existing PRs I should look
> at? Other design ideas people have already had?
> 
> Thanks,
> - David
> 
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev