[cfe-dev] new -Wuninitialized implementation in Clang

Fri Feb 4 10:54:30 PST 2011

I asked David Li (one of our gcc people) about gcc's -Wuninitialized
warning, and here are his answers:

On Thu, Feb 3, 2011 at 6:35 PM, Ted Kremenek <kremenek at apple.com> wrote:
> Hi fellow Clangers,
>
> During the last two weeks, I have been working on implementing -Wuninitialized in Clang.  Clang currently doesn't implement this warning, which is a glaring feature deficiency when compared with GCC.
>
> Unlike GCC's implementation, which is based on the backend optimizer, Clang's implementation is based on dataflow analysis in the frontend.  This means the warning still works at -O0.
>
> Now that there is a prototype of this feature in TOT Clang, I wanted to open up the list to general discussion of this feature, its deployment, and what expectations users should have.
>
> For some background, because GCC's implementation of -Wuninitialized is based on the optimizer, the results of the warning can differ depending on the flags passed to the compiler.  For example:
>
> (1) The warnings can vary depending on the optimization level selected.
>
> (2) The warnings can vary depending on the target architecture.
>
> (3) The warnings can vary depending on which version of GCC you are using.
>
> While I am not 100% certain, I also suspect that GCC's implementation is not completely sound, and will not flag warnings in some cases.  My hypothesis is that this is done to avoid spurious false positive warnings, leaving the impression that the analysis is smarter than it actually is (while possibly missing real issues).

It is true that gcc's warning is not sound -- i.e., it has false
negatives, but they are very rare. However, this is not because gcc
uses heuristics to suppress possibly spurious warnings, but because
some optimizations (e.g., constant propagation) may make a warning go
away. These are in fact bugs, not features. Again, they are rare, and
gcc's uninit analysis by itself is sound -- no false negatives. Note
that the soundness claim is about non-aliased scalar variables. If the
variable is aliased through indirect accesses (may-alias) or its
address is exposed to function calls, no intra-procedural analysis is
sound.

[He means that gcc will miss warnings when a variable may be
initialized inside a function call, but that it always errs on the
side of false positives for variables whose addresses don't escape.]
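
As a rough illustration of that distinction, here is a minimal C
sketch. The function names (parse_first, local_only, address_exposed)
are hypothetical and chosen only for this example; nothing here is
taken from gcc's or clang's test suites, and no claim is made about
which compiler version warns on which function.

```c
#include <stddef.h>

/* Hypothetical helper: writes *out only when the input is non-empty. */
static int parse_first(const char *s, int *out)
{
    if (s == NULL || *s == '\0')
        return 0;          /* *out is left untouched */
    *out = s[0] - '0';
    return 1;
}

/* Non-aliased scalar: every write and read of 'x' is visible in this
   function, so an intra-procedural analysis can reason about it. */
int local_only(int flag)
{
    int x;
    if (flag)
        x = 1;
    else
        x = 2;
    return x;              /* assigned on every path */
}

/* Address-exposed scalar: whether 'x' gets written depends on what
   parse_first() does, which an analysis limited to this function
   body cannot see. */
int address_exposed(const char *s)
{
    int x;
    if (!parse_first(s, &x))
        return -1;         /* 'x' is never read when nothing was written */
    return x;
}
```

In address_exposed(), an analysis that stays inside one function has to
guess: either assume the call initializes 'x' (and risk a false
negative) or assume it does not (and risk a false positive).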

> My design goals for -Wuninitialized were fourfold:
>
> (a) Works under -O0.  Users care about seeing these warnings the most when they are doing their debug builds.
>
> (b) Has consistent results that are invariant of the optimization flags, target architecture, phases of the moon, etc.
>
> (c) Provides predictable results that are (for the most part) sound and complete.
>
> (d) Has marginal impact on compile time.
>
>
> The last three goals mean that the analysis can only do limited reasoning about control-dependencies, e.g.:
>
>  int x;
>  ...
>  if (flag)
>    x = ...
>  ...
>  if (flag)
>    use(x);
>
> Analyzing this code correctly requires path-sensitive analysis, which inherently has exponential cost in the general case.  There are tricks that mitigate this algorithmic complexity for some common cases, but handling these control-dependencies in general is really in the purview of the static analyzer.  Amazingly, GCC often doesn't flag warnings in such cases, but I suspect that this is because GCC silently drops warnings in some cases where it deems it can't reason precisely about a given variable.
>

False -- gcc does not silently drop warnings -- that is evil.
Instead, gcc has predicate-aware analysis. Gcc's uninit analysis by
itself is sound.
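
To make the control-dependency case concrete, here is a compilable
version of the quoted fragment, offered as a sketch only. The function
names and the value 42 are made up for illustration. A path-insensitive
analysis merges the two branches of the first 'if' and sees a
possibly-uninitialized read; an analysis that tracks the guarding
predicate can notice that the write and the read are controlled by the
same unmodified 'flag'.

```c
/* Hypothetical example: the write and the read of 'x' are guarded by
   the same condition, so 'x' is never read uninitialized at run time. */
int correlated(int flag)
{
    int x;                 /* deliberately left uninitialized */
    int result = 0;

    if (flag)
        x = 42;

    /* ... unrelated work that does not modify 'flag' ... */

    if (flag)
        result = x;        /* only reached when 'x' was assigned above */

    return result;
}

/* The same logic with an explicit initializer, which leaves nothing
   for either style of analysis to report. */
int correlated_initialized(int flag)
{
    int x = 0;
    int result = 0;

    if (flag)
        x = 42;

    if (flag)
        result = x;

    return result;
}
```

Both functions return the same values for every input; the only
difference is how much reasoning the compiler needs in order to stay
quiet.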

> My proposal is that Clang's analysis errs on the side of producing more warnings instead of worrying about such control-dependencies.  This means that for the above code example Clang would emit a warning, even when no use of an uninitialized variable is possible.  My rationale is twofold:
>
> (a) The cost of initializing a variable is usually minuscule.

True about the cost -- and it can possibly be optimized further by the
compiler (sinking etc).

False about people's tolerance level for false positives. When
there are too many of them, people simply ignore all of them.

However, this might not be a problem for clang's FE-based
implementation, because most of the false positives are actually
exposed by inlining, which gcc has to deal with. The downside (a clear
drawback) of clang's FE-based implementation is that it may have too
many false negatives for aliased scalars.

> (b) Users get predictable results, and the compiler doesn't play games when deciding when to emit a warning in the face of control-dependencies that it cannot reason about.
>

Gcc does not play games on this -- but we certainly talked about it
when being accused of too many false positives.

> *** Question #1 ***
>
> Is this a reasonable level of behavior we can set for this warning that users will accept?  I have received reports from Nico and Chandler that initially this warning produced copious warnings on Chrome, but my understanding now is that the number of warnings is down to 11 (which seems quite manageable to me, given that 2 of the issues were cases that GCC didn't flag).
>
> As a bonus, Clang's warnings also include Fixit hints to the user on how to initialize the variable to silence the warning.  With the proper editor support, I think responding to Clang's warnings requires minimal effort from the developer.
>
> For examples of what Clang's -Wuninitialized warns about, we have several tests available now in the clang test suite:
>
>  test/Sema/uninit-variables.c
>  test/SemaObjC/uninit-variables.m
>  test/SemaCXX/uninit-variables.cpp
>
> *** Question #2 ***
>
> Another point I want to raise (which was brought up by Chandler) is whether or not uninitialized variables checking should continue to be under the -Wuninitialized flag?  If we go with the behavior that I propose, we are deviating from GCC's behavior with a more "sound" (but noisy) analysis.  Since several checks in clang are bundled under the -Wuninitialized flag, there may be some argument to splitting these under separate flags.
>
> Is there a general desire to do this?  If so, why?
>
> My personal feeling is that things should be kept simple: keep a single -Wuninitialized flag that turns on all uninitialized variables checking.  Users can then either suppress the warning by initializing their variables (which I argue is cheap and overall good for code cleanliness) or use pragmas to shut up the warning entirely in regions of their code.  The latter may be a gross solution, but it is there for those who want it.
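
For reference, the two ways of silencing the warning mentioned above
might look like the sketch below. It assumes the pragmas meant are
Clang's "#pragma clang diagnostic" family; the function names are
hypothetical, and the __clang__ guard is only there so other compilers
do not see pragmas they don't know.

```c
/* Option 1: disable -Wuninitialized for a region of code. */
#ifdef __clang__
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wuninitialized"
#endif

int legacy_lookup(int flag)
{
    int x;                 /* left uninitialized on purpose */
    if (flag)
        x = 7;
    if (flag)
        return x;          /* read is guarded by the same condition */
    return -1;
}

#ifdef __clang__
#pragma clang diagnostic pop   /* restore the previous warning state */
#endif

/* Option 2: initialize the variable, which needs no pragma. */
int lookup_initialized(int flag)
{
    int x = 0;
    if (flag)
        x = 7;
    if (flag)
        return x;
    return -1;
}
```

The push/pop pair keeps the suppression local, so code after the
region is still checked.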
>
> Thoughts?  Comments?
>
>
>
> Cheers,
> Ted