[cfe-dev] new -Wuninitialized implementation in Clang

Thu Feb 3 18:35:57 PST 2011

Hi fellow Clangers,

During the last two weeks, I have been working on implementing -Wuninitialized in Clang.  Clang currently doesn't implement this warning, which is a glaring feature deficiency when compared with GCC.

Unlike GCC's implementation, which is based on the backend optimizer, Clang's implementation is based on dataflow analysis in the frontend.  This means the warning still works at -O0.

Now that there is a prototype of this feature in TOT Clang, I wanted to open up the list to general discussion of this feature, its deployment, and what expectations users should have.

For some background, because GCC's implementation of -Wuninitialized is based on the optimizer, the results of the warning can differ depending on the flags passed to the compiler.  For example:

(1) The warnings can vary depending on the optimization level selected.

(2) The warnings can vary depending on the target architecture.

(3) The warnings can vary depending on which version of GCC you are using.

While I am not 100% certain, I also suspect that GCC's implementation is not completely sound, and will not flag warnings in some cases.  My hypothesis is that this is done to avoid spurious false positive warnings, leaving the impression that the analysis is smarter than it actually is (while possibly missing real issues).

My design goals for -Wuninitialized were fourfold:

(a) Works under -O0.  Users care about seeing these warnings the most when they are doing their debug builds.

(b) Has consistent results that are invariant of the optimization flags, target architecture, phases of the moon, etc.

(c) Provides predictable results that are (for the most part) sound and complete.

(d) Has marginal impact on compile time.

The last three goals mean that the analysis can only do limited reasoning about control-dependencies, e.g.:

  int x;
  ...
  if (flag)
    x = ...
  ...
  if (flag)
    use(x);

Analyzing this code correctly requires path-sensitive analysis, which has exponential cost in the general case.  There are tricks that mitigate this algorithmic complexity for some common cases, but handling these control-dependencies in general is really in the purview of the static analyzer.  Amazingly, GCC often doesn't flag warnings in such cases, but I suspect that is because GCC silently drops warnings in cases where it deems it can't reason precisely about a given variable.
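
To make the limitation concrete, here is a minimal sketch of how a flow-sensitive (but not path-sensitive) analysis sees the snippet above.  This is an illustration only, not Clang's actual implementation; the names InitState, join, and state_at_use are hypothetical.  The point is that the analysis merges the two branches of the first "if" into a single "maybe uninitialized" state and never correlates the two tests of flag.

```c
#include <assert.h>

/* Hypothetical three-point abstraction of "is x initialized here?". */
typedef enum { INIT, UNINIT, MAYBE_UNINIT } InitState;

/* Merge the states arriving at a control-flow join point. */
static InitState join(InitState a, InitState b) {
  return a == b ? a : MAYBE_UNINIT;
}

/* Abstractly walk the snippet: int x; if (flag) x = ...; if (flag) use(x); */
InitState state_at_use(void) {
  InitState x = UNINIT;            /* int x;                            */
  InitState taken = INIT;          /* first if taken: x = ...           */
  InitState not_taken = x;         /* first if skipped: x untouched     */
  x = join(taken, not_taken);      /* merge point after the first if    */
  /* The second if (flag) is not correlated with the first one, so the
     state flowing into use(x) is simply the merged state. */
  return x;
}

/* Self-check of the sketch: the use is reached in the "maybe" state. */
void check_sketch(void) {
  assert(state_at_use() == MAYBE_UNINIT);
}
```

Under this abstraction the use of x is reached in the MAYBE_UNINIT state, which is why a warning would be emitted even though no real execution reads x uninitialized.  Proving the use safe would require tracking the value of flag along each path.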

My proposal is that Clang's analysis errs on the side of producing more warnings instead of worrying about such control-dependencies.  This means that for the above code example Clang would emit a warning, even though no use of an uninitialized variable is actually possible.  My rationale is twofold:

(a) The cost of initializing a variable is usually minuscule.

(b) Users get predictable results, and the compiler doesn't play games when deciding when to emit a warning in the face of control-dependencies that it cannot reason about.
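
As a small sketch of what that looks like in practice, here is a compilable version of the earlier snippet with the variable initialized at its declaration.  The helper use() and the function guarded_use() are hypothetical stand-ins for the elided code, and the initial value 0 is an arbitrary choice.

```c
#include <assert.h>

/* Hypothetical stand-in for use(x) in the snippet above. */
static int use(int x) { return x + 1; }

/* Both ifs test the same flag, so x is never read before it is written,
   but an analysis that does not track flag cannot prove that.
   Initializing x at its declaration removes the question entirely. */
int guarded_use(int flag, int value) {
  int x = 0;          /* the cheap initialization */
  int result = -1;    /* returned when flag is false */
  if (flag)
    x = value;
  if (flag)
    result = use(x);
  return result;
}

/* Usage sketch: behavior is unchanged on both paths. */
void demo(void) {
  assert(guarded_use(1, 41) == 42);
  assert(guarded_use(0, 41) == -1);
}
```

The extra store is typically trivial for an optimizer to remove when it is provably dead, so the runtime cost should be negligible in most code.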

*** Question #1 ***

Is this a reasonable level of behavior we can set for this warning that users will accept?  I have received reports from Nico and Chandler that this warning initially produced copious warnings on Chrome, but my understanding is that the number of warnings is now down to 11 (which seems quite manageable to me, given that 2 of the issues were cases that GCC didn't flag).

As a bonus, Clang's warnings also include Fixit hints to the user on how to initialize the variable to silence the warning.  With the proper editor support, I think responding to Clang's warnings requires minimal effort from the developer.

For examples of what Clang's -Wuninitialized warns about, we have several tests available now in the clang test suite:

  test/Sema/uninit-variables.c
  test/SemaObjC/uninit-variables.m
  test/SemaCXX/uninit-variables.cpp

*** Question #2 ***

Another point I want to raise (which was brought up by Chandler) is whether or not uninitialized variables checking should continue to be under the -Wuninitialized flag.  If we go with the behavior that I propose, we are deviating from GCC's behavior with a more "sound" (but noisy) analysis.  Since several checks in clang are bundled under the -Wuninitialized flag, there may be some argument for splitting these under separate flags.

Is there a general desire to do this?  If so, why?

My personal feeling is that things should be kept simple: keep a single -Wuninitialized flag that turns on all uninitialized variables checking.  Users can then either suppress the warning by initializing their variables (which I argue is cheap and overall good for code cleanliness) or use pragmas to shut up the warning entirely in regions of their code.  The latter may be a gross solution, but it is there for those who want it.
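
For reference, the pragma route could look roughly like the sketch below, which uses Clang's diagnostic pragmas to silence -Wuninitialized for one region.  The functions legacy_region() and use() are hypothetical, and I am assuming the compiler honors "#pragma clang diagnostic"; other compilers typically ignore the pragma, possibly with an unknown-pragma warning.

```c
#include <assert.h>

/* Hypothetical stand-in for whatever consumes x. */
static int use(int x) { return x * 2; }

/* Save the current diagnostic state, then disable the warning. */
#pragma clang diagnostic push
#pragma clang diagnostic ignored "-Wuninitialized"

/* x is deliberately left uninitialized here; it is only read on the
   path where it was assigned, which the analysis cannot prove. */
int legacy_region(int flag, int value) {
  int x;
  int result = 0;
  if (flag)
    x = value;
  if (flag)
    result = use(x);
  return result;
}

/* Restore the previous diagnostic state for the rest of the file. */
#pragma clang diagnostic pop

/* Usage sketch for both paths. */
void demo(void) {
  assert(legacy_region(1, 21) == 42);
  assert(legacy_region(0, 21) == 0);
}
```

The push/pop pair keeps the suppression local, so code after the region still gets checked.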

Thoughts?  Comments?

Cheers,
Ted