[cfe-dev] warn when ctor-initializer leaves uninitialized data?

Nico Weber thakis at chromium.org
Sat Mar 7 18:58:29 PST 2015


Since I lost some hours last week to a constructor that forgot to
initialize a certain field, I figured I'd give Matthieu's suggestion of
only warning for constructors with an empty body a shot. I couldn't get it
to work, even when I made the warning more and more restricted, using
fairly desperate heuristics in the end.

* Let's only warn on constructors with empty bodies!
* …but that warns on constructors that use member arrays as writable
scratch buffers (used in protobuf), so only do this for scalar members
* …but that warns on `ParsedInternalKey() { }  // Intentionally left
uninitialized (for speed)` constructors (e.g. leveldb), so only do this if
an explicit init sequence is present
* …but that warns on classes that intentionally initialize only 1-2 members
and keep the rest uninitialized, so only warn if a constructor list
initializes all but one member
* …but that still warns all over the place, admittedly sometimes on code
with questionable design (which however is still correct), so make it an
opt-in warning and only enable it for "good" code
* …but that warns if base class fields are protected and initialized in all
subclasses (e.g. WebKit's Vector), so only do this for private fields
* …but even "high-quality" libraries (I think Chromium's net/ library is
generally pretty good, for example) still contain many instances where some
fields aren't initialized intentionally. Some of that code does look
reasonable.

I did find some things that look like bugs during this exercise, but the
false-positive rate remains very high and I'm out of ideas to make the
warning more restricted.

I'm attaching my patch in case anyone wants to play with it (run it on LLVM
instead of Chromium, for example). It includes a few interesting test cases
too.

Nico

On Fri, Apr 27, 2012 at 11:38 AM, Patrick Beard <beard at apple.com> wrote:

>
> On Apr 23, 2012, at 11:33 AM, Patrick Beard <beard at apple.com> wrote:
>
>
> On Apr 10, 2012, at 1:22 PM, Evan Wallace wrote:
>
> I guess the warning I was envisioning was more of a stylistic warning. The
> equivalent warning from gcc is -Weffc++ which includes warnings about
> violations of several style guidelines from Scott Meyers' book "Effective
> C++, Second Edition" including this warning (always put all members in the
> initializer list). It includes a few other warnings and is helpful in
> preventing mistakes, especially for beginners, but generates a lot of noise
> when run on large existing projects and unfortunately can't be split into
> separate, fine-grained warnings. The warning in this patch is much more
> targeted but makes the choice to reject some valid code instead of silently
> ignoring invalid code. It was more targeted for beginner to intermediate
> users of C++ and not for large, highly optimized projects. Maybe this wants
> to be two warnings, one for initializer lists as it is currently and one
> that attempts to trace uninitialized members to the end of each constructor
> but doesn't warn when it isn't sure?
>
>
> Having spent a little bit of time trying to track down a bug that was
> caused by unitialized POD data members, I'd also like to see your patch
> committed as an opt-in warning.
>
>
> When I use this patch, it breaks one of the other tests:
>
> FAIL: Clang :: SemaCXX/default-constructor-initializers.cpp (3644 of 4580)
>
> ******************** TEST 'Clang ::
> SemaCXX/default-constructor-initializers.cpp' FAILED ********************
>
> Script:
>
> --
>
> /private/var/tmp/compiler.build/Release+Asserts/bin/clang -cc1
> -internal-isystem /private/var/tmp/compiler.build/Release+Asserts
> /bin/../lib/clang/3.2/include -fsyntax-only -verify
> /private/var/tmp/compiler/llvm/tools/clang/test/SemaCXX/default-constructor-
> initializers.cpp
>
> --
> Exit Code: 1
> Command Output (stderr):
> --
> error: 'note' diagnostics expected but not seen:
>   Line 54: first required here
> 1 error generated.
>
> Here's the patch I'm using:
>
>
>
> If I comment out the change to BuildImplicitMemberInitializer(), I confirm
> that the output of the default-constructor-initializers.cpp test differs by
> this line:
>
> /Trees/compiler/llvm/tools/clang/test/SemaCXX/default-constructor-initializers.cpp:54:4:
> note: implicit default constructor for 'Z1' first required here
> Z1 z1; // expected-note {{first required here}}
>    ^
>
> That is, this note appears if the change is commented out, but isn't there
> if:
>
>   // Warn if ctor-initializer for a non-POD type leaves uninitialized
> data.
>   if (!Constructor->getParent()->isPOD()) {
>     SemaRef.Diag(Constructor->getLocation(), diag::warn_uninit_member)
>       << Field->getDeclName();
>   }
>
> is included.
>
> - Patrick
>
>
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-dev/attachments/20150307/1614a0fb/attachment.html>
-------------- next part --------------
Index: include/clang/Basic/DiagnosticGroups.td
===================================================================
--- include/clang/Basic/DiagnosticGroups.td	(revision 231572)
+++ include/clang/Basic/DiagnosticGroups.td	(working copy)
@@ -377,10 +377,12 @@
 def UndefinedReinterpretCast : DiagGroup<"undefined-reinterpret-cast">;
 def ReinterpretBaseClass : DiagGroup<"reinterpret-base-class">;
 def Unicode  : DiagGroup<"unicode">;
+def UninitializedInCtor : DiagGroup<"uninitialized-in-ctor-list">;
 def UninitializedMaybe : DiagGroup<"conditional-uninitialized">;
 def UninitializedSometimes : DiagGroup<"sometimes-uninitialized">;
 def UninitializedStaticSelfInit : DiagGroup<"static-self-init">;
-def Uninitialized  : DiagGroup<"uninitialized", [UninitializedSometimes,
+def Uninitialized  : DiagGroup<"uninitialized", [
+                                                 UninitializedSometimes,
                                                  UninitializedStaticSelfInit]>;
 def UnknownPragmas : DiagGroup<"unknown-pragmas">;
 def IgnoredPragmas : DiagGroup<"ignored-pragmas">;
Index: include/clang/Basic/DiagnosticSemaKinds.td
===================================================================
--- include/clang/Basic/DiagnosticSemaKinds.td	(revision 231572)
+++ include/clang/Basic/DiagnosticSemaKinds.td	(working copy)
@@ -1532,6 +1532,11 @@
   "reference %0 is not yet bound to a value when used within its own"
   " initialization">,
   InGroup<Uninitialized>;
+def warn_uninit_in_ctor_list : Warning<
+ "constructor init list is missing initialization of %0">,
+  InGroup<UninitializedInCtor>, DefaultIgnore;
+def note_nonempty_body : Note<
+  "give constructor a non-empty body to silence this warning">;
 def warn_uninit_var : Warning<
   "variable %0 is uninitialized when %select{used here|captured by block}1">,
   InGroup<Uninitialized>, DefaultIgnore;
Index: lib/Parse/ParseDeclCXX.cpp
===================================================================
--- lib/Parse/ParseDeclCXX.cpp	(revision 231573)
+++ lib/Parse/ParseDeclCXX.cpp	(working copy)
@@ -3359,8 +3359,10 @@
   // This nested class has some members that will need to be processed
   // after the top-level class is completely defined. Therefore, add
   // it to the list of nested classes within its parent.
-  assert(getCurScope()->isClassScope() && "Nested class outside of class scope?");
-  ClassStack.top()->LateParsedDeclarations.push_back(new LateParsedClass(this, Victim));
+  assert(getCurScope()->isClassScope() &&
+         "Nested class outside of class scope?");
+  ClassStack.top()->LateParsedDeclarations.push_back(
+      new LateParsedClass(this, Victim));
   Victim->TemplateScope = getCurScope()->getParent()->isTemplateParamScope();
 }
 
Index: lib/Sema/SemaDecl.cpp
===================================================================
--- lib/Sema/SemaDecl.cpp	(revision 231573)
+++ lib/Sema/SemaDecl.cpp	(working copy)
@@ -10637,8 +10637,73 @@
 
       MarkBaseAndMemberDestructorsReferenced(Destructor->getLocation(),
                                              Destructor->getParent());
+    } else if (auto *Constructor = dyn_cast<CXXConstructorDecl>(dcl)) {
+      CXXRecordDecl *RD = Constructor->getParent();
+      // Check for initializer lists that miss a few fields.  Do this only for
+      // constructors with empty bodies to prevent false positives through
+      // constructors that do ` memset(this, 0, sizeof(*this));` or call
+      // Init() methods that do initialization.
+      // It would be more natural to do this towards the end of
+      // Sema::ActOnMemInitializers(), but the constructor's body isn't known
+      // yet at that point.
+      // XXX not the best place, not called for implicitly ctors (where this
+      // is interesting for missing in-class initializers?)
+      if (Constructor->hasTrivialBody() &&
+          !Constructor->isDelegatingConstructor() &&
+          !RD->isDependentContext() &&  // XXX or can this work?
+          // XXX do i want this?
+          Constructor->getNumCtorInitializers()) {
+
+        // Holds fields that are uninitialized.
+        llvm::SmallPtrSet<FieldDecl *, 4> UninitializedFields;
+        for (auto *I : RD->decls()) {
+          if (auto *FD = dyn_cast<FieldDecl>(I)) {
+            if (!FD->isAnonymousStructOrUnion() &&
+                !FD->getType()->isArrayType() && FD->getAccess() == AS_private)
+              UninitializedFields.insert(FD);
+          //} else if (auto *IFD = dyn_cast<IndirectFieldDecl>(I)) {  XXX
+            //UninitializedFields.insert(IFD->getAnonField());
+          }
+        }
+
+        bool AnyWritten = false;
+        for (const CXXCtorInitializer *FieldInit : Constructor->inits()) {
+          if (UninitializedFields.empty()) break;
+          if (!FieldInit->isAnyMemberInitializer()) continue;
+          //if (FieldInit->getAnyMember()->isAnonymousStructOrUnion()) continue;
+          UninitializedFields.erase(FieldInit->getAnyMember());
+
+          if (FieldInit->isWritten())
+            AnyWritten = true;
+        }
+
+        //if (AnyWritten && !UninitializedFields.empty()) {
+        if (AnyWritten && UninitializedFields.size() == 1) {
+          std::string Result;
+          llvm::raw_string_ostream OS(Result);
+          //OS << *this;
+          //return OS.str();
+          bool IsFirstField = true;
+          for (const auto *FD : UninitializedFields) {
+            //FD->dump();
+            if (!IsFirstField)
+              OS << ", ";
+            else
+              IsFirstField = false;
+            OS << '\'' << FD->getDeclName() << '\'';
+          }
+
+          Diag(Constructor->getLocation(), diag::warn_uninit_in_ctor_list)
+              << OS.str();
+          for (const auto *FD : UninitializedFields)
+            Diag(FD->getLocation(), diag::note_member_declared_here) << FD;
+          Diag(Constructor->getLocation(), diag::note_nonempty_body);
+        }
+
+        //fprintf(stderr, "asdf\n");
+      }
     }
-    
+
     // If any errors have occurred, clear out any temporaries that may have
     // been leftover. This ensures that these temporaries won't be picked up for
     // deletion in some later function.
Index: test/SemaCXX/constructor-initializer.cpp
===================================================================
--- test/SemaCXX/constructor-initializer.cpp	(revision 231572)
+++ test/SemaCXX/constructor-initializer.cpp	(working copy)
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -Wreorder -fsyntax-only -verify %s
+// RUN: %clang_cc1 -Wreorder -fsyntax-only -verify -Wuninitialized-in-ctor-list %s
+// RUN: %clang_cc1 -Wreorder -fsyntax-only -verify -Wuninitialized-in-ctor-list -std=c++11 %s
 class A { 
   int m;
 public:
@@ -291,3 +292,127 @@
   struct S2 { union { union { int n; }; char c; }; S2() : n(n) {} };  // expected-warning {{field 'n' is uninitialized when used here}}
   struct S3 { struct { int n; }; S3() : n(n) {} };  // expected-warning {{field 'n' is uninitialized when used here}}
 }
+
+namespace MissingInits {
+
+struct CtorInitialized {
+  CtorInitialized();
+  CtorInitialized(const CtorInitialized&);
+};
+
+struct S {
+  S() : a(0) {} // expected-warning {{constructor init list is missing initialization of 'c'}} expected-note {{give constructor a non-empty body to silence this warning}}
+
+  // No warning, too many false positives.
+  S(float f) {}
+
+  S(int, int) : a(0) { c = 0; }
+
+#if __cplusplus >= 201103L
+  S(int a) : S() {} // Don't warn on delegating ctors.
+#endif
+
+private:
+  int a;
+#if __cplusplus >= 201103L
+  int b = 0;
+#endif
+  int c; // expected-note {{member 'c' declared here}}
+  CtorInitialized ci;
+};
+
+struct S22 {
+  int a;
+#if __cplusplus >= 201103L
+  int b = 0;
+#endif
+  // XXX want to warn here? probably? probably Just Works as long ctor is used?
+};
+
+template <class T>
+class WeakPtrFactory {
+ public:
+  explicit WeakPtrFactory(T* ptr) : ptr_(ptr) {}
+
+  CtorInitialized ci;
+  T* ptr_;
+
+
+  class ProxyWithStub {
+   public:
+    explicit ProxyWithStub(int* receiver) : a(0) {}
+    int a;
+    typename T::Client::Stub_ stub;  // XXX shouldn't warn
+  };
+  
+};
+
+WeakPtrFactory<int> w(0);  // XXX should warn, I guess?
+
+char* IntToBuf(int i, char* buf) {
+  // Imagine that this does itoa(i) into the storage provided by buf and returns
+  // that.
+  return buf;
+}
+
+int StrLen(const char* t) { return 42; }
+
+class Protobuf {
+  // Shouldn't warn for scratch_.
+  Protobuf(int i) : text_(IntToBuf(i, scratch_)), size_(StrLen(text_)) {}
+  const char* text_;
+  int size_;
+  char scratch_[10];
+};
+
+// This is from leveldb.
+enum ValueType {
+  kTypeDeletion = 0x0,
+  kTypeValue = 0x1
+};
+typedef unsigned long long SequenceNumber;
+struct ParsedInternalKey {
+  CtorInitialized user_key;
+  SequenceNumber sequence;
+  ValueType type;
+
+  // Shouldn't warn:
+  ParsedInternalKey() { }  // Intentionally left uninitialized (for speed)
+  ParsedInternalKey(
+      const CtorInitialized &u, const SequenceNumber &seq, ValueType t)
+      : user_key(u), sequence(seq), type(t) {}
+};
+
+// This is from skia.
+class Iterator {
+public:
+  Iterator() : fRgn(0), fDone(true) {} // Empty iterator, fRect not used
+  Iterator(int *);                        // Real iterator, sets fRuns
+  bool done() const { return fDone; }
+  void next();
+  // may return null
+  const int *rgn() const { return fRgn; }
+
+private:
+  const int *fRgn;
+  const int *fRuns; // Only valid if !fDone
+  int foo;
+  bool fDone;
+};
+
+// This is simplified from WebKit's Vector class.
+struct Base {
+  Base() : a(0) {}
+
+protected:
+  int a, b;
+};
+
+struct Sub : public Base {
+  Sub() { b = 0; }
+};
+
+
+//S22 s;
+
+}


More information about the cfe-dev mailing list