[PATCH] Make UninitializedValues detect the use of uninitialized fields in a constructor.

Enrico Pertoso epertoso at google.com
Thu Nov 21 12:32:22 PST 2013



================
Comment at: lib/Analysis/UninitializedValues.cpp:571
@@ +570,3 @@
+  if (ME == findLeftmostMemberExpr(ME) &&
+      ME->getMemberDecl()->getType()->isReferenceType()) {
+    classify(ME, Use);
----------------
Richard Smith wrote:
> Enrico Pertoso wrote:
> > Richard Smith wrote:
> > > Seems this case should apply if any of the `FieldDecls` in the chain are of reference type, not just if the last one is.
> > I'm not sure I understand:
> > 
> >   lang=c++
> >   struct A {
> >     int &a;
> >   };
> >   struct B {
> >     A &a;
> >     B(int) {}
> >   };
> >   struct C {
> >     B &ref;
> >     B b;
> >     C() : b(ref.a.a) {}
> >     C(B &o) : b(o.a.a) {}
> >   };
> >   
> > The only thing we know here is that "ref" is uninitialized.
> > Do you have anything specific in mind?
> Suppose we had:
> 
>   struct A { int n; };
>   struct B { A &a; };
>   struct C {
>     int n;
>     B b;
>     C(A &a) : b{a}, n(b.a.n = 0) {}
>   };
> 
> Here (because the `n` initializer runs before the `b` initializer), `b` is uninitialized in the `b.a.n = 0` expression. Because `b.a` is a reference, that expression is a use of `b`, not an initialization of it. So we should diagnose this.
Thanks, you helped me diagnose a bigger problem here (record access wasn't being counted as use in general).
I did this because when you have

  lang=c++
  struct A {
    int &a;
    int &b;
    int n;
    A() : a(b), n(b) {}
  };

the use of ##b## in ##n(b)## will be detected because of the lvalue-to-rvalue cast, but there's no such cast on references.
Each ##MemberExpr## is visited, so the code warns on your example. I've added a test case.

================
Comment at: lib/Analysis/UninitializedValues.cpp:581-586
@@ -426,1 +580,8 @@
 
+void ClassifyRefs::VisitCXXDefaultInitExpr(CXXDefaultInitExpr *DIE) {
+  if (DIE->getExpr() == 0)
+    return;
+  CXXDefaultInitExprVisitor<ClassifyRefs> V(*this);
+  V.Visit((Stmt*) DIE->getExpr());
+}
+
----------------
Richard Smith wrote:
> Enrico Pertoso wrote:
> > Richard Smith wrote:
> > > This looks suspicious. If we're visiting a `CXXDefaultInitExpr` for an object other than `this`, and that initializer references `this`, we'll think that we have a reference to one of our subobjects, but we don't.
> > ##isTrackedField## verifies that the specified field has the same parent of the constructor. A ##CXXDefaultInitExpr## appears either in a ##CtorInitializer## or within the context of (c++1y) aggregate initialization, in which case the record of the current constructor and that of any ##MemberExpr## in the aggregate can't be the same.
> I don't think that is the case. For instance:
> 
>   struct A {
>     int a;
>     int b;
>     int c;
>     struct B;
>     A();
>   };
>   struct A::B : A {
>     int d = (c = 0);
>   };
>   A::A() : a(B{}.d), b(c) {} // B{} should not mark our `c` as initialized.
> 
> As I alluded to in another comment, I think we could handle this correctly by not walking into a `CXXDefaultInitExpr` in aggregate initialization (perhaps only walk into them when they appear at the top level in a `CtorInitializer`, and do that in the CFG builder).
In that case ##B{}## is a ##CXXConstructExpr##. You can't use aggregate initialization for ##B## because it has a base class.
Unless there's a change in the aggregate definition after n3653.

================
Comment at: lib/Analysis/UninitializedValues.cpp:555
@@ +554,3 @@
+    case CK_ArrayToPointerDecay:
+      classify(CE->getSubExpr(), Ignore);
+      break;
----------------
Richard Smith wrote:
> Enrico Pertoso wrote:
> > Richard Smith wrote:
> > > This looks curious. Treating this case as `Init` seems appropriate to me: we don't track how the resulting pointer is used, and it might be used to initialize the object. Is the idea to ignore all array subobjects of class type in the analysis? That is:
> > > 
> > >   struct X {
> > >     int a[3];
> > >     int b;
> > >   };
> > >   struct Y {
> > >     X x;
> > >     Y() {
> > >       x.a[0] = 1; // don't treat 'x' as initialized
> > >       x.a[1] = x.b; // warning: 'x' used uninitialized
> > >     }
> > >   };
> > > 
> > > If so, this really deserves a comment.
> > This is actually another case in which ##x## is treated as initialized because the default constructor is called, hence ##x## is immediately treated as initialized.
> > The analysis could potentially be made smarter for aggregates, but I'm not sure this patch is the best place to do it.
> > 
> > This was intended for:
> > 
> >   lang=c++
> >   void foo(int *a);
> >   struct X {
> >     int a[3];
> >     int b;
> >     X() {
> >       foo(a);
> >       b = a[1]; // do not warn on 'a' here.
> >     }
> >   };
> > 
> > But actually it's not necessary because the current code is not classifying ##ArraySubscriptExprs##.
> > Should I try to deal with ##ArraySubscriptExpr## it in this same patch?
> My preference would be to keep this patch as small as possible while adding the new constructor-handling functionality. I'd prefer for `ArraySubscriptExpr` support to be added separately if that's practical.
Acknowledged.

================
Comment at: lib/Analysis/UninitializedValues.cpp:795-796
@@ +794,4 @@
+      E = UO->getSubExpr()->IgnoreParenCasts();
+    else if (const ArraySubscriptExpr *ASE = dyn_cast<ArraySubscriptExpr>(E))
+      E = ASE->getLHS()->IgnoreParenCasts();
+  } while (OE != E);
----------------
Richard Smith wrote:
> Enrico Pertoso wrote:
> > Richard Smith wrote:
> > > I assume you're trying to cope with `this[0].blah` here? This doesn't look quite right: it'll also find `this[-1].blah` (which might sometimes be valid) and will do the wrong thing on `0[this].blah`. Maybe just drop the array indexing support here -- it's a corner case and not worth the cost of getting all the details right.
> > Actually no. This just detects things like
> > 
> >   lang=c++
> >   struct X {
> >     int x, y;
> >     X(int f) {
> >       reinterpret_cast<int*>(this)[0] = f;
> >       y = x;
> >     }
> >   };
> > 
> > Warning on ##y = x## is wrong. On the other hand, I'm simply doing as if ##this## escaped analysis here, and yes, it won't work on ##0[reinterpret_cast<int*>(this)] = f##.
> > It's still seems to be a corner case, though, plus it's not warning on member access via ##this[0]##.
> > 
> > Should I still drop this?
> Unless you have real-world examples of bugs the extra accuracy catches here, I'd prefer to drop this for simplicity's sake.
Actually, it prevents a false warning, but it's still not common. I'll get rid of it.

And I've realized that this function could get into an infinite loop. Fixed that.

================
Comment at: test/SemaCXX/uninitialized.cpp:444
@@ +443,3 @@
+    J() : a(1) {
+      float f = b;
+      b = -f;
----------------
Richard Smith wrote:
> Enrico Pertoso wrote:
> > Richard Smith wrote:
> > > We should really warn on this. A FIXME for that would be nice.
> > I don't think we should. ##J## is a union, and one of its fields (##a##) has been initialized.
> > If you're really sure that a warning should still be issued here, it's actually easier to do so rather than not.
> Well, it's undefined behavior to write to field `a` then read from field `b`. The "uninitialized" warning isn't really the right one to issue, though, so I think your approach in this patch is fine.
Acknowledged.

================
Comment at: lib/Analysis/UninitializedValues.cpp:317-323
@@ +316,9 @@
+  void VisitStmt(Stmt* S) {
+    llvm::SmallVector<Stmt*, 12> children;
+    for (Stmt::child_range I = S->children(); I; ++I) {
+      children.push_back(*I);
+    }
+    for (std::size_t i = children.size(); i > 0; --i) {
+      Visit(children[i - 1]);
+    }
+  }
----------------
Richard Smith wrote:
> Enrico Pertoso wrote:
> > Richard Smith wrote:
> > > Why do you need the extra complexity of building a list and iterating backwards here, rather than just visiting the children inside the first loop?
> > That's the way statements are appended to the CFG in the base case.
> > But it doesn't seem to cause problems if I don't do it here (##classify## is always called when there's enough context).
> > 
> Hmm, does this do good things if there is control flow in the `CXXDefaultInitExpr`?
> 
>   struct S {
>     enum Kind { Foo, Bar } K;
>     int FooVal;
>     int BarVal;
>     int Value = K == Foo ? FooVal : BarVal;
>     S() : K(Foo), FooVal(0) {} // warning about uninitialized use of BarVal?
>   };
> 
> Maybe we should teach the CFG builder to inline `CXXDefaultInitExprs` when they appear as a `CXXCtorInitializer`? That should not break the CFG's invariants, and allow you to drop a lot (all?) of your special-case handling for them (since you don't actually want to walk into them when they occur within aggregate initialization).
The only difference I see is that, as there's no control flow information at all, what's usually reported as sometimes uninitialized will be reported as always uninitialized.
But I'll have a look at the ##CFGBuilder##.

================
Comment at: include/clang/Analysis/Analyses/UninitializedValues.h:51
@@ -49,1 +50,3 @@
 
+  /// If the use refers to a field, the use happens in this constructor.
+  const CXXConstructorDecl *Constructor;
----------------
Richard Smith wrote:
> Please extend the comment here: something like "If the use happens in a default initializer for a field, this is the constructor in which the default initializer was used."
Done.

================
Comment at: include/clang/Analysis/Analyses/UninitializedValues.h:73
@@ -67,2 +72,3 @@
   const Expr *getUser() const { return User; }
+  const CXXConstructorDecl *getConstructor() const { return Constructor; }
 
----------------
Richard Smith wrote:
> Maybe `getConstructorForDefaultInit`?
Done.

================
Comment at: lib/Analysis/UninitializedValues.cpp:529
@@ +528,3 @@
+        Ex = stripCasts(DC->getParentASTContext(), UO->getSubExpr());
+      classify(Ex, (*I)->getType().isConstQualified() ? Ignore : Init);
+    }
----------------
Enrico Pertoso wrote:
> Richard Smith wrote:
> > Did you mean to use `IsPointerToConst` here?
> Done.
I've changed this a bit to make sure that #IsPointerToConst()## is only called if ##(*I)->isGLValue()## is false. This is to avoid a regression for cases like:

  lang=c++
  int foo(const int *&a) { a = &some_global_int_; }
  void bar() {
    const int *j;
    foo(j); // don't classify the DeclRefExpr for j as Ignore here, just let it escape
    int a = *j;
  }


http://llvm-reviews.chandlerc.com/D2181



More information about the cfe-commits mailing list