[PATCH] D45532: [StaticAnalyzer] Checker to find uninitialized fields after a constructor call

Mon May 28 03:36:31 PDT 2018

Szelethus marked 4 inline comments as done.
Szelethus added a comment.

I decided to mark the inline comments in `isPointerOrReferenceUninit` regarding the dereferencing loop termination done for now. I left several TODO's in the function to be fixed later, with many of them depending on one another, so fixing them all or many at once would probably be the best solution.

I left two conversations "not done" (fatal/nonfatal errors and base classes), but if I may, do you have any other objections?

================
Comment at: lib/StaticAnalyzer/Checkers/UninitializedObjectChecker.cpp:223-225
+  ExplodedNode *Node = Context.generateNonFatalErrorNode(Context.getState());
+  if (!Node)
+    return;
----------------
NoQ wrote:
> Szelethus wrote:
> > NoQ wrote:
> > > I suspect that a fatal error is better here. We don't want the user to receive duplicate report from other checkers that catch uninitialized values; just one report is enough.
> > I think that would be a bad idea. For example, this checker shouts so loudly while checking the LLVM project, it would practically halt the analysis of the code, reducing the coverage, which means that checkers other then uninit value checkers would "suffer" from it.
> > 
> > However, I also think that having multiple uninit reports for the same object might be good, especially with this checker, as it would be very easy to see where the problem originated from.
> > 
> > What do you think?
> Well, i guess that's the reason to not use the checker on LLVM. Regardless of fatal/nonfatal warnings, enabling this checker on LLVM regularly would be a bad idea because it's unlikely that anybody will be able to fix all the false positives to make it usable. And for other projects that don't demonstrate many false positives, this shouldn't be a problem.
> 
> In order to indicate where the problem originates from, we have our bug reporter visitors that try their best to add such info directly to the report. In particular, @george.karpenkov's recent `NoStoreFuncVisitor` highlights functions in which a variable was //not// initialized but was probably expected to be. Not sure if it highlights constructors in its current shape, but that's definitely a better way to give this piece of information to the user because it doesn't make the user look for a different report to understand the current report.
LLVM is a special project in the fact that almost every part of it is so performance critical, that leaving many fields uninit matters. However, I would believe that in most projects, only a smaller portion of the code would be like that.

Suppose that we have a project that also defines a set of ADTs, like an `std::list`-like container. If that container had a field that would be left uninit after a ctor call, analysis on every execution path would be halted which would use an object like that.

My point is, as long as there is no way to tell the analyzer (or the checker) to ignore certain constructor calls, I think it would be best not to generate a fatal error.

>Regardless of fatal/nonfatal warnings, enabling this checker on LLVM regularly would be a bad idea because it's unlikely that anybody will be able to fix all the false positives to make it usable. And for other projects that don't demonstrate many false positives, this shouldn't be a problem.
I wouldn't necessarily call them false positives. This checker doesn't look for bugs, and all reports I checked were correct in the fact that those fields really were left uninit. They just don't cause any trouble (just yet!).

================
Comment at: lib/StaticAnalyzer/Checkers/UninitializedObjectChecker.cpp:359-372
+  // Checking bases.
+  const auto *CXXRD = dyn_cast<CXXRecordDecl>(RD);
+  if (!CXXRD)
+    return ContainsUninitField;
+
+  for (const CXXBaseSpecifier BaseSpec : CXXRD->bases()) {
+    const auto *Base = BaseSpec.getType()->getAsCXXRecordDecl();
----------------
Szelethus wrote:
> NoQ wrote:
> > Szelethus wrote:
> > > NoQ wrote:
> > > > Are there many warnings that will be caused by this but won't be caused by conducting the check for the constructor-within-constructor that's currently disabled? Not sure - is it even syntactically possible to not initialize the base class?
> > > I'm not a 100% sure what you mean. Can you clarify?
> > > >Not sure - is it even syntactically possible to not initialize the base class?
> > > If I understand the question correctly, no, as far as I know.
> > You have a check for `isCalledByConstructor()` in order to skip base class constructors but then you check them manually. That's a lot of code, but it seems that the same result could have been achieved by simply skipping the descent into the base class.
> > 
> > Same question for class-type fields, actually.
> >Same question for class-type fields, actually.
> 
> My problem with that approach is that the same uninitialized field could've been emitted multiple times in different warnings. From an earlier conversation (assume that its run in pedantic mode):
> 
> >For this code snippet:
> >
> > ```
> >struct A {
> >   struct B {
> >      int x, y;
> >      
> >      B() {}
> >   } b;
> >   int w;
> >
> >   A() {
> >      b = B();
> >   }
> >};
> >```
> >the warning message after a call to `A::A()` would be "3 uninitialized fields[...]", and for `B::B()` inside `A`s constructor would be "2 uninitialized fields[...]", so it wouldn't be filtered out.
> 
> In my opinion, if `A` contains a field of type `B`, it should be the responsibility of `A`'s constructors to initialize those fields properly.
> >You have a check for isCalledByConstructor() in order to skip base class constructors but then you check them manually. That's a lot of code, but it seems that the same result could have been achieved by simply skipping the descent into the base class.
> I also believe that a constructor "should be held responsible" for the initialization of the inherited data members. However, in the case of base classes, uniqueing isn't an issue, as in this case:
> ```
> struct B {
>   int a;
>   B() {}
> };
> 
> struct D : public B {
>   int b;
>   D() {}
> };
> ```
> a call to `D::D()` would not check the inherited member (`a`), if I didn't implement it explicitly. That also means that if `isCalledByConstructor` was refactored to not filter out base classes, we would get two warning each with a single note, reporting `b` and `a` respectively. (I haven't actually checked it, but its a reasonable assumption)
> 
> Actually, I'm not 100% sure which approach is better: a warning for each base, or one warning for the object, bases included. I personally think one warning per object results in the best user experience.
> 
> I like both ideas, but I think if we'd come to the conclusion that a warning per base should be the way to go, it should definitely be implemented in a followup patch, as `isCalledByConstructor` is already in line for some major refactoring.
> 
> What do you think? :)
>[...]but it seems that the same result could have been achieved by simply skipping the descent into the base class.
I can't edit inline comments sadly, but I'd just like to point out that it wouldn't achieve the same thing, as highlighted above. But, both solution would be okay.

https://reviews.llvm.org/D45532