[cfe-commits] [PATCH] Checking for sometimes-uninitialized variables

Thu May 17 15:22:13 PDT 2012

On Thu, May 17, 2012 at 2:04 PM, David Blaikie <dblaikie at gmail.com> wrote:

> On Thu, May 17, 2012 at 1:01 PM, Richard Smith <richard at metafoo.co.uk>
> wrote:
> > Hi,
> >
> > The attached patch adds a new classification of uninitialized variables,
> > between 'always used uninitialized' and 'may be used uninitialized',
> which
> > (unlike 'may be used uninitialized') has no false positives (after making
> > one conservative assumption), and produces more warnings than 'always
> used
> > uninitialized'.
>
> The principle and results look great. Here's some nitpicky/code review
> details, though:
>
> > include/clang/Analysis/Analyses/UninitializedValues.h:
> > /// If non-NULL, this use is always uninitialized if occurs after any of
>

Done.

> If what is non-NULL (ultra-pedantry/pet-peeve: pointer values are
> null, not NULL - the latter being just a specific constant)? The
> vector? "non-empty", perhaps?
>
> "if occurs" -> "if it occurs" ?
>

Done.

> > /// these branches is taken.
>
> "any of these branches are taken" ? It sounds more right, but strictly
> speaking it feels wrong because you're talking about only one of
> them... I'm not sure what the correct grammatical rules are here, it
> just sounded weird when I read it.
>

I was taught that "{none,one,neither,either,any} of these things" is a
singular, since we're referring to just one of them at a time. But I would
not be surprised if that isn't universal amongst English-speakers.

> > const CFGBlock *B = cfg.begin()[Queue.back()];
>
> If CFG iterators are so flexible as to allow array access, perhaps we
> should just add an op[] to the CFG type itself rather than indexing
> into an iterator?
>

Good idea, I meant to come back to that, but forgot. Thanks! I've added an
operator[] to CFG.

> > #if 0
> >     cfg.dump(dc.getParentASTContext().getLangOpts(), true);
> > #endif
>
> Not sure if you meant to leave this in, nor whether it should be
> preprocessed out, constant conditional-ed out, or commented out, if it
> is to remain (I sort of like the constant conditional option - that
> ensures that it doesn't bitrot. Between preprocessing and commenting
> out, I'd probably choose commenting out.
>

This file has a pile of "#if 0"s for debugging already, which produce all
the useful information except for dumping the CFG. This seemed like a
natural addition (though it would be great to be able to enable these
without a rebuild...).

> > switch (Use.getKind()) {
> > case UninitUse::Always: DiagID = diag::warn_uninit_var; break;
> > case UninitUse::Sometimes: DiagID = diag::warn_sometimes_uninit_var;
> break;
> > case UninitUse::Maybe: DiagID = diag::warn_maybe_uninit_var; break;
>
> OCD: worth aligning the DiagID so the maybe/sometimes/ stand out better?
>
> The switch 10 lines down is also written with multiple line breaks
> (couldn't quite fit it under 80 cols, I guess?) Perhaps worth just
> moving the switch out to the top & doing it unconditionally?
>

The two switches actually produce different values.

> > // FIXME: This iteration order, and thus the resulting diagnostic order,
> > //        is nondeterministic.
>
> Is it just luck that we don't test the cases where this would come up?
> Or are our tests unstable & just haven't exhibited it yet?
>

-verify doesn't care which order diagnostics appear in. IIRC, at least one
of current tests exhibits nondeterminism.

> > - if (DiagnoseUninitializedUse(S, vd, vi->first,
> > -                              /*isAlwaysUninit=*/vi->second))
> > - // If we have self-init, downgrade all uses to 'may be uninitialized'.
> > + UninitUse Use = hasSelfInit ? UninitUse(vi->getUser(), false) : *vi;
> > + if (DiagnoseUninitializedUse(S, vd, Use))
>
> Is this a change in behavior, or is the comment clarifying existing
> behavior? (just a little curious because the 'isAlwaysUninit'
> parameter went from 'vi->second' to 'false')
>

There's no change in behavior. If there's a self-init and we have an
always-uninitialized use, we can't get here (the diff cuts off the relevant
context, which is just a few lines above).

> > static void NoteUninitBranches(Sema &S, const UninitUse &Use) {
>
> Should you use a switch in the diagnostic rather than string literals
> in the source for the 'for', 'if' , ':?', etc? (I realize those are
> keywords & can't be translated, which is the usual hypothetical
> argument for these things - so it's not a definite
> improvement/important change in that regard)
>

I started that way, but it led to a very repetitive definition in the
diagnostic.

> >      // "condition is true / condition is false".
> >    case Stmt::IfStmtClass:
> ...
> >      // "loop is entered / loop is exited".
> >    case Stmt::WhileStmtClass:
> ...
> >      // "switch case is taken".
> >    case Stmt::CaseStmtClass:
>
> Alignment of these comments is a bit strange (they're indented, but
> they pertain (as far as I understand it) to the following case which
> is not indentend). It might even be better on the same line as (&
> following) the DiagKind assignment. Hmm, I see - it pertains to a set
> of cases, yeah, maybe just unindenting would suffice - but I can see
> that it's not clear either way how to group the case statements under
> the comment.
>

I've unindented the comments. I think you're right that that's a little
better.

> Diagnostic text: the note trace talks about loops being "entered" but
> then about them being "terminated", would it make sense to refer to
> them as "exited" and/or is there precedent for any other terminology
> here? (& I still haven't come up with any better way to describe the
> "loop exited through its condition becoming false" than your
> "exited/terminated 'normally'")
>

Yes, that not isn't as clear as it could be. I've changed this to
"uninitialized use occurs whenever 'for' loop exits because its condition
is false".

> The shortcircuit cases are nice - does the note point to the && now, I
> assume? (you could split the expressions over multiple lines to
> demonstrate this) Is it worth any clarity to point to the first
> operand to the &&? I'm wondering if people would be confused if they
> saw:
>

The diagnostic points to the expression on the LHS of the && (the one whose
value is being described in the note). I've updated the tests as you
suggest.

> if (x && y) { // whenever && condition is false
>

In this case, we say 'whenever the 'if' condition is false'.

> and think it was talking about the condition for the if block, rather
> than that of the 'x' - but perhaps it's good enough that we get that
> far.

I think we'd only want to get the '&&' diagnostic here if the
initialization happens within 'y' itself (though we don't currently catch
that case at all, because the uninitialized variable tracking code's
special case for '&&' and '||' doesn't quite do the right thing when
they're immediately used in a condition).

Updated patch attached.

-- Richard
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120517/de700ee8/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sometimes-uninit.diff
Type: application/octet-stream
Size: 31432 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120517/de700ee8/attachment.obj>