[cfe-commits] [Patch] Add a warning for for loops with conditions that do not change

Mon Mar 12 16:33:04 PDT 2012

On Mon, Mar 12, 2012 at 1:52 PM, Douglas Gregor <dgregor at apple.com> wrote:

>
> On Mar 12, 2012, at 1:38 PM, Richard Trieu wrote:
>
>
>
> On Sun, Mar 11, 2012 at 1:58 PM, Douglas Gregor <dgregor at apple.com> wrote:
>
>>
>> On Feb 14, 2012, at 3:42 PM, Richard Trieu wrote:
>>
>> On Tue, Feb 7, 2012 at 10:38 AM, Douglas Gregor <dgregor at apple.com>wrote:
>>
>>>
>>> On Feb 6, 2012, at 1:00 PM, Eli Friedman wrote:
>>>
>>> > On Mon, Feb 6, 2012 at 2:50 PM, Richard Trieu <rtrieu at google.com>
>>> wrote:
>>> >> On Mon, Feb 6, 2012 at 2:02 PM, Eli Friedman <eli.friedman at gmail.com>
>>> wrote:
>>> >>>
>>> >>> On Mon, Feb 6, 2012 at 1:55 PM, Richard Trieu <rtrieu at google.com>
>>> wrote:
>>> >>>> The motivation of this path is to catch code like this:
>>> >>>>
>>> >>>> for (int i = 0; i < 10; ++i)
>>> >>>>   for (int j = 0; j < 10; ++i)
>>> >>>>     { }
>>> >>>>
>>> >>>> The second for loop increments i instead of j causing an infinite
>>> loop.
>>> >>>>  The
>>> >>>> warning also checks the body of the for loop so it will trigger on:
>>> >>>>
>>> >>>> for (int i; i <10; ) { }
>>> >>>>
>>> >>>> But not trigger on:
>>> >>>>
>>> >>>> for (int i; i< 10; ) { ++i; }
>>> >>>>
>>> >>>> I'm still fine-tuning the trigger conditions, but would like some
>>> >>>> feedback
>>> >>>> on this patch.
>>> >>>
>>> >>> Adding an additional recursive traversal of the body of every parsed
>>> >>> for loop is very expensive.
>>> >>>
>>> >>> -Eli
>>> >>
>>> >>
>>> >> Is it that expensive?  I would think that once the AST was
>>> constructed, the
>>> >> visitor would be pretty fast, especially if no action is taken on
>>> most of
>>> >> the nodes.  I also made the warning default ignore and put in an early
>>> >> return to prevent the visitors from running in the default case.
>>> >>
>>> >> Do you have any suggestions on removing the recursive traversal?
>>> >
>>> > Okay, thinking about it a bit more, maybe it's not that expensive, but
>>> > you should at least measure to make sure.
>>>
>>> I'm still very nervous about adding such an AST traversal. Presumably,
>>> it's not going to be a concern in practice because most of the time, the
>>> increment statement of the 'for' loop will mention one of the variables in
>>> the condition, and therefore we'll short-circuit the expensive walk of the
>>> loop body. It would be helpful to know that's the case.
>>>
>>> > I don't have any good suggestion for an alternative.
>>>
>>>
>>> Nor do I.
>>>
>>> A couple more comments:
>>>
>>> +    ValueDecl *VD = E->getDecl();
>>> +    for (SmallVectorImpl<ValueDecl*>::iterator I = Decls.begin(),
>>> +                                               E = Decls.end();
>>> +         I != E; ++I)
>>> +      if (*I == VD) {
>>> +        FoundDecl = true;
>>> +        return;
>>> +      }
>>>
>>> This is linear; please use a SmallPtrSet instead.
>>>
>> Done.
>>
>>>
>>> Plus, I think you want to narrow this check to only consider VarDecls.
>>> Functions and enumerators are not interesting.
>>>
>> Done.
>>
>>>
>>> +      S.Diag(Second->getLocStart(),
>>> diag::warn_variables_not_in_loop_body)
>>> +          << Second->getSourceRange();
>>>
>>> This warning needs to specify which variables were not used or point to
>>> them in the source.
>>>
>> The diagnostic now underlines all the variables in the condition.
>>
>>>
>>> Do we want to actually look for modification, e.g., any use of the
>>> variable that isn't immediately consumed by an lvalue-to-rvalue conversion?
>>>
>> Yes, that is exactly what we are checking for.  The VisitCastExpr checks
>> for LvalueToRvalue casts and skips any DeclRefExpr's that are direct
>> sub-expressions.
>>
>>>
>>>        - Doug
>>>
>>
>> I also did a comparison between runs with and without -Wloop-analysis.
>>  [snip]
>>
>>
>> Thanks for all of the performance testing. A few more comments on the
>> actual code:
>>
>> Index: include/clang/Basic/DiagnosticSemaKinds.td
>> ===================================================================
>> --- include/clang/Basic/DiagnosticSemaKinds.td (revision 150501)
>> +++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
>> @@ -14,6 +14,11 @@
>>  let Component = "Sema" in {
>>  let CategoryName = "Semantic Issue" in {
>>
>> +// For loop analysis
>> +def warn_variables_not_in_loop_body : Warning<
>> +  "variables used in loop condition not modified in loop body">,
>> +  InGroup<DiagGroup<"loop-analysis">>, DefaultIgnore;
>>
>> It would be really nice if this diagnostic could mention the variables by
>> name ("variables 'i' and 'n' used in loop condition are not modified in
>> loop body"), like we do with missing enumerators in a switch. It'll take
>> some special casing for 1/2/many variables, but it will be much nicer for
>> the user.
>>
>
> I will look into this.
>
>
>> +  void VisitDeclRefExpr(DeclRefExpr *E) {
>> +    VarDecl *VD = dyn_cast<VarDecl>(E->getDecl());
>> +    if (!VD) return;
>> +
>> +    PDiag << E->getSourceRange();
>> +
>> +    // Dont do analysis on pointers.
>> +    if (VD->getType()->isAnyPointerType()) {
>> +      Simple = false;
>> +      return;
>> +    }
>> +
>> +    Decls.insert(VD);
>> +  }
>>
>> Presumably, anything volatile-qualified should make the expression
>> non-simple.
>>
> The check for volatile is performed after the DeclMatcher finishes.
>
>>
>> +  void VisitCastExpr(CastExpr *E) {
>> +    // Don't do analysis on function calls or arrays.
>> +    if (E->getCastKind() == CK_FunctionToPointerDecay ||
>> +        E->getCastKind() == CK_ArrayToPointerDecay) {
>> +      Simple = false;
>> +      return;
>> +    }
>> +
>> +     Visit(E->getSubExpr());
>> +  }
>>
>> If you want to avoid calls, why not just intercept CallExpr and
>> CXXConstructExpr and call those non-simple?
>>
>> Backing up a step, it seems strange that DeclExtractor assumes that an
>> expression is 'simple' unless it sees some very small number of cases that
>> aren't consider simple. That feels very brittle to me because, e.g.,
>> CXXConstructExpr is certainly not simple (for non-trivial constructors),
>> yet it would be considered simple by this approach. Similarly for
>> CXXNewExpr and a host of other non-trivial expressions that aren't covered.
>> If we have to have the notion of 'simple', then we need to build it
>> constructively by white-listing expressions that are known to be simple and
>> banning (by default) all other expressions.
>>
>> I'm a bit worried about how many visitor methods I need to write to, but
> I'll give it a try and see what happens.
>
>
> I could settle for a full audit of all of the expressions, but the fact
> that there are several obvious missing cases worries me a lot.
>
> +  // DeclMatcher checks to see if the decls are used in a non-evauluated
>> +  // context.
>> +  class DeclMatcher : public StmtVisitor<DeclMatcher> {
>>
>> Should this use an EvaluatedExprVisitor to provide more accurate
>> diagnostics? e.g., the presence of sizeof(x) doesn't mean that 'x' has been
>> used in a meaningful way.
>>
>
> An EvaluatedExprVisitor is used to find the decls used in the loop
> condition, which is called the DeclExtractor.  A separate visitor, the
> DeclMatch, is used on the loop conditional, the loop increment, and the
> loop body to see if the decls are in use.  For this, non-evaluated contexts
> that could change the decl are important, so an EvaluatedExprVisitor can
> not be used here.
>
>
> How would a non-evaluated context change the value of the variable?
>
> - Doug
>
>
> Oops, I seemed to have taken part of this conversation off list.  I was a
little fuzzy on the definition of an evaluated context, and a bit of
testing shows that an EvaluatedExprVisitor will work for this warning.  I
will drop the StmtVisitor.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20120312/ad77888b/attachment.html>