[cfe-commits] [PATCH][Review Request] Handle CXXForRangeStmt

Ted Kremenek kremenek at apple.com
Tue Oct 4 22:03:19 PDT 2011


Hi Jim,

I'm looking for a test case that shows that the analyzer clearly understands the semantics of that statement.  A good way to do that is two provide two test cases, one that generates an error and another that doesn't, with the one that doesn't utilizing some bit about the semantics of the statement.

Here would be a good example:

void test1() {
  int array[2] = { 1, 2 };
  int j = 0;
  for (auto i : array) { j += i; }
  
  int *p = 0;
  *p = 0xDEADBEEF;  // expected-warning {{null}}
}

void test2() {
  int array[2] = { 1, 2 };
  int j = 0;
  for (auto i : array) { j += i; }
  
  if (j == 3)
    return;

  int *p = 0;
  *p = 0xDEADBEEF;  // no-warning
}


Notice that the two tests are practically identical.  The two test cases are important because they serve cross-check each other.  The first test shows that the analyzer is basically working, even in the presence of the special for loop, but it doesn't rely on any specific semantics of that statement.

Note that I also made the array small, to avoid any issues with exhausting the number of loop traversals.  In this case, we should explore every path.

The second test shows that we properly reason about the for loop in path-sensitive manner, which causes the warning to disappear.

If we omitted the first test, the second test would not validate our logic.  We could instead be missing a warning because we silently dropped the analysis of the function on the floor.

The test file to modify would be test/Analysis/misc-ps-cxx0x.cpp.

After applying this patch, I noticed that these tests do not pass, so the next step is to understand why.

It's instructive to look how this statement is modeled in the CFG.  If I just put test2 into a file t.cpp, I can look at the CFG:

$ clang -std=c++0x --analyze -Xclang -analyzer-checker=debug.DumpCFG t.cpp

 [ B8 (ENTRY) ]
    Predecessors (0):
    Successors (1): B7

 [ B1 ]
      1: 0
      2: [B1.1]
      3: int *p = 0;
      4: 3735928559U
      5: [B1.4]
      6: p
      7: [B1.6]
      8: *[B1.7]
      9: [B1.8] = [B1.5]
    Predecessors (1): B3
    Successors (1): B0

 [ B2 ]
      1: return;
    Predecessors (1): B3
    Successors (1): B0

 [ B3 ]
      1: j
      2: [B3.1]
      3: 3
      4: [B3.2] == [B3.3]
      T: if [B3.4]
    Predecessors (1): B4
    Successors (2): B2 B1

 [ B4 ]
      1: [B6.1]
      2: [B6.1]
      3: __end
      4: [B4.3]
      5: [B4.2] != [B4.4]
      T: for (int i : [B7.7]) {
    {
        [B6.9];
    }
}

    Predecessors (2): B5 B7
    Successors (2): B6 B3

 [ B5 ]
      1: [B6.1]
      2: ++[B6.1]
    Predecessors (1): B6
    Successors (1): B4

 [ B6 ]
      1: __begin
      2: [B6.1]
      3: *[B6.2]
      4: [B6.3]
      5: int i = *__begin;
      6: i
      7: [B6.6]
      8: j
      9: [B6.8] += [B6.7]
    Predecessors (1): B4
    Successors (1): B5

 [ B7 ]
      1: 2
      2: 1
      3: { [B7.2], [B7.1] }
      4: int array[2] = { 1, 2 };
      5: 0
      6: int j = 0;
      7: array
      8: auto int (&&__range)[2] = array;
      9: [B7.12]
     10: [B7.12]
     11: auto int *__begin = __range;
     12: __range
     13: [B7.12]
     14: 2L
     15: [B7.13] + [B7.14]
     16: auto int *__end = __range + 2L;
    Predecessors (1): B8
    Successors (1): B4

 [ B0 (EXIT) ]
    Predecessors (2): B1 B2
    Successors (0):


Take a look at block B7, which is the meat of the CFG representation for CXXForRangeStmt.  I've highlighted the problems:

 [ B7 ]
      1: 2
      2: 1
      3: { [B7.2], [B7.1] }
      4: int array[2] = { 1, 2 };
      5: 0
      6: int j = 0;
      7: array
      8: auto int (&&__range)[2] = array;
-->  9: [B7.12]
-->  10: [B7.12]
     11: auto int *__begin = __range;
     12: __range
     13: [B7.12]
     14: 2L
     15: [B7.13] + [B7.14]
     16: auto int *__end = __range + 2L;
    Predecessors (1): B8
    Successors (1): B4

It appears that the 12th expression in the basic block is referred to by the 9th and 10th (before it gets evaluated).  That's completely bogus.  From the looks of it, it appears the CFG support for CXXForRangeStmt is wrong.

The problem is clearer when you view the analysis exploded graph:

$ ~/clang-xcode/bin/Debug/clang -std=c++0x --analyze t.cpp -Xclang -analyzer-viz-egraph-graphviz

I won't show it here, but basically the analysis graph caches out the path when evaluating block B7 by creating an artificial cycle.  Thus it is completely busted.

I suspected the culprit is here:

CFGBlock *CFGBuilder::VisitCXXForRangeStmt(CXXForRangeStmt *S) {
  ...

  // Add the initialization statements.
  Block = createBlock();
  addStmt(S->getBeginEndStmt());
  return addStmt(S->getRangeStmt());
}


An AST dump shows the problem:

  (CXXForRangeStmt 0x7fb8db83dad8 <line:4:3, col:34>
    (DeclStmt 0x7fb8db83d5a8 <col:17>
      0x7fb8db83d3a0 "auto int (&&__range)[2] =
        (DeclRefExpr 0x7fb8db83d260 <col:17> 'int [2]' lvalue Var 0x7fb8db83d030 'array' 'int [2]')")
    (DeclStmt 0x7fb8db83d8e8 <col:15>
      0x7fb8db83d600 "auto int *__begin =
        (ImplicitCastExpr 0x7fb8db83d7c0 <col:15> 'int *' <ArrayToPointerDecay>
-->          (DeclRefExpr 0x7fb8db83d5c0 <col:15> 'int [2]' lvalue Var 0x7fb8db83d3a0 '__range' 'int (&&)[2]'))"
      0x7fb8db83d670 "auto int *__end =
        (BinaryOperator 0x7fb8db83d818 <col:15, col:17> 'int *' '+'
          (ImplicitCastExpr 0x7fb8db83d800 <col:15> 'int *' <ArrayToPointerDecay>
-->            (DeclRefExpr 0x7fb8db83d5c0 <col:15> 'int [2]' lvalue Var 0x7fb8db83d3a0 '__range' 'int (&&)[2]'))
          (IntegerLiteral 0x7fb8db83d7d8 <col:17> 'long' 2))")
    (BinaryOperator 0x7fb8db83d980 <col:15> '_Bool' '!='
      (ImplicitCastExpr 0x7fb8db83d950 <col:15> 'int *':'int *' <LValueToRValue>
        (DeclRefExpr 0x7fb8db83d900 <col:15> 'int *':'int *' lvalue Var 0x7fb8db83d600 '__begin' 'int *':'int *'))
      (ImplicitCastExpr 0x7fb8db83d968 <col:15> 'int *':'int *' <LValueToRValue>
        (DeclRefExpr 0x7fb8db83d928 <col:15> 'int *':'int *' lvalue Var 0x7fb8db83d670 '__end' 'int *':'int *')))
    (UnaryOperator 0x7fb8db83d9a8 <col:15> 'int *':'int *' lvalue prefix '++'
      (DeclRefExpr 0x7fb8db83d900 <col:15> 'int *':'int *' lvalue Var 0x7fb8db83d600 '__begin' 'int *':'int *'))

The same DeclRefExpr for __range appears TWICE (you can tell from the pointer address of the DeclRefExpr) as subexpressions.  This completely breaks an invariant in the CFG that all expressions appear only once, as a single subexpression.  This causes the analyzer to cache out the analysis of the for loop when evaluating the prologue of the loop.

I don't know of any other AST node that does this, because this essentially makes the AST a DAG instead of a tree.  Arguably the AST should be changed, but this may be considered a reasonable optimization.  Alternatively, the CFG builder can be taught to do the right thing here.  In either case, one of these changes needs to be made in order for us to try and model the semantics of CXXForRangeStmt correctly.

I'll look into fixing this problem in the next few days, but feel free to tackle it if you are interested.

Cheers,
Ted

On Oct 4, 2011, at 4:46 PM, Jim Goodnow II wrote:

> Or do you just want an example of the ForRange and show that it doesn't cause the analyzer to generate an error? Is there a file for that kind of thing already or should it be a new file? For example:
> 
> int array[ 10 ];
> for ( auto i : array ) ;
> 
> Thanks!
> 
> - jim
> 
> On 10/4/2011 3:36 PM, Jim Goodnow II wrote:
>> Hi Ted,
>> 
>> Um, I'm not sure how I would go about writing a test case for a statement class that shouldn't be showing up as an expression in the CFG. It's just like "for", "break", "continue", etc. Is there an example test for those cases I could look at? Thanks!
>> 
>> - jim
>> 
>> On 10/4/2011 3:20 PM, Ted Kremenek wrote:
>>> Test case please.  :)
>>> 
>>> On Oct 4, 2011, at 1:32 PM, Jim Goodnow II wrote:
>>> 
>>>> This is just a program control statement and should be handled the same as any other.
>>>> 
>>>> - jim
>>>> 
>>>> Index: lib/StaticAnalyzer/Core/ExprEngine.cpp
>>>> ===================================================================
>>>> --- lib/StaticAnalyzer/Core/ExprEngine.cpp    (revision 141095)
>>>> +++ lib/StaticAnalyzer/Core/ExprEngine.cpp    (working copy)
>>>> @@ -453,7 +453,6 @@
>>>>     case Stmt::CXXBindTemporaryExprClass:
>>>>     case Stmt::CXXCatchStmtClass:
>>>>     case Stmt::CXXDependentScopeMemberExprClass:
>>>> -    case Stmt::CXXForRangeStmtClass:
>>>>     case Stmt::CXXPseudoDestructorExprClass:
>>>>     case Stmt::CXXTemporaryObjectExprClass:
>>>>     case Stmt::CXXThrowExprClass:
>>>> @@ -501,6 +500,7 @@
>>>>     case Stmt::CaseStmtClass:
>>>>     case Stmt::CompoundStmtClass:
>>>>     case Stmt::ContinueStmtClass:
>>>> +    case Stmt::CXXForRangeStmtClass:
>>>>     case Stmt::DefaultStmtClass:
>>>>     case Stmt::DoStmtClass:
>>>>     case Stmt::ForStmtClass:
>>>> 
>>>> <ForRange.patch>
>>> 
>>> 




More information about the cfe-commits mailing list