[clang] [analyzer] Widen loops: Augment assignment note when values invalidated at the start of a loop (PR #122398)

via cfe-commits cfe-commits at lists.llvm.org
Thu Jan 9 17:33:50 PST 2025


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Douglas (dgg5503)

<details>
<summary>Changes</summary>

This PR attempts to improve the value assignment diagnostic specifically for the case where `widen-loops` is enabled during Clang static analysis.

The motivation behind this diagnostic is the fact that the current diagnostic is a bit confusing at first glance. For example:
```cpp
class A {
public:
  void m_fn1();
};
void fn1() {
  A a;
  A *b = &a;
  for (;;) {
    (void)!b;
    b->m_fn1();
  }
}
```

```
> clang -cc1 -internal-isystem C:\...\lib\clang\include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core -analyzer-max-loop 2 -analyzer-config widen-loops=true -analyzer-output=text loop-widening-notes-best.cpp

loop-widening-notes-best.cpp:10:5: warning: Called C++ object pointer is null [core.CallAndMessage]
   10 |     b->m_fn1();
      |     ^
loop-widening-notes-best.cpp:8:3: note: Loop condition is true.  Entering loop body
    8 |   for (;;) {
      |   ^
loop-widening-notes-best.cpp:8:3: note: Value assigned to 'b'
loop-widening-notes-best.cpp:8:3: note: Loop condition is true.  Entering loop body
loop-widening-notes-best.cpp:9:11: note: Assuming 'b' is null
    9 |     (void)!b;
      |           ^~
loop-widening-notes-best.cpp:10:5: note: Called C++ object pointer is null
   10 |     b->m_fn1();
      |     ^
1 warning generated.
```

The message `loop-widening-notes-best.cpp:8:3: note: Value assigned to 'b'` appearing where it is makes technical sense if you understand what `widen-loops` does behind the scenes. In short, `widen-loops` invalidates nearly all values in the current state before the start of the loop. At least in this example, the variable `b` is invalidated at the start of the for loop and, as part of the process of invalidation, is reassigned a conjured symbol hence `Value assigned to 'b'`. Without that knowledge, it is not intuitive from the diagnostic why the assignment message appears where it does.

In such cases, I'd like to make the diagnostic a bit more verbose. I propose something like:
```
loop-widening-notes-best.cpp:8:3: note: Value assigned to 'b' (invalidation as part of widen-loops made this symbolic here)
```

This indicates to the user that `b` has been invalidated and turned into a symbolic representation because of actions taken by `widen-loops`.

The implementation I present in this PR minimally passes all Clang static analysis LIT tests, however, I am not confident the approach I've taken is idiomatic with respect to the design of Clang's static analysis. I have what I think is a slightly more general solution [here](https://github.com/llvm/llvm-project/compare/main...dgg5503:llvm-project:dgg5503/main/invalidation-diagnostic-b) where `MemRegion` invalidations are tracked for each `ProgramState`, but it also has its own pitfalls (see TODO comments in the [diff](https://github.com/llvm/llvm-project/compare/main...dgg5503:llvm-project:dgg5503/main/invalidation-diagnostic-b)). I'd be curious to know if there's a preference in either approach, or, if a different approach should be taken.

I am open to any and all suggestions as my knowledge in the Clang static analyzer is limited.


---
Full diff: https://github.com/llvm/llvm-project/pull/122398.diff


5 Files Affected:

- (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h (+4) 
- (modified) clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp (+43) 
- (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+6-1) 
- (modified) clang/test/Analysis/loop-widening-notes.cpp (+5-5) 
- (modified) clang/test/Analysis/loop-widening.cpp (+1-1) 


``````````diff
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
index 20c446e33ef9a5..6017b7e0ea8f81 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
@@ -290,6 +290,10 @@ class ExprEngine {
   /// This tag applies to a node created after removeDead.
   static const ProgramPointTag *cleanupNodeTag();
 
+  /// A tag to track when loop widening occurs.
+  /// This tag applies to a node created after getWidenedLoopState.
+  static const ProgramPointTag *loopWideningNodeTag();
+
   /// processCFGElement - Called by CoreEngine. Used to generate new successor
   ///  nodes by processing the 'effects' of a CFG element.
   void processCFGElement(const CFGElement E, ExplodedNode *Pred,
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
index a9b4dbb39b5bd6..53494b5cf9cca8 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
@@ -1353,6 +1353,49 @@ static void showBRDefaultDiagnostics(llvm::raw_svector_ostream &OS,
     OS << " to ";
     SI.Dest->printPretty(OS);
   }
+
+  // If the value was part of a loop widening node and its value was
+  // invalidated (i.e. replaced with a conjured symbol) then let the user know
+  // that it was due to loop widening.
+  if (SI.StoreSite && SI.Dest &&
+      SI.StoreSite->getLocation().getTag() ==
+          ExprEngine::loopWideningNodeTag()) {
+
+    // TODO: Is it always the case that there's only one predecessor?
+    assert(SI.StoreSite->hasSinglePred() &&
+           "more than one predecessor found, this needs to be handled...");
+    if (const ExplodedNode *previous = SI.StoreSite->getFirstPred()) {
+      // Was the associated memory region for this variable changed from
+      // non-Symbolic to Symbolic because of invalidation?
+      // TODO: Better yet, if there was simply a way to know if a given
+      // SVal / MemRegion was invalidated as part of the current state,
+      // then that should be more robust than what's going on here.
+      // Could we somehow make use of "RegionChanges" /
+      // "runCheckersForRegionChanges" and the ExplicitRegions parameter.
+      // Still need to somehow know when a particular Global
+      // Variable is invalidated. I have not found this to be possible with
+      // "RegionChanges" unless I'm missing something...
+      const ProgramState *lastState = previous->getState().get();
+      const SVal &lastSVal = lastState->getSVal(SI.Dest);
+      const SymbolRef valueSymbol = SI.Value.getAsSymbol(true);
+      const SymbolRef lastSymbol = lastSVal.getAsSymbol(true);
+
+      bool isNowSymbolic = false;
+      if (valueSymbol) {
+        if (!lastSymbol) {
+          // Last state did not have a symbol for SI.Value in current state.
+          // Probably (?) invalidated.
+          isNowSymbolic = true;
+        } else {
+          // If the symbol types differ, then there was likely invalidation
+          isNowSymbolic = valueSymbol->getKind() != lastSymbol->getKind();
+        }
+      }
+
+      if (isNowSymbolic)
+        OS << " (invalidation as part of widen-loops made this symbolic here)";
+    }
+  }
 }
 
 static bool isTrivialCopyOrMoveCtor(const CXXConstructExpr *CE) {
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index ff8bdcea9a2201..4660edbf3dde78 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -1109,6 +1109,11 @@ const ProgramPointTag *ExprEngine::cleanupNodeTag() {
   return &cleanupTag;
 }
 
+const ProgramPointTag *ExprEngine::loopWideningNodeTag() {
+  static SimpleProgramPointTag loopWideningTag(TagProviderName, "Widen Loop");
+  return &loopWideningTag;
+}
+
 void ExprEngine::ProcessStmt(const Stmt *currStmt, ExplodedNode *Pred) {
   // Reclaim any unnecessary nodes in the ExplodedGraph.
   G.reclaimRecentlyAllocatedNodes();
@@ -2546,7 +2551,7 @@ void ExprEngine::processCFGBlockEntrance(const BlockEdge &L,
     const LocationContext *LCtx = Pred->getLocationContext();
     ProgramStateRef WidenedState =
         getWidenedLoopState(Pred->getState(), LCtx, BlockCount, Term);
-    nodeBuilder.generateNode(WidenedState, Pred);
+    nodeBuilder.generateNode(WidenedState, Pred, loopWideningNodeTag());
     return;
   }
 
diff --git a/clang/test/Analysis/loop-widening-notes.cpp b/clang/test/Analysis/loop-widening-notes.cpp
index a3f030dfe98826..ac98d81c23475e 100644
--- a/clang/test/Analysis/loop-widening-notes.cpp
+++ b/clang/test/Analysis/loop-widening-notes.cpp
@@ -9,7 +9,7 @@ int test_for_bug_25609() {
     bar();
   for (int i = 0;  // expected-note {{Loop condition is true.  Entering loop body}}                    
                    // expected-note at -1 {{Loop condition is false. Execution continues on line 16}}
-       ++i,        // expected-note {{Value assigned to 'p_a'}} 
+       ++i,        // expected-note {{Value assigned to 'p_a' (invalidation as part of widen-loops made this symbolic here)}}
        i < flag_a;
        ++i) {}
                                       
@@ -23,7 +23,7 @@ int while_analyzer_output() {
   flag_b = 100;
   int num = 10;
   while (flag_b-- > 0) { // expected-note {{Loop condition is true.  Entering loop body}} 
-                         // expected-note at -1 {{Value assigned to 'num'}} 
+                         // expected-note at -1 {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
                          // expected-note at -2 {{Loop condition is false. Execution continues on line 30}}
     num = flag_b;
   }
@@ -45,7 +45,7 @@ int do_while_analyzer_output() {
   do {   // expected-note {{Loop condition is true. Execution continues on line 47}} 
          // expected-note at -1 {{Loop condition is false.  Exiting loop}}
     num--;
-  } while (flag_c-- > 0); //expected-note {{Value assigned to 'num'}}
+  } while (flag_c-- > 0); //expected-note {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
   int local = 0;
   if (num == 0)       // expected-note {{Assuming 'num' is equal to 0}} 
                       // expected-note at -1 {{Taking true branch}}
@@ -59,7 +59,7 @@ int test_for_loop() {
   int num = 10;
   for (int i = 0;    // expected-note {{Loop condition is true.  Entering loop body}} 
                      // expected-note at -1 {{Loop condition is false. Execution continues on line 67}}
-       new int(10),  // expected-note {{Value assigned to 'num'}}
+       new int(10),  // expected-note {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
        i < flag_d;
        ++i) {         
     ++num;
@@ -73,7 +73,7 @@ int test_for_loop() {
 
 int test_for_range_loop() {
   int arr[10] = {0};
-  for(auto x : arr) { // expected-note {{Assigning value}} 
+  for(auto x : arr) { // expected-note {{Assigning value (invalidation as part of widen-loops made this symbolic here)}}
     ++x;
   }
   if (arr[0] == 0)   // expected-note {{Assuming the condition is true}} 
diff --git a/clang/test/Analysis/loop-widening.cpp b/clang/test/Analysis/loop-widening.cpp
index fbcb72dee160ae..0b05267c463fc3 100644
--- a/clang/test/Analysis/loop-widening.cpp
+++ b/clang/test/Analysis/loop-widening.cpp
@@ -16,7 +16,7 @@ void fn1() {
 
   for (;;) { // expected-note{{Loop condition is true.  Entering loop body}}
              // expected-note at -1{{Loop condition is true.  Entering loop body}}
-             // expected-note at -2{{Value assigned to 'b'}}
+             // expected-note at -2{{Value assigned to 'b' (invalidation as part of widen-loops made this symbolic here)}}
              // no crash during bug report construction
 
     g = !b;     // expected-note{{Assuming 'b' is null}}

``````````

</details>


https://github.com/llvm/llvm-project/pull/122398


More information about the cfe-commits mailing list