[clang] [analyzer] Widen loops: Augment assignment note when values invalidated at the start of a loop (PR #122398)
via cfe-commits
cfe-commits at lists.llvm.org
Thu Jan 9 17:33:17 PST 2025
https://github.com/dgg5503 created https://github.com/llvm/llvm-project/pull/122398
This PR attempts to improve the value assignment diagnostic specifically for the case where `widen-loops` is enabled during Clang static analysis.
The motivation behind this diagnostic is the fact that the current diagnostic is a bit confusing at first glance. For example:
```cpp
class A {
public:
void m_fn1();
};
void fn1() {
A a;
A *b = &a;
for (;;) {
(void)!b;
b->m_fn1();
}
}
```
```
> clang -cc1 -internal-isystem C:\...\lib\clang\include -nostdsysteminc -analyze -analyzer-constraints=range -setup-static-analyzer -analyzer-checker=core -analyzer-max-loop 2 -analyzer-config widen-loops=true -analyzer-output=text loop-widening-notes-best.cpp
loop-widening-notes-best.cpp:10:5: warning: Called C++ object pointer is null [core.CallAndMessage]
10 | b->m_fn1();
| ^
loop-widening-notes-best.cpp:8:3: note: Loop condition is true. Entering loop body
8 | for (;;) {
| ^
loop-widening-notes-best.cpp:8:3: note: Value assigned to 'b'
loop-widening-notes-best.cpp:8:3: note: Loop condition is true. Entering loop body
loop-widening-notes-best.cpp:9:11: note: Assuming 'b' is null
9 | (void)!b;
| ^~
loop-widening-notes-best.cpp:10:5: note: Called C++ object pointer is null
10 | b->m_fn1();
| ^
1 warning generated.
```
The message `loop-widening-notes-best.cpp:8:3: note: Value assigned to 'b'` appearing where it is makes technical sense if you understand what `widen-loops` does behind the scenes. In short, `widen-loops` invalidates nearly all values in the current state before the start of the loop. At least in this example, the variable `b` is invalidated at the start of the for loop and, as part of the process of invalidation, is reassigned a conjured symbol hence `Value assigned to 'b'`. Without that knowledge, it is not intuitive from the diagnostic why the assignment message appears where it does.
In such cases, I'd like to make the diagnostic a bit more verbose. I propose something like:
```
loop-widening-notes-best.cpp:8:3: note: Value assigned to 'b' (invalidation as part of widen-loops made this symbolic here)
```
This indicates to the user that `b` has been invalidated and turned into a symbolic representation because of actions taken by `widen-loops`.
The implementation I present in this PR minimally passes all Clang static analysis LIT tests, however, I am not confident the approach I've taken is idiomatic with respect to the design of Clang's static analysis. I have what I think is a slightly more general solution [here](https://github.com/llvm/llvm-project/compare/main...dgg5503:llvm-project:dgg5503/main/invalidation-diagnostic-b) where `MemRegion` invalidations are tracked for each `ProgramState`, but it also has its own pitfalls (see TODO comments in the [diff](https://github.com/llvm/llvm-project/compare/main...dgg5503:llvm-project:dgg5503/main/invalidation-diagnostic-b)). I'd be curious to know if there's a preference in either approach, or, if a different approach should be taken.
I am open to any and all suggestions as my knowledge in the Clang static analyzer is limited.
>From f42c02647f1ecc7ae5cdbda56ab9a5bd6495a0e1 Mon Sep 17 00:00:00 2001
From: Douglas Gliner <Douglas.Gliner at sony.com>
Date: Thu, 9 Jan 2025 11:27:26 -0800
Subject: [PATCH] [analyzer] Widen loops: Augment assignment note when values
invalidated at the start of a loop
This patch augments the assignment note to inform the user when a value
is invalidated and reassigned a conjured symbol as part of loop
widening.
---
.../Core/PathSensitive/ExprEngine.h | 4 ++
.../Core/BugReporterVisitors.cpp | 43 +++++++++++++++++++
clang/lib/StaticAnalyzer/Core/ExprEngine.cpp | 7 ++-
clang/test/Analysis/loop-widening-notes.cpp | 10 ++---
clang/test/Analysis/loop-widening.cpp | 2 +-
5 files changed, 59 insertions(+), 7 deletions(-)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
index 20c446e33ef9a5..6017b7e0ea8f81 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/ExprEngine.h
@@ -290,6 +290,10 @@ class ExprEngine {
/// This tag applies to a node created after removeDead.
static const ProgramPointTag *cleanupNodeTag();
+ /// A tag to track when loop widening occurs.
+ /// This tag applies to a node created after getWidenedLoopState.
+ static const ProgramPointTag *loopWideningNodeTag();
+
/// processCFGElement - Called by CoreEngine. Used to generate new successor
/// nodes by processing the 'effects' of a CFG element.
void processCFGElement(const CFGElement E, ExplodedNode *Pred,
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
index a9b4dbb39b5bd6..53494b5cf9cca8 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
@@ -1353,6 +1353,49 @@ static void showBRDefaultDiagnostics(llvm::raw_svector_ostream &OS,
OS << " to ";
SI.Dest->printPretty(OS);
}
+
+ // If the value was part of a loop widening node and its value was
+ // invalidated (i.e. replaced with a conjured symbol) then let the user know
+ // that it was due to loop widening.
+ if (SI.StoreSite && SI.Dest &&
+ SI.StoreSite->getLocation().getTag() ==
+ ExprEngine::loopWideningNodeTag()) {
+
+ // TODO: Is it always the case that there's only one predecessor?
+ assert(SI.StoreSite->hasSinglePred() &&
+ "more than one predecessor found, this needs to be handled...");
+ if (const ExplodedNode *previous = SI.StoreSite->getFirstPred()) {
+ // Was the associated memory region for this variable changed from
+ // non-Symbolic to Symbolic because of invalidation?
+ // TODO: Better yet, if there was simply a way to know if a given
+ // SVal / MemRegion was invalidated as part of the current state,
+ // then that should be more robust than what's going on here.
+ // Could we somehow make use of "RegionChanges" /
+ // "runCheckersForRegionChanges" and the ExplicitRegions parameter.
+ // Still need to somehow know when a particular Global
+ // Variable is invalidated. I have not found this to be possible with
+ // "RegionChanges" unless I'm missing something...
+ const ProgramState *lastState = previous->getState().get();
+ const SVal &lastSVal = lastState->getSVal(SI.Dest);
+ const SymbolRef valueSymbol = SI.Value.getAsSymbol(true);
+ const SymbolRef lastSymbol = lastSVal.getAsSymbol(true);
+
+ bool isNowSymbolic = false;
+ if (valueSymbol) {
+ if (!lastSymbol) {
+ // Last state did not have a symbol for SI.Value in current state.
+ // Probably (?) invalidated.
+ isNowSymbolic = true;
+ } else {
+ // If the symbol types differ, then there was likely invalidation
+ isNowSymbolic = valueSymbol->getKind() != lastSymbol->getKind();
+ }
+ }
+
+ if (isNowSymbolic)
+ OS << " (invalidation as part of widen-loops made this symbolic here)";
+ }
+ }
}
static bool isTrivialCopyOrMoveCtor(const CXXConstructExpr *CE) {
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
index ff8bdcea9a2201..4660edbf3dde78 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngine.cpp
@@ -1109,6 +1109,11 @@ const ProgramPointTag *ExprEngine::cleanupNodeTag() {
return &cleanupTag;
}
+const ProgramPointTag *ExprEngine::loopWideningNodeTag() {
+ static SimpleProgramPointTag loopWideningTag(TagProviderName, "Widen Loop");
+ return &loopWideningTag;
+}
+
void ExprEngine::ProcessStmt(const Stmt *currStmt, ExplodedNode *Pred) {
// Reclaim any unnecessary nodes in the ExplodedGraph.
G.reclaimRecentlyAllocatedNodes();
@@ -2546,7 +2551,7 @@ void ExprEngine::processCFGBlockEntrance(const BlockEdge &L,
const LocationContext *LCtx = Pred->getLocationContext();
ProgramStateRef WidenedState =
getWidenedLoopState(Pred->getState(), LCtx, BlockCount, Term);
- nodeBuilder.generateNode(WidenedState, Pred);
+ nodeBuilder.generateNode(WidenedState, Pred, loopWideningNodeTag());
return;
}
diff --git a/clang/test/Analysis/loop-widening-notes.cpp b/clang/test/Analysis/loop-widening-notes.cpp
index a3f030dfe98826..ac98d81c23475e 100644
--- a/clang/test/Analysis/loop-widening-notes.cpp
+++ b/clang/test/Analysis/loop-widening-notes.cpp
@@ -9,7 +9,7 @@ int test_for_bug_25609() {
bar();
for (int i = 0; // expected-note {{Loop condition is true. Entering loop body}}
// expected-note at -1 {{Loop condition is false. Execution continues on line 16}}
- ++i, // expected-note {{Value assigned to 'p_a'}}
+ ++i, // expected-note {{Value assigned to 'p_a' (invalidation as part of widen-loops made this symbolic here)}}
i < flag_a;
++i) {}
@@ -23,7 +23,7 @@ int while_analyzer_output() {
flag_b = 100;
int num = 10;
while (flag_b-- > 0) { // expected-note {{Loop condition is true. Entering loop body}}
- // expected-note at -1 {{Value assigned to 'num'}}
+ // expected-note at -1 {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
// expected-note at -2 {{Loop condition is false. Execution continues on line 30}}
num = flag_b;
}
@@ -45,7 +45,7 @@ int do_while_analyzer_output() {
do { // expected-note {{Loop condition is true. Execution continues on line 47}}
// expected-note at -1 {{Loop condition is false. Exiting loop}}
num--;
- } while (flag_c-- > 0); //expected-note {{Value assigned to 'num'}}
+ } while (flag_c-- > 0); //expected-note {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
int local = 0;
if (num == 0) // expected-note {{Assuming 'num' is equal to 0}}
// expected-note at -1 {{Taking true branch}}
@@ -59,7 +59,7 @@ int test_for_loop() {
int num = 10;
for (int i = 0; // expected-note {{Loop condition is true. Entering loop body}}
// expected-note at -1 {{Loop condition is false. Execution continues on line 67}}
- new int(10), // expected-note {{Value assigned to 'num'}}
+ new int(10), // expected-note {{Value assigned to 'num' (invalidation as part of widen-loops made this symbolic here)}}
i < flag_d;
++i) {
++num;
@@ -73,7 +73,7 @@ int test_for_loop() {
int test_for_range_loop() {
int arr[10] = {0};
- for(auto x : arr) { // expected-note {{Assigning value}}
+ for(auto x : arr) { // expected-note {{Assigning value (invalidation as part of widen-loops made this symbolic here)}}
++x;
}
if (arr[0] == 0) // expected-note {{Assuming the condition is true}}
diff --git a/clang/test/Analysis/loop-widening.cpp b/clang/test/Analysis/loop-widening.cpp
index fbcb72dee160ae..0b05267c463fc3 100644
--- a/clang/test/Analysis/loop-widening.cpp
+++ b/clang/test/Analysis/loop-widening.cpp
@@ -16,7 +16,7 @@ void fn1() {
for (;;) { // expected-note{{Loop condition is true. Entering loop body}}
// expected-note at -1{{Loop condition is true. Entering loop body}}
- // expected-note at -2{{Value assigned to 'b'}}
+ // expected-note at -2{{Value assigned to 'b' (invalidation as part of widen-loops made this symbolic here)}}
// no crash during bug report construction
g = !b; // expected-note{{Assuming 'b' is null}}
More information about the cfe-commits
mailing list