[clang] [analyzer] Untangle subcheckers of CStringChecker (PR #113312)

Kristóf Umann via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 22 06:28:33 PDT 2024


https://github.com/Szelethus created https://github.com/llvm/llvm-project/pull/113312

It turns out, that some checks for cstring functions happened as a side effect of other checks. For example, whether the arguments to memcpy were uninitialized happened during buffer overflow checking.

The way this was implemented is that if alpha.unix.cstring.OutOfBounds was disabled, alpha.unix.cstring.UninitializedRead couldn't emit any warnings. It turns out that major modeling steps are early-exited if a certain checker is disabled!

This patch moved the early returns to the report emission parts -- modeling still happens, only the bug report construction is omitted. This would mean that if we find a fatal error (like buffer overflow) we _should_ stop analysis even if we don't emit a warning (thats a part of doing modeling), but I decided against implementing that.

One hurdle is that CStringChecker is a dependency of MallocChecker, and the current tests rely on the CStringChecker _not_ terminating execution paths prematurely. Considering that the checkers that would do that are in alpha anyways, this doesn't seem to be an urgent step immediately.

I added FIXMEs to all tests would have failed if the patch sank the analysis at the fatal cstring function call, but didn't. I also added a new test case for buffers overlapping, but not being quite equal.

This patch might be NFC, but it shouldn't have much of an observable behaviour.

>From 306e0f0869582a9618f4c871200814c75bc34f56 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Krist=C3=B3f=20Umann?= <dkszelethus at gmail.com>
Date: Tue, 22 Oct 2024 11:18:37 +0200
Subject: [PATCH] [analyzer] Untangle subcheckers of CStringChecker

It turns out, that some checks for cstring functions happened as a side
effect of other checks. For example, whether the arguments to memcpy
were uninitialized happened during buffer overflow checking.

The way this was implemented is that if alpha.unix.cstring.OutOfBounds
was disabled, alpha.unix.cstring.UninitializedRead couldn't emit any
warnings. It turns out that major modeling steps are early-exited if a
certain checker is disabled!

This patch moved the early returns to the report emission parts --
modeling still happens, only the bug report construction is omitted.
This would mean that if we find a fatal error (like buffer overflow) we
_should_ stop analysis even if we don't emit a warning (thats a part of
doing modeling), but I decided against implementing that.

One hurdle is that CStringChecker is a dependency of MallocChecker, and
the current tests rely on the CStringChecker _not_ terminating execution
paths prematurely. Considering that the checkers that would do that are
in alpha anyways, this doesn't seem to be an urgent step immediately.

I added FIXMEs to all tests would have failed if the patch sank the
analysis at the fatal cstring function call, but didn't. I also added a
new test case for buffers overlapping, but not being quite equal.
---
 .../Checkers/CStringChecker.cpp               | 77 ++++++++++++++-----
 clang/test/Analysis/bstring.cpp               | 55 +++++++++++--
 clang/test/Analysis/malloc.c                  | 11 +++
 3 files changed, 118 insertions(+), 25 deletions(-)

diff --git a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
index 8dd08f14b2728b..6dd5863909e73c 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -579,8 +579,14 @@ ProgramStateRef CStringChecker::CheckLocation(CheckerContext &C,
     // These checks are either enabled by the CString out-of-bounds checker
     // explicitly or implicitly by the Malloc checker.
     // In the latter case we only do modeling but do not emit warning.
-    if (!Filter.CheckCStringOutOfBounds)
-      return nullptr;
+    // FIXME: We detected a fatal error here, we should stop analysis even if we
+    // chose not to emit a report here. However, as long as our out-of-bounds
+    // checker is in alpha, lets just pretend nothing happened.
+    if (!Filter.CheckCStringOutOfBounds) {
+      //C.addSink();
+      //return nullptr;
+      return state;
+    }
 
     // Emit a bug report.
     ErrorMessage Message =
@@ -614,10 +620,6 @@ CStringChecker::CheckBufferAccess(CheckerContext &C, ProgramStateRef State,
   if (!State)
     return nullptr;
 
-  // If out-of-bounds checking is turned off, skip the rest.
-  if (!Filter.CheckCStringOutOfBounds)
-    return State;
-
   SVal BufStart =
       svalBuilder.evalCast(BufVal, PtrTy, Buffer.Expression->getType());
 
@@ -665,8 +667,6 @@ ProgramStateRef CStringChecker::CheckOverlap(CheckerContext &C,
                                              SizeArgExpr Size, AnyArgExpr First,
                                              AnyArgExpr Second,
                                              CharKind CK) const {
-  if (!Filter.CheckCStringBufferOverlap)
-    return state;
 
   // Do a simple check for overlap: if the two arguments are from the same
   // buffer, see if the end of the first is greater than the start of the second
@@ -702,9 +702,17 @@ ProgramStateRef CStringChecker::CheckOverlap(CheckerContext &C,
       state->assume(svalBuilder.evalEQ(state, *firstLoc, *secondLoc));
 
   if (stateTrue && !stateFalse) {
-    // If the values are known to be equal, that's automatically an overlap.
-    emitOverlapBug(C, stateTrue, First.Expression, Second.Expression);
-    return nullptr;
+    if (Filter.CheckCStringBufferOverlap) {
+      // If the values are known to be equal, that's automatically an overlap.
+      emitOverlapBug(C, stateTrue, First.Expression, Second.Expression);
+      return nullptr;
+    }
+    // FIXME: We detected a fatal error here, we should stop analysis even if we
+    // chose not to emit a report here. However, as long as our overlap checker
+    // is in alpha, lets just pretend nothing happened.
+    //C.addSink();
+    //return nullptr;
+    return state;
   }
 
   // assume the two expressions are not equal.
@@ -768,9 +776,17 @@ ProgramStateRef CStringChecker::CheckOverlap(CheckerContext &C,
   std::tie(stateTrue, stateFalse) = state->assume(*OverlapTest);
 
   if (stateTrue && !stateFalse) {
-    // Overlap!
-    emitOverlapBug(C, stateTrue, First.Expression, Second.Expression);
-    return nullptr;
+    if (Filter.CheckCStringBufferOverlap) {
+      // Overlap!
+      emitOverlapBug(C, stateTrue, First.Expression, Second.Expression);
+      return nullptr;
+    }
+    // FIXME: We detected a fatal error here, we should stop analysis even if we
+    // chose not to emit a report here. However, as long as our overlap checker
+    // is in alpha, lets just pretend nothing happened.
+    //C.addSink();
+    //return nullptr;
+    return state;
   }
 
   // assume the two expressions don't overlap.
@@ -780,6 +796,8 @@ ProgramStateRef CStringChecker::CheckOverlap(CheckerContext &C,
 
 void CStringChecker::emitOverlapBug(CheckerContext &C, ProgramStateRef state,
                                   const Stmt *First, const Stmt *Second) const {
+  assert(Filter.CheckCStringBufferOverlap &&
+         "Can't emit from a checker that is not enabled!");
   ExplodedNode *N = C.generateErrorNode(state);
   if (!N)
     return;
@@ -799,6 +817,8 @@ void CStringChecker::emitOverlapBug(CheckerContext &C, ProgramStateRef state,
 
 void CStringChecker::emitNullArgBug(CheckerContext &C, ProgramStateRef State,
                                     const Stmt *S, StringRef WarningMsg) const {
+  assert(Filter.CheckCStringNullArg &&
+         "Can't emit from a checker that is not enabled!");
   if (ExplodedNode *N = C.generateErrorNode(State)) {
     if (!BT_Null) {
       // FIXME: This call uses the string constant 'categories::UnixAPI' as the
@@ -820,6 +840,8 @@ void CStringChecker::emitUninitializedReadBug(CheckerContext &C,
                                               ProgramStateRef State,
                                               const Expr *E,
                                               StringRef Msg) const {
+  assert(Filter.CheckCStringUninitializedRead &&
+         "Can't emit from a checker that is not enabled!");
   if (ExplodedNode *N = C.generateErrorNode(State)) {
     if (!BT_UninitRead)
       BT_UninitRead.reset(new BugType(Filter.CheckNameCStringUninitializedRead,
@@ -838,8 +860,15 @@ void CStringChecker::emitUninitializedReadBug(CheckerContext &C,
 void CStringChecker::emitOutOfBoundsBug(CheckerContext &C,
                                         ProgramStateRef State, const Stmt *S,
                                         StringRef WarningMsg) const {
+  // FIXME: This is absurd. If a checker is disabled, we just emit the bug
+  // under another name?
+  assert((Filter.CheckCStringOutOfBounds || Filter.CheckCStringNullArg)  &&
+         "Can't emit from a checker that is not enabled!");
+
   if (ExplodedNode *N = C.generateErrorNode(State)) {
     if (!BT_Bounds)
+      // FIXME: This is absurd. If a checker is disabled, we just emit the bug
+      // under another name?
       BT_Bounds.reset(new BugType(Filter.CheckCStringOutOfBounds
                                       ? Filter.CheckNameCStringOutOfBounds
                                       : Filter.CheckNameCStringNullArg,
@@ -858,6 +887,9 @@ void CStringChecker::emitOutOfBoundsBug(CheckerContext &C,
 void CStringChecker::emitNotCStringBug(CheckerContext &C, ProgramStateRef State,
                                        const Stmt *S,
                                        StringRef WarningMsg) const {
+  assert(Filter.CheckCStringNotNullTerm &&
+         "Can't emit from a checker that is not enabled!");
+
   if (ExplodedNode *N = C.generateNonFatalErrorNode(State)) {
     if (!BT_NotCString) {
       // FIXME: This call uses the string constant 'categories::UnixAPI' as the
@@ -876,6 +908,8 @@ void CStringChecker::emitNotCStringBug(CheckerContext &C, ProgramStateRef State,
 
 void CStringChecker::emitAdditionOverflowBug(CheckerContext &C,
                                              ProgramStateRef State) const {
+  assert(Filter.CheckCStringOutOfBounds &&
+         "Can't emit from a checker that is not enabled!");
   if (ExplodedNode *N = C.generateErrorNode(State)) {
     if (!BT_AdditionOverflow) {
       // FIXME: This call uses the word "API" as the description of the bug;
@@ -902,10 +936,6 @@ ProgramStateRef CStringChecker::checkAdditionOverflow(CheckerContext &C,
                                                      ProgramStateRef state,
                                                      NonLoc left,
                                                      NonLoc right) const {
-  // If out-of-bounds checking is turned off, skip the rest.
-  if (!Filter.CheckCStringOutOfBounds)
-    return state;
-
   // If a previous check has failed, propagate the failure.
   if (!state)
     return nullptr;
@@ -941,8 +971,15 @@ ProgramStateRef CStringChecker::checkAdditionOverflow(CheckerContext &C,
 
     if (stateOverflow && !stateOkay) {
       // We have an overflow. Emit a bug report.
-      emitAdditionOverflowBug(C, stateOverflow);
-      return nullptr;
+      if (Filter.CheckCStringOutOfBounds)
+        emitAdditionOverflowBug(C, stateOverflow);
+
+      // FIXME: We detected a fatal error here, we should stop analysis even if we
+      // chose not to emit a report here. However, as long as our overlap checker
+      // is in alpha, lets just pretend nothing happened.
+      //C.addSink();
+      //return nullptr;
+      return state;
     }
 
     // From now on, assume an overflow didn't occur.
diff --git a/clang/test/Analysis/bstring.cpp b/clang/test/Analysis/bstring.cpp
index 1b6397c3455ebd..cf364e06b53121 100644
--- a/clang/test/Analysis/bstring.cpp
+++ b/clang/test/Analysis/bstring.cpp
@@ -1,8 +1,43 @@
-// RUN: %clang_analyze_cc1 -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -verify -analyzer-config eagerly-assume=false %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring,debug.ExprInspection -verify -analyzer-config eagerly-assume=false %s
-// RUN: %clang_analyze_cc1 -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,unix.Malloc,debug.ExprInspection -verify -analyzer-config eagerly-assume=false %s
-// RUN: %clang_analyze_cc1 -DUSE_BUILTINS -DVARIANT -analyzer-checker=core,unix.cstring,alpha.unix.cstring,unix.Malloc,debug.ExprInspection -verify -analyzer-config eagerly-assume=false %s
-// RUN: %clang_analyze_cc1 -DSUPPRESS_OUT_OF_BOUND -analyzer-checker=core,unix.cstring,unix.Malloc,alpha.unix.cstring.BufferOverlap,alpha.unix.cstring.NotNullTerminated,debug.ExprInspection -verify -analyzer-config eagerly-assume=false %s
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -analyzer-checker=core \
+// RUN:   -analyzer-checker=unix.cstring \
+// RUN:   -analyzer-checker=unix.Malloc \
+// RUN:   -analyzer-checker=alpha.unix.cstring \
+// RUN:   -analyzer-checker=debug.ExprInspection \
+// RUN:   -analyzer-config eagerly-assume=false
+
+// RUN: %clang_analyze_cc1 -verify %s -DUSE_BUILTINS \
+// RUN:   -analyzer-checker=core \
+// RUN:   -analyzer-checker=unix.cstring \
+// RUN:   -analyzer-checker=unix.Malloc \
+// RUN:   -analyzer-checker=alpha.unix.cstring \
+// RUN:   -analyzer-checker=debug.ExprInspection \
+// RUN:   -analyzer-config eagerly-assume=false
+
+// RUN: %clang_analyze_cc1 -verify %s -DVARIANT \
+// RUN:   -analyzer-checker=core \
+// RUN:   -analyzer-checker=unix.cstring \
+// RUN:   -analyzer-checker=alpha.unix.cstring \
+// RUN:   -analyzer-checker=unix.Malloc \
+// RUN:   -analyzer-checker=debug.ExprInspection \
+// RUN:   -analyzer-config eagerly-assume=false
+
+// RUN: %clang_analyze_cc1 -verify %s -DUSE_BUILTINS -DVARIANT \
+// RUN:   -analyzer-checker=core \
+// RUN:   -analyzer-checker=unix.cstring \
+// RUN:   -analyzer-checker=alpha.unix.cstring \
+// RUN:   -analyzer-checker=unix.Malloc \
+// RUN:   -analyzer-checker=debug.ExprInspection \
+// RUN:   -analyzer-config eagerly-assume=false
+
+// RUN: %clang_analyze_cc1 -verify %s -DSUPPRESS_OUT_OF_BOUND \
+// RUN:   -analyzer-checker=core \
+// RUN:   -analyzer-checker=unix.cstring \
+// RUN:   -analyzer-checker=unix.Malloc \
+// RUN:   -analyzer-checker=alpha.unix.cstring.BufferOverlap \
+// RUN:   -analyzer-checker=alpha.unix.cstring.NotNullTerminated \
+// RUN:   -analyzer-checker=debug.ExprInspection \
+// RUN:   -analyzer-config eagerly-assume=false
 
 #include "Inputs/system-header-simulator-cxx.h"
 #include "Inputs/system-header-simulator-for-malloc.h"
@@ -103,6 +138,8 @@ void memset1_inheritance() {
 #ifdef SUPPRESS_OUT_OF_BOUND
 void memset2_inheritance_field() {
   Derived d;
+  // FIXME: The analyzer should stop analysis after memset. The argument to
+  // sizeof should be Derived::d_mem.
   memset(&d.d_mem, 0, sizeof(Derived));
   clang_analyzer_eval(d.b_mem == 0); // expected-warning{{UNKNOWN}}
   clang_analyzer_eval(d.d_mem == 0); // expected-warning{{UNKNOWN}}
@@ -110,6 +147,8 @@ void memset2_inheritance_field() {
 
 void memset3_inheritance_field() {
   Derived d;
+  // FIXME: The analyzer should stop analysis after memset. The argument to
+  // sizeof should be Derived::b_mem.
   memset(&d.b_mem, 0, sizeof(Derived));
   clang_analyzer_eval(d.b_mem == 0); // expected-warning{{TRUE}}
   clang_analyzer_eval(d.d_mem == 0); // expected-warning{{TRUE}}
@@ -176,6 +215,8 @@ class DerivedVirtual : public BaseVirtual {
 #ifdef SUPPRESS_OUT_OF_BOUND
 void memset8_virtual_inheritance_field() {
   DerivedVirtual d;
+  // FIXME: The analyzer should stop analysis after memset. The argument to
+  // sizeof should be Derived::b_mem.
   memset(&d.b_mem, 0, sizeof(Derived));
   clang_analyzer_eval(d.b_mem == 0); // expected-warning{{UNKNOWN}}
   clang_analyzer_eval(d.d_mem == 0); // expected-warning{{UNKNOWN}}
@@ -188,6 +229,10 @@ void memset1_new_array() {
   int *array = new int[10];
   memset(array, 0, 10 * sizeof(int));
   clang_analyzer_eval(array[2] == 0); // expected-warning{{TRUE}}
+  // FIXME: The analyzer should stop analysis after memset. Maybe the intent of
+  // this test was to test for this as a desired behaviour, but it shouldn't be,
+  // going out-of-bounds with memset is a fatal error, even if we decide not to
+  // report it.
   memset(array + 1, 'a', 10 * sizeof(9));
   clang_analyzer_eval(array[2] == 0); // expected-warning{{UNKNOWN}}
   delete[] array;
diff --git a/clang/test/Analysis/malloc.c b/clang/test/Analysis/malloc.c
index 9c7ca43bfbc5af..d6c9bb39abc326 100644
--- a/clang/test/Analysis/malloc.c
+++ b/clang/test/Analysis/malloc.c
@@ -1095,12 +1095,23 @@ void doNotInvalidateWhenPassedToSystemCalls(char *s) {
   strlen(p);
   strcpy(p, s);
   strcpy(s, p);
+  // FIXME: We should stop analysis here, even if we emit no warnings, since
+  // overlapping buffers for strycpy is a fatal error.
   strcpy(p, p);
   memcpy(p, s, 1);
   memcpy(s, p, 1);
   memcpy(p, p, 1);
 } // expected-warning {{leak}}
 
+void doNotInvalidateWhenPassedToSystemCalls2(char *s) {
+  char *p = malloc(12);
+  // FIXME: We should stop analysis here, even if we emit no warnings, since
+  // overlapping buffers for strycpy is a fatal error.
+  int a[4] = {0};
+  memcpy(a+2, a+1, 8);
+  (void)p;
+} // expected-warning {{leak}}
+
 // Treat source buffer contents as escaped.
 void escapeSourceContents(char *s) {
   char *p = malloc(12);



More information about the cfe-commits mailing list