[clang] 752e103 - [analyzer] Explicitly register NoStoreFuncVisitor from alpha.unix.cst… (#108373)

via cfe-commits cfe-commits at lists.llvm.org
Thu Sep 19 01:04:50 PDT 2024


Author: Kristóf Umann
Date: 2024-09-19T10:04:47+02:00
New Revision: 752e10379c2ffb4f6eebf490f1fab7eb769dfbf6

URL: https://github.com/llvm/llvm-project/commit/752e10379c2ffb4f6eebf490f1fab7eb769dfbf6
DIFF: https://github.com/llvm/llvm-project/commit/752e10379c2ffb4f6eebf490f1fab7eb769dfbf6.diff

LOG: [analyzer] Explicitly register NoStoreFuncVisitor from alpha.unix.cst… (#108373)

…ring.UninitRead

This is a drastic simplification of #106982. If you read that patch,
this is the same thing with all BugReporterVisitors.cpp and
SValBuilder.cpp changes removed! (since all replies came regarding
changed to those files, I felt the new PR was justified)

The patch was inspired by a pretty poor bug report on FFMpeg:

![image](https://github.com/user-attachments/assets/8f4e03d8-45a4-4ea2-a63d-3ab78d097be9)

In this bug report, block is uninitialized, hence the bug report that it
should not have been passed to memcpy. The confusing part is in line 93,
where block was passed as a non-const pointer to seq_unpack_rle_block,
which was obviously meant to initialize block. As developers, we know
that clang likely didn't skip this function and found a path of
execution on which this initialization failed, but NoStoreFuncVisitor
failed to attach the usual "returning without writing to block" message.

I fixed this by instead of tracking the entire array, I tracked the
actual element which was found to be uninitialized (Remember, we
heuristically only check if the first and last-to-access element is
initialized, not the entire array). This is how the bug report looks
now, with 'seq_unpack_rle_block' having notes describing the path of
execution and lack of a value change:

![image](https://github.com/user-attachments/assets/8de5d101-052e-4ecb-9cd9-7c29724333d2)

![image](https://github.com/user-attachments/assets/8bf52a95-62de-44e7-aef8-03a46a3fa08e)

Since NoStoreFuncVisitor was a TU-local class, I moved it back to
BugReporterVisitors.h, and registered it manually in CStringChecker.cpp.
This was done because we don't have a good trackRegionValue() function,
only a trackExpressionValue() function. We have an expression for the
array, but not for its first (or last-to-access) element, so I only had
a MemRegion on hand.

Added: 
    clang/test/Analysis/cstring-uninitread-notes.c

Modified: 
    clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
    clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
    clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
    clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
    clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
    clang/lib/StaticAnalyzer/Core/SVals.cpp

Removed: 
    


################################################################################
diff  --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
index f97514955a5913..56f7ca63d00621 100644
--- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
+++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h
@@ -718,6 +718,91 @@ class NoStateChangeFuncVisitor : public BugReporterVisitor {
                                    PathSensitiveBugReport &R) final;
 };
 
+/// Put a diagnostic on return statement of all inlined functions
+/// for which  the region of interest \p RegionOfInterest was passed into,
+/// but not written inside, and it has caused an undefined read or a null
+/// pointer dereference outside.
+class NoStoreFuncVisitor final : public NoStateChangeFuncVisitor {
+  const SubRegion *RegionOfInterest;
+  MemRegionManager &MmrMgr;
+  const SourceManager &SM;
+  const PrintingPolicy &PP;
+
+  /// Recursion limit for dereferencing fields when looking for the
+  /// region of interest.
+  /// The limit of two indicates that we will dereference fields only once.
+  static const unsigned DEREFERENCE_LIMIT = 2;
+
+  using RegionVector = SmallVector<const MemRegion *, 5>;
+
+public:
+  NoStoreFuncVisitor(
+      const SubRegion *R,
+      bugreporter::TrackingKind TKind = bugreporter::TrackingKind::Thorough)
+      : NoStateChangeFuncVisitor(TKind), RegionOfInterest(R),
+        MmrMgr(R->getMemRegionManager()),
+        SM(MmrMgr.getContext().getSourceManager()),
+        PP(MmrMgr.getContext().getPrintingPolicy()) {}
+
+  void Profile(llvm::FoldingSetNodeID &ID) const override {
+    static int Tag = 0;
+    ID.AddPointer(&Tag);
+    ID.AddPointer(RegionOfInterest);
+  }
+
+private:
+  /// \return Whether \c RegionOfInterest was modified at \p CurrN compared to
+  /// the value it holds in \p CallExitBeginN.
+  bool wasModifiedBeforeCallExit(const ExplodedNode *CurrN,
+                                 const ExplodedNode *CallExitBeginN) override;
+
+  /// Attempts to find the region of interest in a given record decl,
+  /// by either following the base classes or fields.
+  /// Dereferences fields up to a given recursion limit.
+  /// Note that \p Vec is passed by value, leading to quadratic copying cost,
+  /// but it's OK in practice since its length is limited to DEREFERENCE_LIMIT.
+  /// \return A chain fields leading to the region of interest or std::nullopt.
+  const std::optional<RegionVector>
+  findRegionOfInterestInRecord(const RecordDecl *RD, ProgramStateRef State,
+                               const MemRegion *R, const RegionVector &Vec = {},
+                               int depth = 0);
+
+  // Region of interest corresponds to an IVar, exiting a method
+  // which could have written into that IVar, but did not.
+  PathDiagnosticPieceRef maybeEmitNoteForObjCSelf(PathSensitiveBugReport &R,
+                                                  const ObjCMethodCall &Call,
+                                                  const ExplodedNode *N) final;
+
+  PathDiagnosticPieceRef maybeEmitNoteForCXXThis(PathSensitiveBugReport &R,
+                                                 const CXXConstructorCall &Call,
+                                                 const ExplodedNode *N) final;
+
+  PathDiagnosticPieceRef
+  maybeEmitNoteForParameters(PathSensitiveBugReport &R, const CallEvent &Call,
+                             const ExplodedNode *N) final;
+
+  /// Consume the information on the no-store stack frame in order to
+  /// either emit a note or suppress the report entirely.
+  /// \return Diagnostics piece for region not modified in the current function,
+  /// if it decides to emit one.
+  PathDiagnosticPieceRef
+  maybeEmitNote(PathSensitiveBugReport &R, const CallEvent &Call,
+                const ExplodedNode *N, const RegionVector &FieldChain,
+                const MemRegion *MatchedRegion, StringRef FirstElement,
+                bool FirstIsReferenceType, unsigned IndirectionLevel);
+
+  bool prettyPrintRegionName(const RegionVector &FieldChain,
+                             const MemRegion *MatchedRegion,
+                             StringRef FirstElement, bool FirstIsReferenceType,
+                             unsigned IndirectionLevel,
+                             llvm::raw_svector_ostream &os);
+
+  StringRef prettyPrintFirstElement(StringRef FirstElement,
+                                    bool MoreItemsExpected,
+                                    int IndirectionLevel,
+                                    llvm::raw_svector_ostream &os);
+};
+
 } // namespace ento
 } // namespace clang
 

diff  --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
index 0365f9e41312df..168983fd5cb686 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
@@ -69,6 +69,7 @@ class CheckerContext {
   /// the state of the program before the checker ran. Note, checkers should
   /// not retain the node in their state since the nodes might get invalidated.
   ExplodedNode *getPredecessor() { return Pred; }
+  const ProgramPoint getLocation() const { return Location; }
   const ProgramStateRef &getState() const { return Pred->getState(); }
 
   /// Check if the checker changed the state of the execution; ex: added

diff  --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
index def2970d448d48..a054a819a15a85 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SVals.h
@@ -89,6 +89,8 @@ class SVal {
 
   SValKind getKind() const { return Kind; }
 
+  StringRef getKindStr() const;
+
   // This method is required for using SVal in a FoldingSetNode.  It
   // extracts a unique signature for this SVal object.
   void Profile(llvm::FoldingSetNodeID &ID) const {

diff  --git a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
index 8dd08f14b2728b..21a2d8828249d1 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -16,6 +16,7 @@
 #include "clang/Basic/Builtins.h"
 #include "clang/Basic/CharInfo.h"
 #include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/BugReporter/BugReporterVisitors.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugType.h"
 #include "clang/StaticAnalyzer/Core/Checker.h"
 #include "clang/StaticAnalyzer/Core/CheckerManager.h"
@@ -337,7 +338,8 @@ class CStringChecker : public Checker< eval::Call,
                          const Stmt *S, StringRef WarningMsg) const;
   void emitAdditionOverflowBug(CheckerContext &C, ProgramStateRef State) const;
   void emitUninitializedReadBug(CheckerContext &C, ProgramStateRef State,
-                                const Expr *E, StringRef Msg) const;
+                                const Expr *E, const MemRegion *R,
+                                StringRef Msg) const;
   ProgramStateRef checkAdditionOverflow(CheckerContext &C,
                                             ProgramStateRef state,
                                             NonLoc left,
@@ -474,7 +476,8 @@ ProgramStateRef CStringChecker::checkInit(CheckerContext &C,
     OS << "The first element of the ";
     printIdxWithOrdinalSuffix(OS, Buffer.ArgumentIndex + 1);
     OS << " argument is undefined";
-    emitUninitializedReadBug(C, State, Buffer.Expression, OS.str());
+    emitUninitializedReadBug(C, State, Buffer.Expression,
+                             FirstElementVal->getAsRegion(), OS.str());
     return nullptr;
   }
 
@@ -538,7 +541,8 @@ ProgramStateRef CStringChecker::checkInit(CheckerContext &C,
     OS << ") in the ";
     printIdxWithOrdinalSuffix(OS, Buffer.ArgumentIndex + 1);
     OS << " argument is undefined";
-    emitUninitializedReadBug(C, State, Buffer.Expression, OS.str());
+    emitUninitializedReadBug(C, State, Buffer.Expression,
+                             LastElementVal.getAsRegion(), OS.str());
     return nullptr;
   }
   return State;
@@ -818,7 +822,7 @@ void CStringChecker::emitNullArgBug(CheckerContext &C, ProgramStateRef State,
 
 void CStringChecker::emitUninitializedReadBug(CheckerContext &C,
                                               ProgramStateRef State,
-                                              const Expr *E,
+                                              const Expr *E, const MemRegion *R,
                                               StringRef Msg) const {
   if (ExplodedNode *N = C.generateErrorNode(State)) {
     if (!BT_UninitRead)
@@ -831,6 +835,7 @@ void CStringChecker::emitUninitializedReadBug(CheckerContext &C,
                     Report->getLocation());
     Report->addRange(E->getSourceRange());
     bugreporter::trackExpressionValue(N, E, *Report);
+    Report->addVisitor<NoStoreFuncVisitor>(R->castAs<SubRegion>());
     C.emitReport(std::move(Report));
   }
 }

diff  --git a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
index 7102bf51a57e8b..68c8a8dc682507 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporterVisitors.cpp
@@ -522,95 +522,6 @@ PathDiagnosticPieceRef NoStateChangeFuncVisitor::VisitNode(
   return maybeEmitNoteForParameters(R, *Call, N);
 }
 
-//===----------------------------------------------------------------------===//
-// Implementation of NoStoreFuncVisitor.
-//===----------------------------------------------------------------------===//
-
-namespace {
-/// Put a diagnostic on return statement of all inlined functions
-/// for which  the region of interest \p RegionOfInterest was passed into,
-/// but not written inside, and it has caused an undefined read or a null
-/// pointer dereference outside.
-class NoStoreFuncVisitor final : public NoStateChangeFuncVisitor {
-  const SubRegion *RegionOfInterest;
-  MemRegionManager &MmrMgr;
-  const SourceManager &SM;
-  const PrintingPolicy &PP;
-
-  /// Recursion limit for dereferencing fields when looking for the
-  /// region of interest.
-  /// The limit of two indicates that we will dereference fields only once.
-  static const unsigned DEREFERENCE_LIMIT = 2;
-
-  using RegionVector = SmallVector<const MemRegion *, 5>;
-
-public:
-  NoStoreFuncVisitor(const SubRegion *R, bugreporter::TrackingKind TKind)
-      : NoStateChangeFuncVisitor(TKind), RegionOfInterest(R),
-        MmrMgr(R->getMemRegionManager()),
-        SM(MmrMgr.getContext().getSourceManager()),
-        PP(MmrMgr.getContext().getPrintingPolicy()) {}
-
-  void Profile(llvm::FoldingSetNodeID &ID) const override {
-    static int Tag = 0;
-    ID.AddPointer(&Tag);
-    ID.AddPointer(RegionOfInterest);
-  }
-
-private:
-  /// \return Whether \c RegionOfInterest was modified at \p CurrN compared to
-  /// the value it holds in \p CallExitBeginN.
-  bool wasModifiedBeforeCallExit(const ExplodedNode *CurrN,
-                                 const ExplodedNode *CallExitBeginN) override;
-
-  /// Attempts to find the region of interest in a given record decl,
-  /// by either following the base classes or fields.
-  /// Dereferences fields up to a given recursion limit.
-  /// Note that \p Vec is passed by value, leading to quadratic copying cost,
-  /// but it's OK in practice since its length is limited to DEREFERENCE_LIMIT.
-  /// \return A chain fields leading to the region of interest or std::nullopt.
-  const std::optional<RegionVector>
-  findRegionOfInterestInRecord(const RecordDecl *RD, ProgramStateRef State,
-                               const MemRegion *R, const RegionVector &Vec = {},
-                               int depth = 0);
-
-  // Region of interest corresponds to an IVar, exiting a method
-  // which could have written into that IVar, but did not.
-  PathDiagnosticPieceRef maybeEmitNoteForObjCSelf(PathSensitiveBugReport &R,
-                                                  const ObjCMethodCall &Call,
-                                                  const ExplodedNode *N) final;
-
-  PathDiagnosticPieceRef maybeEmitNoteForCXXThis(PathSensitiveBugReport &R,
-                                                 const CXXConstructorCall &Call,
-                                                 const ExplodedNode *N) final;
-
-  PathDiagnosticPieceRef
-  maybeEmitNoteForParameters(PathSensitiveBugReport &R, const CallEvent &Call,
-                             const ExplodedNode *N) final;
-
-  /// Consume the information on the no-store stack frame in order to
-  /// either emit a note or suppress the report enirely.
-  /// \return Diagnostics piece for region not modified in the current function,
-  /// if it decides to emit one.
-  PathDiagnosticPieceRef
-  maybeEmitNote(PathSensitiveBugReport &R, const CallEvent &Call,
-                const ExplodedNode *N, const RegionVector &FieldChain,
-                const MemRegion *MatchedRegion, StringRef FirstElement,
-                bool FirstIsReferenceType, unsigned IndirectionLevel);
-
-  bool prettyPrintRegionName(const RegionVector &FieldChain,
-                             const MemRegion *MatchedRegion,
-                             StringRef FirstElement, bool FirstIsReferenceType,
-                             unsigned IndirectionLevel,
-                             llvm::raw_svector_ostream &os);
-
-  StringRef prettyPrintFirstElement(StringRef FirstElement,
-                                    bool MoreItemsExpected,
-                                    int IndirectionLevel,
-                                    llvm::raw_svector_ostream &os);
-};
-} // namespace
-
 /// \return Whether the method declaration \p Parent
 /// syntactically has a binary operation writing into the ivar \p Ivar.
 static bool potentiallyWritesIntoIvar(const Decl *Parent,

diff  --git a/clang/lib/StaticAnalyzer/Core/SVals.cpp b/clang/lib/StaticAnalyzer/Core/SVals.cpp
index 291e4fa752a8f7..84e7e033404c03 100644
--- a/clang/lib/StaticAnalyzer/Core/SVals.cpp
+++ b/clang/lib/StaticAnalyzer/Core/SVals.cpp
@@ -263,6 +263,23 @@ bool SVal::isZeroConstant() const {
 // Pretty-Printing.
 //===----------------------------------------------------------------------===//
 
+StringRef SVal::getKindStr() const {
+  switch (getKind()) {
+#define BASIC_SVAL(Id, Parent)                                                 \
+  case Id##Kind:                                                               \
+    return #Id;
+#define LOC_SVAL(Id, Parent)                                                   \
+  case Loc##Id##Kind:                                                          \
+    return #Id;
+#define NONLOC_SVAL(Id, Parent)                                                \
+  case NonLoc##Id##Kind:                                                       \
+    return #Id;
+#include "clang/StaticAnalyzer/Core/PathSensitive/SVals.def"
+#undef REGION
+  }
+  llvm_unreachable("Unkown kind!");
+}
+
 LLVM_DUMP_METHOD void SVal::dump() const { dumpToStream(llvm::errs()); }
 
 void SVal::printJson(raw_ostream &Out, bool AddQuotes) const {

diff  --git a/clang/test/Analysis/cstring-uninitread-notes.c b/clang/test/Analysis/cstring-uninitread-notes.c
new file mode 100644
index 00000000000000..b62519a85c8cc9
--- /dev/null
+++ b/clang/test/Analysis/cstring-uninitread-notes.c
@@ -0,0 +1,25 @@
+// RUN: %clang_analyze_cc1 -verify %s \
+// RUN:   -analyzer-checker=core,alpha.unix.cstring \
+// RUN:   -analyzer-output=text
+
+#include "Inputs/system-header-simulator.h"
+
+// Inspired by a report on ffmpeg, libavcodec/tiertexseqv.c, seq_decode_op1().
+int coin();
+
+void maybeWrite(const char *src, unsigned size, int *dst) {
+  if (coin()) // expected-note{{Assuming the condition is false}}
+              // expected-note at -1{{Taking false branch}}
+    memcpy(dst, src, size);
+} // expected-note{{Returning without writing to '*dst'}}
+
+void returning_without_writing_to_memcpy(const char *src, unsigned size) {
+  int block[8 * 8]; // expected-note{{'block' initialized here}}
+                                // expected-note at +1{{Calling 'maybeWrite'}}
+  maybeWrite(src, size, block); // expected-note{{Returning from 'maybeWrite'}}
+
+  int buf[8 * 8];
+  memcpy(buf, &block[0], 8); // expected-warning{{The first element of the 2nd argument is undefined [alpha.unix.cstring.UninitializedRead]}}
+                             // expected-note at -1{{The first element of the 2nd argument is undefined}}
+                             // expected-note at -2{{Other elements might also be undefined}}
+}


        


More information about the cfe-commits mailing list