[clang] [clang][analyzer] Print empty per-EP metrics as empty CSV cells, fix missing PathRunningTime metric (PR #162839)

via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 10 06:01:33 PDT 2025


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-clang

Author: Arseniy Zaostrovnykh (necto)

<details>
<summary>Changes</summary>

To avoid information loss, introduce a difference between unset stats and 0 for statistics that are supposed to be set once per entry point. Now, if the statistic is not set for an entry point, the corresponding CSV cell will be empty, and not 0.

Thanks to this differentiation, I noticed that `PathRunningTime` was actually never set, and fixed that.
Additionally, this patch enables the timers if `DumpEntryPointStatsToCSV` is set, because in most cases you dump these stats to get a detailed view on analyzer performance.

Finally, I added a dedicated debug checker that demonstrates the use of a statistic and tested the set and unset scenarios explicitly.

--

CPP-7097

---
Full diff: https://github.com/llvm/llvm-project/pull/162839.diff


8 Files Affected:

- (modified) clang/docs/analyzer/developer-docs/Statistics.rst (+1-1) 
- (modified) clang/include/clang/StaticAnalyzer/Checkers/Checkers.td (+4) 
- (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h (+1-1) 
- (modified) clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt (+1) 
- (added) clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp (+59) 
- (modified) clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp (+49-20) 
- (modified) clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp (+4-1) 
- (modified) clang/test/Analysis/analyzer-stats/entry-point-stats.cpp (+33-16) 


``````````diff
diff --git a/clang/docs/analyzer/developer-docs/Statistics.rst b/clang/docs/analyzer/developer-docs/Statistics.rst
index 4f2484a89a6af..355759d468282 100644
--- a/clang/docs/analyzer/developer-docs/Statistics.rst
+++ b/clang/docs/analyzer/developer-docs/Statistics.rst
@@ -22,7 +22,7 @@ However, note that with ``LLVM_ENABLE_STATS`` disabled, only storage of the valu
 If you want to define a statistic only for entry point, EntryPointStats.h has four classes at your disposal:
 
 
-- ``UnsignedEPStat`` - an unsigned value assigned at most once per entry point. For example: "the number of source characters in an entry-point body".
+- ``UnsignedEPStat`` - an unsigned value assigned at most once per entry point. For example: "the number of source characters in an entry-point body". If no value is assigned during analysis of an entry point, the corresponding CSV cell will be empty.
 - ``CounterEPStat`` - an additive statistic. It starts with 0 and you can add to it as many times as needed. For example: "the number of bugs discovered".
 - ``UnsignedMaxEPStat`` - a maximizing statistic. It starts with 0 and when you join it with a value, it picks the maximum of the previous value and the new one. For example, "the longest execution path of a bug".
 
diff --git a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
index 4473c54d8d6e3..0db6794b2310a 100644
--- a/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
+++ b/clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
@@ -1502,6 +1502,10 @@ def TaintTesterChecker : Checker<"TaintTest">,
   HelpText<"Mark tainted symbols as such.">,
   Documentation<NotDocumented>;
 
+def UnsignedStatTesterChecker : Checker<"UnsignedStatTester">,
+  HelpText<"Test checker for demonstrating UnsignedEPStat usage.">,
+  Documentation<NotDocumented>;
+
 // This checker *technically* depends on SteamChecker, but we don't allow
 // dependency checkers to emit diagnostics, and a debug checker isn't worth
 // the chore needed to create a modeling portion on its own. Since this checker
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
index 389f17d36e65a..0a45deb33fcc9 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h
@@ -85,7 +85,7 @@ class UnsignedEPStat : public EntryPointStat {
 
 public:
   explicit UnsignedEPStat(llvm::StringLiteral Name);
-  unsigned value() const { return Value.value_or(0); }
+  std::optional<unsigned> value() const { return Value; }
   void reset() { Value.reset(); }
   void set(unsigned V) {
     assert(!Value.has_value());
diff --git a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
index 29d2c4512d470..f70ae1a3e8a96 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
+++ b/clang/lib/StaticAnalyzer/Checkers/CMakeLists.txt
@@ -125,6 +125,7 @@ add_clang_library(clangStaticAnalyzerCheckers
   UninitializedObject/UninitializedPointee.cpp
   UnixAPIChecker.cpp
   UnreachableCodeChecker.cpp
+  UnsignedStatTesterChecker.cpp
   VforkChecker.cpp
   VLASizeChecker.cpp
   VAListChecker.cpp
diff --git a/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp
new file mode 100644
index 0000000000000..a302f506a285a
--- /dev/null
+++ b/clang/lib/StaticAnalyzer/Checkers/UnsignedStatTesterChecker.cpp
@@ -0,0 +1,59 @@
+//== UnsignedStatTesterChecker.cpp --------------------------- -*- C++ -*--=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This checker demonstrates the use of UnsignedEPStat for per-entry-point
+// statistics. It conditionally sets a statistic based on the entry point name.
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/StaticAnalyzer/Checkers/BuiltinCheckerRegistration.h"
+#include "clang/StaticAnalyzer/Core/Checker.h"
+#include "clang/StaticAnalyzer/Core/CheckerManager.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h"
+#include "clang/StaticAnalyzer/Core/PathSensitive/EntryPointStats.h"
+
+using namespace clang;
+using namespace ento;
+
+#define DEBUG_TYPE "UnsignedStatTester"
+
+static UnsignedEPStat DemoStat("DemoStat");
+
+namespace {
+class UnsignedStatTesterChecker : public Checker<check::BeginFunction> {
+public:
+  void checkBeginFunction(CheckerContext &C) const;
+};
+} // namespace
+
+void UnsignedStatTesterChecker::checkBeginFunction(CheckerContext &C) const {
+  const Decl *D = C.getLocationContext()->getDecl();
+  if (!D)
+    return;
+
+  std::string Name =
+      D->getAsFunction() ? D->getAsFunction()->getNameAsString() : "";
+
+  // Conditionally set the statistic based on the function name
+  if (Name == "func_one") {
+    DemoStat.set(1);
+  } else if (Name == "func_two") {
+    DemoStat.set(2);
+  } else if (Name == "func_three") {
+    DemoStat.set(3);
+  }
+  // For any other function (e.g., "func_none"), don't set the statistic
+}
+
+void ento::registerUnsignedStatTesterChecker(CheckerManager &mgr) {
+  mgr.registerChecker<UnsignedStatTesterChecker>();
+}
+
+bool ento::shouldRegisterUnsignedStatTesterChecker(const CheckerManager &mgr) {
+  return true;
+}
diff --git a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
index abfb176d6384d..33f5ccf6002ba 100644
--- a/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
+++ b/clang/lib/StaticAnalyzer/Core/EntryPointStats.cpp
@@ -24,15 +24,21 @@ using namespace ento;
 
 namespace {
 struct Registry {
+  std::vector<UnsignedEPStat *> ExplicitlySetStats;
+  std::vector<UnsignedMaxEPStat *> MaxStats;
   std::vector<CounterEPStat *> CounterStats;
-  std::vector<UnsignedMaxEPStat *> UnsignedMaxStats;
-  std::vector<UnsignedEPStat *> UnsignedStats;
 
   bool IsLocked = false;
 
   struct Snapshot {
     const Decl *EntryPoint;
-    std::vector<unsigned> UnsignedStatValues;
+    // Explicitly set statistics may not have a value set, so they are separate
+    // from other unsigned statistics
+    std::vector<std::optional<unsigned>> ExplicitlySetStatValues;
+    // These are counting and maximizing statistics that initialize to 0, which
+    // is meaningful even if they are never updated, so their value is always
+    // present.
+    std::vector<unsigned> MaxOrCountStatValues;
 
     void dumpAsCSV(llvm::raw_ostream &OS) const;
   };
@@ -46,9 +52,12 @@ static llvm::ManagedStatic<Registry> StatsRegistry;
 
 namespace {
 template <typename Callback> void enumerateStatVectors(const Callback &Fn) {
+  // This order is important, it matches the order of the Snapshot fields:
+  // - ExplicitlySetStatValues
+  Fn(StatsRegistry->ExplicitlySetStats);
+  // - MaxOrCountStatValues
+  Fn(StatsRegistry->MaxStats);
   Fn(StatsRegistry->CounterStats);
-  Fn(StatsRegistry->UnsignedMaxStats);
-  Fn(StatsRegistry->UnsignedStats);
 }
 } // namespace
 
@@ -101,30 +110,37 @@ UnsignedMaxEPStat::UnsignedMaxEPStat(llvm::StringLiteral Name)
     : EntryPointStat(Name) {
   assert(!StatsRegistry->IsLocked);
   assert(!isRegistered(Name));
-  StatsRegistry->UnsignedMaxStats.push_back(this);
+  StatsRegistry->MaxStats.push_back(this);
 }
 
 UnsignedEPStat::UnsignedEPStat(llvm::StringLiteral Name)
     : EntryPointStat(Name) {
   assert(!StatsRegistry->IsLocked);
   assert(!isRegistered(Name));
-  StatsRegistry->UnsignedStats.push_back(this);
+  StatsRegistry->ExplicitlySetStats.push_back(this);
 }
 
-static std::vector<unsigned> consumeUnsignedStats() {
-  std::vector<unsigned> Result;
-  Result.reserve(StatsRegistry->CounterStats.size() +
-                 StatsRegistry->UnsignedMaxStats.size() +
-                 StatsRegistry->UnsignedStats.size());
-  for (auto *M : StatsRegistry->CounterStats) {
+static std::vector<std::optional<unsigned>>
+consumeExplicitlySetStats() {
+  std::vector<std::optional<unsigned>> Result;
+  Result.reserve(StatsRegistry->ExplicitlySetStats.size());
+  for (auto *M : StatsRegistry->ExplicitlySetStats) {
     Result.push_back(M->value());
     M->reset();
   }
-  for (auto *M : StatsRegistry->UnsignedMaxStats) {
+  return Result;
+}
+
+static std::vector<unsigned> consumeMaxAndCounterStats() {
+  std::vector<unsigned> Result;
+  Result.reserve(StatsRegistry->CounterStats.size() +
+                 StatsRegistry->MaxStats.size());
+  // Order is important, it must match the order in enumerateStatVectors
+  for (auto *M : StatsRegistry->MaxStats) {
     Result.push_back(M->value());
     M->reset();
   }
-  for (auto *M : StatsRegistry->UnsignedStats) {
+  for (auto *M : StatsRegistry->CounterStats) {
     Result.push_back(M->value());
     M->reset();
   }
@@ -150,20 +166,33 @@ static std::string getUSR(const Decl *D) {
 }
 
 void Registry::Snapshot::dumpAsCSV(llvm::raw_ostream &OS) const {
+  auto printAsUnsignOpt = [&OS](std::optional<unsigned> U) {
+    OS << (U.has_value() ? std::to_string(*U) : "");
+  };
+  auto commaIfNeeded = [&OS](const auto &Vec1, const auto &Vec2) {
+    if (!Vec1.empty() && !Vec2.empty())
+      OS << ",";
+  };
+  auto printAsUnsigned = [&OS](unsigned U) { OS << U; };
+
   OS << '"';
   llvm::printEscapedString(getUSR(EntryPoint), OS);
   OS << "\",\"";
   OS << StatsRegistry->EscapedCPPFileName << "\",\"";
   llvm::printEscapedString(
       clang::AnalysisDeclContext::getFunctionName(EntryPoint), OS);
-  OS << "\"";
-  OS << (UnsignedStatValues.empty() ? "" : ",");
-  llvm::interleave(UnsignedStatValues, OS, [&OS](unsigned U) { OS << U; }, ",");
+  OS << "\",";
+  llvm::interleave(ExplicitlySetStatValues, OS, printAsUnsignOpt, ",");
+  commaIfNeeded(ExplicitlySetStatValues, MaxOrCountStatValues);
+  llvm::interleave(MaxOrCountStatValues, OS, printAsUnsigned, ",");
 }
 
 void EntryPointStat::takeSnapshot(const Decl *EntryPoint) {
-  auto UnsignedValues = consumeUnsignedStats();
-  StatsRegistry->Snapshots.push_back({EntryPoint, std::move(UnsignedValues)});
+  auto ExplicitlySetValues = consumeExplicitlySetStats();
+  auto MaxOrCounterValues = consumeMaxAndCounterStats();
+  StatsRegistry->Snapshots.push_back({EntryPoint,
+                                      std::move(ExplicitlySetValues),
+                                      std::move(MaxOrCounterValues)});
 }
 
 void EntryPointStat::dumpStatsAsCSV(llvm::StringRef FileName) {
diff --git a/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp b/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
index cf01e2f37c662..e97004f1bbe52 100644
--- a/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
+++ b/clang/lib/StaticAnalyzer/Frontend/AnalysisConsumer.cpp
@@ -39,6 +39,7 @@
 #include "llvm/Support/TimeProfiler.h"
 #include "llvm/Support/Timer.h"
 #include "llvm/Support/raw_ostream.h"
+#include <cmath>
 #include <memory>
 #include <utility>
 
@@ -142,7 +143,7 @@ class AnalysisConsumer : public AnalysisASTConsumer,
     DigestAnalyzerOptions();
 
     if (Opts.AnalyzerDisplayProgress || Opts.PrintStats ||
-        Opts.ShouldSerializeStats) {
+        Opts.ShouldSerializeStats || !Opts.DumpEntryPointStatsToCSV.empty()) {
       AnalyzerTimers = std::make_unique<llvm::TimerGroup>(
           "analyzer", "Analyzer timers");
       SyntaxCheckTimer = std::make_unique<llvm::Timer>(
@@ -788,6 +789,8 @@ void AnalysisConsumer::RunPathSensitiveChecks(Decl *D,
     ExprEngineTimer->stopTimer();
     llvm::TimeRecord ExprEngineEndTime = ExprEngineTimer->getTotalTime();
     ExprEngineEndTime -= ExprEngineStartTime;
+    PathRunningTime.set(static_cast<unsigned>(
+        std::lround(ExprEngineEndTime.getWallTime() * 1000)));
     DisplayTime(ExprEngineEndTime);
   }
 
diff --git a/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp b/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
index 9cbe04550a8d3..395d2b2fefa65 100644
--- a/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
+++ b/clang/test/Analysis/analyzer-stats/entry-point-stats.cpp
@@ -1,13 +1,21 @@
 // REQUIRES: asserts
-// RUN: %clang_analyze_cc1 -analyzer-checker=core \
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,debug.UnsignedStatTester \
 // RUN:    -analyzer-config dump-entry-point-stats-to-csv="%t.csv" \
 // RUN:    -verify %s
 // RUN: %csv2json "%t.csv" | FileCheck --check-prefix=CHECK %s
 //
 // CHECK:      {
 // CHECK-NEXT:   "c:@F at fib#i#": {
-// CHECK-NEXT:     "File": "{{.*}}entry-point-stats.cpp",
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
 // CHECK-NEXT:     "DebugName": "fib(unsigned int)",
+// CHECK-NEXT:     "DemoStat": "",
+// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
+// CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocks": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocksUnreachable": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumCTUSteps": "{{[0-9]+}}",
@@ -33,18 +41,31 @@
 // CHECK-NEXT:     "NumTimesZ3SpendsTooMuchTimeOnASingleEQClass": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumTimesZ3TimedOut": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumZ3QueriesDone": "{{[0-9]+}}",
-// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}",
+// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}"
+// CHECK-NEXT:   },
+// CHECK-NEXT:   "c:@F at func_one#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "func_one()",
+// CHECK:          "DemoStat": "1",
+//             .... not interesting statistics
+// CHECK:        },
+// CHECK-NEXT:   "c:@F at func_two#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "func_two()",
+// CHECK:          "DemoStat": "2",
+//             .... not interesting statistics
+// CHECK:        },
+// CHECK-NEXT:   "c:@F at main#I#**C#": {
+// CHECK-NEXT:     "File": "{{.*}}/entry-point-stats.cpp",
+// CHECK-NEXT:     "DebugName": "main(int, char **)",
+// CHECK-NEXT:     "DemoStat": "",
+// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
 // CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}"
-// CHECK-NEXT:   },
-// CHECK-NEXT:   "c:@F at main#I#**C#": {
-// CHECK-NEXT:     "File": "{{.*}}entry-point-stats.cpp",
-// CHECK-NEXT:     "DebugName": "main(int, char **)",
 // CHECK-NEXT:     "NumBlocks": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumBlocksUnreachable": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumCTUSteps": "{{[0-9]+}}",
@@ -70,14 +91,7 @@
 // CHECK-NEXT:     "NumTimesZ3SpendsTooMuchTimeOnASingleEQClass": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumTimesZ3TimedOut": "{{[0-9]+}}",
 // CHECK-NEXT:     "NumZ3QueriesDone": "{{[0-9]+}}",
-// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxCFGSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxQueueSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxReachableSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxTimeSpentSolvingZ3Queries": "{{[0-9]+}}",
-// CHECK-NEXT:     "MaxValidBugClassSize": "{{[0-9]+}}",
-// CHECK-NEXT:     "PathRunningTime": "{{[0-9]+}}"
+// CHECK-NEXT:     "TimeSpentSolvingZ3Queries": "{{[0-9]+}}"
 // CHECK-NEXT:   }
 // CHECK-NEXT: }
 // CHECK-NOT: non_entry_point
@@ -102,3 +116,6 @@ int main(int argc, char **argv) {
   int i = non_entry_point(argc);
   return i;
 }
+
+void func_one() {}
+void func_two() {}

``````````

</details>


https://github.com/llvm/llvm-project/pull/162839


More information about the cfe-commits mailing list