[clang] [analyzer] Add an ownership change visitor to StreamChecker (PR #94957)

Tue Jun 18 01:22:57 PDT 2024

================
@@ -696,6 +730,69 @@ struct StreamOperationEvaluator {
 
 } // end anonymous namespace
 
+//===----------------------------------------------------------------------===//
+// Definition of NoStreamStateChangeVisitor.
+//===----------------------------------------------------------------------===//
+
+namespace {
+class NoStreamStateChangeVisitor final : public NoOwnershipChangeVisitor {
+protected:
+  /// Syntactically checks whether the callee is a freeing function. Since
+  /// we have no path-sensitive information on this call (we would need a
+  /// CallEvent instead of a CallExpr for that), its possible that a
+  /// freeing function was called indirectly through a function pointer,
+  /// but we are not able to tell, so this is a best effort analysis.
+  bool isFreeingCallAsWritten(const CallExpr &Call) const {
+    const auto *StreamChk = static_cast<const StreamChecker *>(&Checker);
+    if (StreamChk->FCloseDesc.matchesAsWritten(Call))
+      return true;
+
+    return false;
+  }
+
+  bool doesFnIntendToHandleOwnership(const Decl *Callee,
+                                     ASTContext &ACtx) override {
+    using namespace clang::ast_matchers;
+    const FunctionDecl *FD = dyn_cast<FunctionDecl>(Callee);
+
+    auto Matches =
+        match(findAll(callExpr().bind("call")), *FD->getBody(), ACtx);
+    for (BoundNodes Match : Matches) {
+      if (const auto *Call = Match.getNodeAs<CallExpr>("call"))
+        if (isFreeingCallAsWritten(*Call))
+          return true;
+    }
----------------
Szelethus wrote:

Showing that a function intended to close a stream already relies on heuristics, and it's easy to construct dumb counterexamples for it:

```c++
void absolutelyDoesntCloseItsParam(FILE *p) {
  if (coin()) {
    FILE *q = fopen(...);
    fclose(q); // oh, we saw an fclose, this must've been intended for the param...
  }
}
```
We can surely make this smarter, and maybe we should, but after a finite number of steps we get to the point where we start rewriting the analyzer.
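
To make the false positive concrete, here is a toy model of what a purely syntactic check sees. This is only a sketch, not the actual Clang code: `ToyFunction` and `looksLikeItHandlesOwnership` are made-up names, and a function body is reduced to the list of callee names written in it.

```c++
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a function definition: only the callee names
// that appear in the body as written are kept, arguments are dropped.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> CalleesAsWritten;
};

// Assumed simplification of the heuristic: any call spelled "fclose"
// counts as intent to handle ownership, no matter which stream it closes.
bool looksLikeItHandlesOwnership(const ToyFunction &F) {
  return std::find(F.CalleesAsWritten.begin(), F.CalleesAsWritten.end(),
                   "fclose") != F.CalleesAsWritten.end();
}
```

Under this model, a function with callees `coin`, `fopen`, `fclose` (like the counterexample above) is classified as intending to close a stream, even though the `fclose` never touches the parameter. Telling the two apart needs information about which stream is being closed, which a syntactic match does not have.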

Regarding the CallGraph: if we start to wander off the beaten path (meaning the path of execution the analyzer was on), especially if we inspect non-trivial functions in the graph, that will lead to a weaker heuristic in my view.

https://github.com/llvm/llvm-project/pull/94957