[clang] [clang][dataflow] Add matchers for smart pointer accessors to be cached (PR #120102)

Jan Voung via cfe-commits cfe-commits at lists.llvm.org
Mon Dec 16 08:14:19 PST 2024


https://github.com/jvoung created https://github.com/llvm/llvm-project/pull/120102

This is part 1 of caching for smart pointer accessors, building on top
of the CachedConstAccessorsLattice, which caches "normal" accessors.

Smart pointer accessors are a bit different in that they may:
- have aliases to access the same underlying data (but potentially
  returning slightly different types like `&` vs `*`). Within a
  "checked" sequence users may mix uses of the different aliases and the
  check should apply to any of the spellings.
- may have non-const overloads in addition to the const version, where
  the non-const doesn't actually modify the container

Part 2 will follow and add transfer functions utilities. It will also
add a user UncheckedOptionalAccessModel. We'd seen false positives when
nesting StatusOr<optional<T>> and optional<StatusOr<T>>, etc. which this
can help address.


>From f9b8cbaed4c01c2051cdcde105d6a9bca1684388 Mon Sep 17 00:00:00 2001
From: Jan Voung <jvoung at gmail.com>
Date: Mon, 16 Dec 2024 16:06:43 +0000
Subject: [PATCH] [clang][dataflow] Add matchers for smart pointer accessors to
 be cached

This is part 1 of caching for smart pointer accessors, building on top
of the CachedConstAccessorsLattice, which caches "normal" accessors.

Smart pointer accessors are a bit different in that they may:
- have aliases to access the same underlying data (but potentially
  returning slightly different types like `&` vs `*`). Within a
  "checked" sequence users may mix uses of the different aliases and the
  check should apply to any of the spellings.
- may have non-const overloads in addition to the const version, where
  the non-const doesn't actually modify the container

Part 2 will follow and add transfer functions utilities. It will also
add a user UncheckedOptionalAccessModel. We'd seen false positives when
nesting StatusOr<optional<T>> and optional<StatusOr<T>>, etc. which this
can help address.
---
 .../SmartPointerAccessorCaching.h             |  63 ++++++
 .../lib/Analysis/FlowSensitive/CMakeLists.txt |   1 +
 .../SmartPointerAccessorCaching.cpp           | 134 ++++++++++++
 .../Analysis/FlowSensitive/CMakeLists.txt     |   1 +
 .../SmartPointerAccessorCachingTest.cpp       | 194 ++++++++++++++++++
 5 files changed, 393 insertions(+)
 create mode 100644 clang/include/clang/Analysis/FlowSensitive/SmartPointerAccessorCaching.h
 create mode 100644 clang/lib/Analysis/FlowSensitive/SmartPointerAccessorCaching.cpp
 create mode 100644 clang/unittests/Analysis/FlowSensitive/SmartPointerAccessorCachingTest.cpp

diff --git a/clang/include/clang/Analysis/FlowSensitive/SmartPointerAccessorCaching.h b/clang/include/clang/Analysis/FlowSensitive/SmartPointerAccessorCaching.h
new file mode 100644
index 00000000000000..1adb63af4e724b
--- /dev/null
+++ b/clang/include/clang/Analysis/FlowSensitive/SmartPointerAccessorCaching.h
@@ -0,0 +1,63 @@
+//===-- SmartPointerAccessorCaching.h ---------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines utilities to help cache accessors for smart pointer
+// like objects.
+//
+// These should be combined with CachedConstAccessorsLattice.
+// Beyond basic const accessors, smart pointers may have the following two
+// additional issues:
+//
+// 1) There may be multiple accessors for the same underlying object, e.g.
+//    `operator->`, `operator*`, and `get`. Users may use a mixture of these
+//    accessors, so the cache should unify them.
+//
+// 2) There may be non-const overloads of accessors. They are still safe to
+//    cache, as they don't modify the container object.
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SMARTPOINTERACCESSORCACHING_H
+#define LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SMARTPOINTERACCESSORCACHING_H
+
+#include <cassert>
+
+#include "clang/AST/Decl.h"
+#include "clang/AST/Stmt.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+
+namespace clang::dataflow {
+
+/// Matchers:
+/// For now, these match on any class with an `operator*` or `operator->`
+/// where the return types have a similar shape as std::unique_ptr
+/// and std::optional.
+///
+/// - `*` returns a reference to a type `T`
+/// - `->` returns a pointer to `T`
+/// - `get` returns a pointer to `T`
+/// - `value` returns a reference `T`
+///
+/// (1) The `T` should all match across the accessors (ignoring qualifiers).
+///
+/// (2) The specific accessor used in a call isn't required to be const,
+///     but the class must have a const overload of each accessor.
+///
+/// For now, we don't have customization to ignore certain classes.
+/// For example, if writing a ClangTidy check for `std::optional`, these
+/// would also match `std::optional`. In order to have special handling
+/// for `std::optional`, we assume the (Matcher, TransferFunction) case
+/// with custom handling is ordered early so that these generic cases
+/// do not trigger.
+ast_matchers::internal::Matcher<Stmt> isSmartPointerLikeOperatorStar();
+ast_matchers::internal::Matcher<Stmt> isSmartPointerLikeOperatorArrow();
+ast_matchers::internal::Matcher<Stmt> isSmartPointerLikeValueMethodCall();
+ast_matchers::internal::Matcher<Stmt> isSmartPointerLikeGetMethodCall();
+
+} // namespace clang::dataflow
+
+#endif // LLVM_CLANG_ANALYSIS_FLOWSENSITIVE_SMARTPOINTERACCESSORCACHING_H
diff --git a/clang/lib/Analysis/FlowSensitive/CMakeLists.txt b/clang/lib/Analysis/FlowSensitive/CMakeLists.txt
index 05cdaa7e27823d..0c30df8b4b194f 100644
--- a/clang/lib/Analysis/FlowSensitive/CMakeLists.txt
+++ b/clang/lib/Analysis/FlowSensitive/CMakeLists.txt
@@ -10,6 +10,7 @@ add_clang_library(clangAnalysisFlowSensitive
   Logger.cpp
   RecordOps.cpp
   SimplifyConstraints.cpp
+  SmartPointerAccessorCaching.cpp
   Transfer.cpp
   TypeErasedDataflowAnalysis.cpp
   Value.cpp
diff --git a/clang/lib/Analysis/FlowSensitive/SmartPointerAccessorCaching.cpp b/clang/lib/Analysis/FlowSensitive/SmartPointerAccessorCaching.cpp
new file mode 100644
index 00000000000000..967455322b57fe
--- /dev/null
+++ b/clang/lib/Analysis/FlowSensitive/SmartPointerAccessorCaching.cpp
@@ -0,0 +1,134 @@
+#include "clang/Analysis/FlowSensitive/SmartPointerAccessorCaching.h"
+
+#include "clang/AST/CanonicalType.h"
+#include "clang/AST/DeclCXX.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+#include "clang/Basic/OperatorKinds.h"
+
+namespace clang::dataflow {
+
+namespace {
+
+using ast_matchers::callee;
+using ast_matchers::cxxMemberCallExpr;
+using ast_matchers::cxxMethodDecl;
+using ast_matchers::cxxOperatorCallExpr;
+using ast_matchers::hasName;
+using ast_matchers::hasOverloadedOperatorName;
+using ast_matchers::ofClass;
+using ast_matchers::parameterCountIs;
+using ast_matchers::pointerType;
+using ast_matchers::referenceType;
+using ast_matchers::returns;
+
+bool hasSmartPointerClassShape(const CXXRecordDecl &RD, bool &HasGet,
+                               bool &HasValue) {
+  // We may want to cache this search, but in current profiles it hasn't shown
+  // up as a hot spot (possibly because there aren't many hits, relatively).
+  bool HasArrow = false;
+  bool HasStar = false;
+  CanQualType StarReturnType, ArrowReturnType, GetReturnType, ValueReturnType;
+  for (const auto *MD : RD.methods()) {
+    // We only consider methods that are const and have zero parameters.
+    // It may be that there is a non-const overload for the method, but
+    // there should at least be a const overload as well.
+    if (!MD->isConst() || MD->getNumParams() != 0) {
+      continue;
+    }
+    if (MD->getOverloadedOperator() == OO_Star &&
+        MD->getReturnType()->isReferenceType()) {
+      HasStar = true;
+      StarReturnType = MD->getReturnType()
+                           .getNonReferenceType()
+                           ->getCanonicalTypeUnqualified();
+    } else if (MD->getOverloadedOperator() == OO_Arrow &&
+               MD->getReturnType()->isPointerType()) {
+      HasArrow = true;
+      ArrowReturnType =
+          MD->getReturnType()->getPointeeType()->getCanonicalTypeUnqualified();
+    } else {
+      IdentifierInfo *II = MD->getIdentifier();
+      if (II == nullptr)
+        continue;
+      if (II->isStr("get") && MD->getReturnType()->isPointerType()) {
+        HasGet = true;
+        GetReturnType = MD->getReturnType()
+                            ->getPointeeType()
+                            ->getCanonicalTypeUnqualified();
+      } else if (II->isStr("value") && MD->getReturnType()->isReferenceType()) {
+        HasValue = true;
+        ValueReturnType = MD->getReturnType()
+                              .getNonReferenceType()
+                              ->getCanonicalTypeUnqualified();
+      }
+    }
+  }
+
+  if (!HasStar || !HasArrow || StarReturnType != ArrowReturnType)
+    return false;
+  HasGet = HasGet && (GetReturnType == StarReturnType);
+  HasValue = HasValue && (ValueReturnType == StarReturnType);
+  return true;
+}
+
+} // namespace
+} // namespace clang::dataflow
+
+// AST_MATCHER macros create an "internal" namespace, so we put it in
+// its own anonymous namespace instead of in clang::dataflow.
+namespace {
+
+AST_MATCHER(clang::CXXRecordDecl, smartPointerClassWithGet) {
+  bool HasGet = false;
+  bool HasValue = false;
+  bool HasStarAndArrow =
+      clang::dataflow::hasSmartPointerClassShape(Node, HasGet, HasValue);
+  return HasStarAndArrow && HasGet;
+}
+
+AST_MATCHER(clang::CXXRecordDecl, smartPointerClassWithValue) {
+  bool HasGet = false;
+  bool HasValue = false;
+  bool HasStarAndArrow =
+      clang::dataflow::hasSmartPointerClassShape(Node, HasGet, HasValue);
+  return HasStarAndArrow && HasValue;
+}
+
+AST_MATCHER(clang::CXXRecordDecl, smartPointerClassWithGetOrValue) {
+  bool HasGet = false;
+  bool HasValue = false;
+  bool HasStarAndArrow =
+      clang::dataflow::hasSmartPointerClassShape(Node, HasGet, HasValue);
+  return HasStarAndArrow && (HasGet || HasValue);
+}
+
+} // namespace
+
+namespace clang::dataflow {
+
+ast_matchers::internal::Matcher<Stmt> isSmartPointerLikeOperatorStar() {
+  return cxxOperatorCallExpr(
+      hasOverloadedOperatorName("*"),
+      callee(cxxMethodDecl(parameterCountIs(0), returns(referenceType()),
+                           ofClass(smartPointerClassWithGetOrValue()))));
+}
+
+ast_matchers::internal::Matcher<Stmt> isSmartPointerLikeOperatorArrow() {
+  return cxxOperatorCallExpr(
+      hasOverloadedOperatorName("->"),
+      callee(cxxMethodDecl(parameterCountIs(0), returns(pointerType()),
+                           ofClass(smartPointerClassWithGetOrValue()))));
+}
+ast_matchers::internal::Matcher<Stmt> isSmartPointerLikeValueMethodCall() {
+  return cxxMemberCallExpr(callee(
+      cxxMethodDecl(parameterCountIs(0), returns(referenceType()),
+                    hasName("value"), ofClass(smartPointerClassWithValue()))));
+}
+
+ast_matchers::internal::Matcher<Stmt> isSmartPointerLikeGetMethodCall() {
+  return cxxMemberCallExpr(callee(
+      cxxMethodDecl(parameterCountIs(0), returns(pointerType()), hasName("get"),
+                    ofClass(smartPointerClassWithGet()))));
+}
+
+} // namespace clang::dataflow
diff --git a/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt b/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt
index 4e1819bfa166a8..6c01ae8fc2e541 100644
--- a/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt
+++ b/clang/unittests/Analysis/FlowSensitive/CMakeLists.txt
@@ -21,6 +21,7 @@ add_clang_unittest(ClangAnalysisFlowSensitiveTests
   SignAnalysisTest.cpp
   SimplifyConstraintsTest.cpp
   SingleVarConstantPropagationTest.cpp
+  SmartPointerAccessorCachingTest.cpp
   TestingSupport.cpp
   TestingSupportTest.cpp
   TransferBranchTest.cpp
diff --git a/clang/unittests/Analysis/FlowSensitive/SmartPointerAccessorCachingTest.cpp b/clang/unittests/Analysis/FlowSensitive/SmartPointerAccessorCachingTest.cpp
new file mode 100644
index 00000000000000..a69d606cdb85ea
--- /dev/null
+++ b/clang/unittests/Analysis/FlowSensitive/SmartPointerAccessorCachingTest.cpp
@@ -0,0 +1,194 @@
+//===- unittests/Analysis/FlowSensitive/SmartPointerAccessorCachingTest.cpp ==//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/Analysis/FlowSensitive/SmartPointerAccessorCaching.h"
+
+#include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/Testing/TestAST.h"
+#include "llvm/ADT/StringRef.h"
+#include "gtest/gtest.h"
+
+namespace clang::dataflow {
+namespace {
+
+using clang::ast_matchers::match;
+
+template <typename MatcherT>
+bool matches(llvm::StringRef Decls, llvm::StringRef TestInput,
+             MatcherT Matcher) {
+  TestAST InputAST(Decls.str() + TestInput.str());
+  return !match(Matcher, InputAST.context()).empty();
+}
+
+TEST(SmartPointerAccessorCachingTest, MatchesClassWithStarArrowGet) {
+  llvm::StringRef Decls(R"cc(
+    namespace std {
+    template <class T>
+    struct unique_ptr {
+      T* operator->() const;
+      T& operator*() const;
+      T* get() const;
+    };
+    }  // namespace std
+
+    template <class T>
+    using UniquePtrAlias = std::unique_ptr<T>;
+
+    struct S { int i; };
+  )cc");
+
+  EXPECT_TRUE(matches(Decls,
+                      "int target(std::unique_ptr<S> P) { return (*P).i; }",
+                      isSmartPointerLikeOperatorStar()));
+  EXPECT_TRUE(matches(Decls,
+                      "int target(std::unique_ptr<S> P) { return P->i; }",
+                      isSmartPointerLikeOperatorArrow()));
+  EXPECT_TRUE(matches(Decls,
+                      "int target(std::unique_ptr<S> P) { return P.get()->i; }",
+                      isSmartPointerLikeGetMethodCall()));
+
+  EXPECT_TRUE(matches(Decls, "int target(UniquePtrAlias<S> P) { return P->i; }",
+                      isSmartPointerLikeOperatorArrow()));
+}
+
+TEST(SmartPointerAccessorCachingTest, NoMatchIfUnexpectedReturnTypes) {
+  llvm::StringRef Decls(R"cc(
+    namespace std {
+    // unique_ptr isn't really like this, but we aren't matching by name
+    template <class T, class U>
+    struct unique_ptr {
+      U* operator->() const;
+      T& operator*() const;
+      T* get() const;
+    };
+    }  // namespace std
+
+    struct S { int i; };
+    struct T { int j; };
+  )cc");
+
+  EXPECT_FALSE(matches(Decls,
+                       "int target(std::unique_ptr<S, T> P) { return (*P).i; }",
+                       isSmartPointerLikeOperatorStar()));
+  EXPECT_FALSE(matches(Decls,
+                       "int target(std::unique_ptr<S, T> P) { return P->j; }",
+                       isSmartPointerLikeOperatorArrow()));
+  // The class matching arguably accidentally matches, just because the
+  // instantiation is with S, S. Hopefully doesn't happen too much in real code
+  // with such operator* and operator-> overloads.
+  EXPECT_TRUE(matches(Decls,
+                      "int target(std::unique_ptr<S, S> P) { return P->i; }",
+                      isSmartPointerLikeOperatorArrow()));
+}
+
+TEST(SmartPointerAccessorCachingTest, NoMatchIfBinaryStar) {
+  llvm::StringRef Decls(R"cc(
+    namespace std {
+    template <class T>
+    struct unique_ptr {
+      T* operator->() const;
+      T& operator*(int x) const;
+      T* get() const;
+    };
+    }  // namespace std
+
+    struct S { int i; };
+  )cc");
+
+  EXPECT_FALSE(
+      matches(Decls, "int target(std::unique_ptr<S> P) { return (P * 10).i; }",
+              isSmartPointerLikeOperatorStar()));
+}
+
+TEST(SmartPointerAccessorCachingTest, NoMatchIfNoConstOverloads) {
+  llvm::StringRef Decls(R"cc(
+    namespace std {
+    template <class T>
+    struct unique_ptr {
+      T* operator->();
+      T& operator*();
+      T* get();
+    };
+    }  // namespace std
+
+    struct S { int i; };
+  )cc");
+
+  EXPECT_FALSE(matches(Decls,
+                       "int target(std::unique_ptr<S> P) { return (*P).i; }",
+                       isSmartPointerLikeOperatorStar()));
+  EXPECT_FALSE(matches(Decls,
+                       "int target(std::unique_ptr<S> P) { return P->i; }",
+                       isSmartPointerLikeOperatorArrow()));
+  EXPECT_FALSE(
+      matches(Decls, "int target(std::unique_ptr<S> P) { return P.get()->i; }",
+              isSmartPointerLikeGetMethodCall()));
+}
+
+TEST(SmartPointerAccessorCachingTest, NoMatchIfNoStarMethod) {
+  llvm::StringRef Decls(R"cc(
+    namespace std {
+    template <class T>
+    struct unique_ptr {
+      T* operator->();
+      T* get();
+    };
+    }  // namespace std
+
+    struct S { int i; };
+  )cc");
+
+  EXPECT_FALSE(matches(Decls,
+                       "int target(std::unique_ptr<S> P) { return P->i; }",
+                       isSmartPointerLikeOperatorArrow()));
+  EXPECT_FALSE(matches(Decls,
+                       "int target(std::unique_ptr<S> P) { return P->i; }",
+                       isSmartPointerLikeGetMethodCall()));
+}
+
+TEST(SmartPointerAccessorCachingTest, MatchesWithValueAndNonConstOverloads) {
+  llvm::StringRef Decls(R"cc(
+    namespace std {
+    template <class T>
+    struct optional {
+      const T* operator->() const;
+      T* operator->();
+      const T& operator*() const;
+      T& operator*();
+      const T& value() const;
+      T& value();
+    };
+    }  // namespace std
+
+    struct S { int i; };
+  )cc");
+
+  EXPECT_TRUE(matches(
+      Decls, "int target(std::optional<S> &NonConst) { return (*NonConst).i; }",
+      isSmartPointerLikeOperatorStar()));
+  EXPECT_TRUE(matches(
+      Decls, "int target(const std::optional<S> &Const) { return (*Const).i; }",
+      isSmartPointerLikeOperatorStar()));
+  EXPECT_TRUE(matches(
+      Decls, "int target(std::optional<S> &NonConst) { return NonConst->i; }",
+      isSmartPointerLikeOperatorArrow()));
+  EXPECT_TRUE(matches(
+      Decls, "int target(const std::optional<S> &Const) { return Const->i; }",
+      isSmartPointerLikeOperatorArrow()));
+  EXPECT_TRUE(matches(
+      Decls,
+      "int target(std::optional<S> &NonConst) { return NonConst.value().i; }",
+      isSmartPointerLikeValueMethodCall()));
+  EXPECT_TRUE(matches(
+      Decls,
+      "int target(const std::optional<S> &Const) { return Const.value().i; }",
+      isSmartPointerLikeValueMethodCall()));
+}
+
+} // namespace
+} // namespace clang::dataflow
\ No newline at end of file



More information about the cfe-commits mailing list