[libcxx-commits] [clang] [libcxx] Warn when unique objects might be duplicated in shared libraries (PR #117622)

Devon Loehr via libcxx-commits libcxx-commits at lists.llvm.org
Mon Dec 2 12:41:40 PST 2024


https://github.com/DKLoehr updated https://github.com/llvm/llvm-project/pull/117622

>From d944b2fde573a4fb352400ce3425121265b02685 Mon Sep 17 00:00:00 2001
From: Devon Loehr <dloehr at google.com>
Date: Thu, 21 Nov 2024 19:29:00 +0000
Subject: [PATCH 1/2] Warn when unique objects might be duplicated in shared
 libraries

When a hidden object is built into multiple shared libraries, each
library gets its own copy of it. If the object was supposed to be
globally unique (e.g. a global variable or static member), this can
cause very subtle bugs.

An object might be incorrectly duplicated if it:
- Is defined in a header (so it might appear in multiple TUs), and
- Has external linkage (otherwise it's supposed to be duplicated), and
- Has hidden visibility (or else the dynamic linker will handle it)

The duplication is only a problem if one of the following is true:
1. The object is mutable (the copies won't be in sync), or
2. Its initialization has side effects (it may now run more than once), or
3. The value of its address is used (which one?).

To detect this, we add a new -Wunique-object-duplication warning.
It warns on cases (1) and (2) above. To be conservative, we only
warn in case (2) if we are certain the initializer has side effects,
and we don't warn on `new` because the only side effect is some
extra memory usage.
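
As a rough illustration of cases (1) and (2), here is a sketch of a
hypothetical header, assumed to be built with -fvisibility=hidden into
more than one shared library. The function and variable names are made
up for this example and are not part of the patch; the comments describe
what the warning is expected to report based on the rules above.

```cpp
// Hypothetical header contents; every name here is illustrative only.

inline int next_ticket() {
  // Case (1): mutable. Each shared library would keep its own counter.
  static int counter = 0;
  return ++counter;
}

inline int first_ticket() {
  // Also case (1): another mutable static local.
  static int issued = 0;
  // Case (2): the initializer has a definite side effect (issued++), so
  // it would run once per library instead of once per program.
  static const int first = issued++;
  return first;
}

inline int table_size() {
  // Immutable and constant-initialized: duplication is harmless.
  static constexpr int size = 64;
  return size;
}

inline const int* scratch() {
  // Const pointer to const, initialized with `new`: not diagnosed, since a
  // duplicate only costs a little extra memory.
  static int const* const buffer = new int(7);
  return buffer;
}
```

Within a single binary these behave as usual; the concern only arises
once the same definitions are linked into several shared libraries.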

We don't currently warn on case (3) because doing so is prone to
false positives: there are many reasons for taking the address which
aren't inherently problematic (e.g. passing to a function that expects
a pointer). We only run into problems if the code inspects the value
of the address.
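
A minimal sketch of the distinction, using hypothetical names that do
not appear in the patch: taking the address in order to reach the object
is harmless, while comparing the address itself depends on which copy
each side happens to see.

```cpp
#include <cstddef>

struct Tag {};

inline const Tag& sentinel() {
  // Immutable and trivially initialized, so cases (1) and (2) do not apply.
  static const Tag tag{};
  return tag;
}

// Harmless: the pointer is only used to reach the object.
inline std::size_t tag_size(const Tag* t) { return sizeof(*t); }

// Fragile: the result depends on the value of the address. If two shared
// libraries each had their own copy of the sentinel, a reference obtained
// in one library would not compare equal in the other.
inline bool is_sentinel(const Tag& t) { return &t == &sentinel(); }
```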

The check is currently disabled on Windows, which uses its own analogue
of visibility (dllimport/dllexport). The check is also disabled inside
templates, since it can give false positives if a template is never
instantiated.
---
 clang/include/clang/Basic/DiagnosticGroups.td |   1 +
 .../clang/Basic/DiagnosticSemaKinds.td        |   9 +
 clang/include/clang/Sema/Sema.h               |   6 +
 clang/lib/Sema/SemaDecl.cpp                   | 101 ++++++++++
 .../SemaCXX/unique_object_duplication.cpp     |  16 ++
 .../test/SemaCXX/unique_object_duplication.h  | 187 ++++++++++++++++++
 6 files changed, 320 insertions(+)
 create mode 100644 clang/test/SemaCXX/unique_object_duplication.cpp
 create mode 100644 clang/test/SemaCXX/unique_object_duplication.h

diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index df9bf94b5d0398..c8c446a7b8cfa4 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -690,6 +690,7 @@ def SuspiciousMemaccess : DiagGroup<"suspicious-memaccess",
    NonTrivialMemaccess, MemsetTransposedArgs, SuspiciousBzero]>;
 def StaticInInline : DiagGroup<"static-in-inline">;
 def StaticLocalInInline : DiagGroup<"static-local-in-inline">;
+def UniqueObjectDuplication : DiagGroup<"unique-object-duplication">;
 def GNUStaticFloatInit : DiagGroup<"gnu-static-float-init">;
 def StaticFloatInit : DiagGroup<"static-float-init", [GNUStaticFloatInit]>;
 // Allow differentiation between GNU statement expressions in a macro versus
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 6ff24c2bc8faad..b8768c28a0a3e3 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -6104,6 +6104,15 @@ def warn_static_local_in_extern_inline : Warning<
 def note_convert_inline_to_static : Note<
   "use 'static' to give inline function %0 internal linkage">;
 
+def warn_possible_object_duplication_mutable : Warning<
+  "%0 is mutable, has hidden visibility, and external linkage; it may be "
+  "duplicated when built into a shared library">,
+  InGroup<UniqueObjectDuplication>;
+def warn_possible_object_duplication_init : Warning<
+  "%0 has hidden visibility, and external linkage; its initialization may run "
+  "more than once when built into a shared library">,
+  InGroup<UniqueObjectDuplication>;
+
 def ext_redefinition_of_typedef : ExtWarn<
   "redefinition of typedef %0 is a C11 feature">,
   InGroup<DiagGroup<"typedef-redefinition"> >;
diff --git a/clang/include/clang/Sema/Sema.h b/clang/include/clang/Sema/Sema.h
index 24abd5d95dd844..124979973ad15e 100644
--- a/clang/include/clang/Sema/Sema.h
+++ b/clang/include/clang/Sema/Sema.h
@@ -3643,6 +3643,12 @@ class Sema final : public SemaBase {
                              NonTrivialCUnionContext UseContext,
                              unsigned NonTrivialKind);
 
+  /// Certain globally-unique variables might be accidentally duplicated if
+  /// built into multiple shared libraries with hidden visibility. This can
+  /// cause problems if the variable is mutable, its initialization is
+  /// effectful, or its address is taken.
+  bool GloballyUniqueObjectMightBeAccidentallyDuplicated(const VarDecl *dcl);
+
   /// AddInitializerToDecl - Adds the initializer Init to the
   /// declaration dcl. If DirectInit is true, this is C++ direct
   /// initialization rather than copy initialization.
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 74b0e5ad23bd48..01f69936fa18e3 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -13373,6 +13373,62 @@ void Sema::checkNonTrivialCUnion(QualType QT, SourceLocation Loc,
         .visit(QT, nullptr, false);
 }
 
+bool Sema::GloballyUniqueObjectMightBeAccidentallyDuplicated(
+    const VarDecl *dcl) {
+  if (!dcl || !getLangOpts().CPlusPlus)
+    return false;
+
+  // If an object is defined in a source file, its definition can't get
+  // duplicated since it will never appear in more than one TU.
+  if (dcl->getASTContext().getSourceManager().isInMainFile(dcl->getLocation()))
+    return false;
+
+  // We only need to warn if the definition is in a header file, so wait to
+  // diagnose until we've seen the definition.
+  if (!dcl->isThisDeclarationADefinition())
+    return false;
+
+  // If the variable we're looking at is a static local, then we actually care
+  // about the properties of the function containing it.
+  const ValueDecl *target = dcl;
+  // VarDecls and FunctionDecls have different functions for checking
+  // inline-ness, so we have to do it manually.
+  bool target_is_inline = dcl->isInline();
+
+  // Update the target and target_is_inline property if necessary
+  if (dcl->isStaticLocal()) {
+    const DeclContext *ctx = dcl->getDeclContext();
+    if (!ctx)
+      return false;
+
+    const FunctionDecl *f_dcl =
+        dyn_cast_if_present<FunctionDecl>(ctx->getNonClosureAncestor());
+    if (!f_dcl)
+      return false;
+
+    target = f_dcl;
+    // isInlined() checks for the C++ inline property
+    target_is_inline = f_dcl->isInlined();
+  }
+
+  // Non-inline variables can only legally appear in one TU
+  // FIXME: This also applies to templated variables, but that can rarely lead
+  // to false positives so templates are disabled for now.
+  if (!target_is_inline)
+    return false;
+
+  // If the object isn't hidden, the dynamic linker will prevent duplication.
+  clang::LinkageInfo lnk = target->getLinkageAndVisibility();
+  if (lnk.getVisibility() != HiddenVisibility)
+    return false;
+
+  // If the obj doesn't have external linkage, it's supposed to be duplicated.
+  if (!isExternalFormalLinkage(lnk.getLinkage()))
+    return false;
+
+  return true;
+}
+
 void Sema::AddInitializerToDecl(Decl *RealDecl, Expr *Init, bool DirectInit) {
   // If there is no declaration, there was an error parsing it.  Just ignore
   // the initializer.
@@ -14777,6 +14833,51 @@ void Sema::FinalizeDeclaration(Decl *ThisDecl) {
   if (DC->getRedeclContext()->isFileContext() && VD->isExternallyVisible())
     AddPushedVisibilityAttribute(VD);
 
+  // If this object has external linkage and hidden visibility, it might be
+  // duplicated when built into a shared library, which causes problems if it's
+  // mutable (since the copies won't be in sync) or its initialization has side
+  // effects (since it will run once per copy instead of once globally)
+  // FIXME: Windows uses dllexport/dllimport instead of visibility, and we don't
+  // handle that yet. Disable the warning on Windows for now.
+  // FIXME: Checking templates can cause false positives if the template in
+  // question is never instantiated (e.g. only specialized templates are used).
+  if (!Context.getTargetInfo().shouldDLLImportComdatSymbols() &&
+      !VD->isTemplated() &&
+      GloballyUniqueObjectMightBeAccidentallyDuplicated(VD)) {
+    // Check mutability. For pointers, ensure that both the pointer and the
+    // pointee are (recursively) const.
+    QualType Type = VD->getType().getNonReferenceType();
+    if (!Type.isConstant(VD->getASTContext())) {
+      Diag(VD->getLocation(), diag::warn_possible_object_duplication_mutable)
+          << VD;
+    } else {
+      while (Type->isPointerType()) {
+        Type = Type->getPointeeType();
+        if (Type->isFunctionType())
+          break;
+        if (!Type.isConstant(VD->getASTContext())) {
+          Diag(VD->getLocation(),
+               diag::warn_possible_object_duplication_mutable)
+              << VD;
+          break;
+        }
+      }
+    }
+
+    // To keep false positives low, only warn if we're certain that the
+    // initializer has side effects. Don't warn on operator new, since a mutable
+    // pointer will trigger the previous warning, and an immutable pointer
+    // getting duplicated just results in a little extra memory usage.
+    const Expr *Init = VD->getAnyInitializer();
+    if (Init &&
+        Init->HasSideEffects(VD->getASTContext(),
+                             /*IncludePossibleEffects=*/false) &&
+        !isa<CXXNewExpr>(Init->IgnoreParenImpCasts())) {
+      Diag(Init->getExprLoc(), diag::warn_possible_object_duplication_init)
+          << VD;
+    }
+  }
+
   // FIXME: Warn on unused var template partial specializations.
   if (VD->isFileVarDecl() && !isa<VarTemplatePartialSpecializationDecl>(VD))
     MarkUnusedFileScopedDecl(VD);
diff --git a/clang/test/SemaCXX/unique_object_duplication.cpp b/clang/test/SemaCXX/unique_object_duplication.cpp
new file mode 100644
index 00000000000000..d08d053c84ae3c
--- /dev/null
+++ b/clang/test/SemaCXX/unique_object_duplication.cpp
@@ -0,0 +1,16 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=hidden -Wunique-object-duplication -fvisibility=hidden -Wno-unused-value %s
+// RUN: %clang_cc1 -fsyntax-only -verify -Wunique-object-duplication -Wno-unused-value %s
+// The check is currently disabled on Windows. The test should fail because we're not getting the expected warnings.
+// XFAIL: target={{.*}}-windows{{.*}}
+
+#include "unique_object_duplication.h"
+
+// Everything below is defined in the cpp file,
+// so it won't get duplicated
+
+namespace GlobalTest {
+  float Test::allowedStaticMember1 = 2.3;
+}
+
+bool disallowed4 = true;
+constexpr inline bool disallowed5 = true;
\ No newline at end of file
diff --git a/clang/test/SemaCXX/unique_object_duplication.h b/clang/test/SemaCXX/unique_object_duplication.h
new file mode 100644
index 00000000000000..39afa67ca2ca6c
--- /dev/null
+++ b/clang/test/SemaCXX/unique_object_duplication.h
@@ -0,0 +1,187 @@
+/**
+ * When building shared libraries, hidden objects which are defined in header
+ * files will be duplicated, with one copy in each shared library. If the object
+ * was meant to be globally unique (one copy per program), this can cause very
+ * subtle bugs. This file contains tests for the -Wunique-object-duplication
+ * warning, which is meant to detect this.
+ * 
+ * Roughly, an object might be incorrectly duplicated if it:
+ * - Is defined in a header (so it might appear in multiple TUs), and
+ * - Has external linkage (otherwise it's supposed to be duplicated), and
+ * - Has hidden visibility (or else the dynamic linker will handle it)
+ * 
+ * Duplication becomes an issue only if one of the following is true:
+ * - The object is mutable (the copies won't be in sync), or
+ * - Its initialization has side effects (it may now run more than once), or
+ * - The value of its address is used.
+ * 
+ * Currently, we only detect the first two, and only warn on effectful
+ * initialization if we're certain there are side effects. Warning if the
+ * address is taken is prone to false positives, so we don't warn for now.
+ * 
+ * The check is also disabled on Windows for now, since it uses 
+ * dllimport/dllexport instead of visibility.
+ */
+
+#define HIDDEN __attribute__((visibility("hidden")))
+#define DEFAULT __attribute__((visibility("default")))
+
+// Helper functions
+constexpr int init_constexpr(int x) { return x; };
+extern double init_dynamic(int);
+
+/******************************************************************************
+ * Case one: Static local variables in an externally-visible function
+ ******************************************************************************/
+namespace StaticLocalTest {
+
+inline void has_static_locals_external() {
+  // Mutable
+  static int disallowedStatic1 = 0; // hidden-warning {{'disallowedStatic1' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+  // Initialization might run more than once
+  static const double disallowedStatic2 = disallowedStatic1++; // hidden-warning {{'disallowedStatic2' has hidden visibility, and external linkage; its initialization may run more than once when built into a shared library}}
+  
+  // OK, because immutable and compile-time-initialized
+  static constexpr int allowedStatic1 = 0;
+  static const float allowedStatic2 = 1;
+  static constexpr int allowedStatic3 = init_constexpr(2);
+  static const int allowedStatic4 = init_constexpr(3);
+}
+
+// Don't warn for non-inline functions, since they can't (legally) appear
+// in more than one TU in the first place.
+void has_static_locals_non_inline() {
+  // Mutable
+  static int allowedStatic1 = 0;
+  // Initialization might run more than once
+  static const double allowedStatic2 = allowedStatic1++;
+}
+
+// Everything in this function is OK because the function is TU-local
+static void has_static_locals_internal() {
+  static int allowedStatic1 = 0;
+  static double allowedStatic2 = init_dynamic(2);
+  static char allowedStatic3 = []() { return allowedStatic1++; }();
+
+  static constexpr int allowedStatic4 = 0;
+  static const float allowedStatic5 = 1;
+  static constexpr int allowedStatic6 = init_constexpr(2);
+  static const int allowedStatic7 = init_constexpr(3);
+}
+
+namespace {
+
+// Everything in this function is OK because the function is also TU-local
+void has_static_locals_anon() {
+  static int allowedStatic1 = 0;
+  static double allowedStatic2 = init_dynamic(2);
+  static char allowedStatic3 = []() { return allowedStatic1++; }();
+
+  static constexpr int allowedStatic4 = 0;
+  static const float allowedStatic5 = 1;
+  static constexpr int allowedStatic6 = init_constexpr(2);
+  static const int allowedStatic7 = init_constexpr(3);
+} 
+
+} // Anonymous namespace
+
+HIDDEN inline void static_local_always_hidden() {
+    static int disallowedStatic1 = 3; // hidden-warning {{'disallowedStatic1' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+                                      // expected-warning@-1 {{'disallowedStatic1' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+    {
+      static int disallowedStatic2 = 3; // hidden-warning {{'disallowedStatic2' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+                                        // expected-warning@-1 {{'disallowedStatic2' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+    }
+
+    auto lmb = []() {
+      static int disallowedStatic3 = 3; // hidden-warning {{'disallowedStatic3' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+                                        // expected-warning@-1 {{'disallowedStatic3' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+    };
+}
+
+DEFAULT void static_local_never_hidden() {
+    static int allowedStatic1 = 3; 
+
+    {
+      static int allowedStatic2 = 3; 
+    }
+
+    auto lmb = []() {
+      static int allowedStatic3 = 3;
+    };
+}
+
+// Don't warn on this because it's not in a function
+const int setByLambda = ([]() { static int x = 3; return x++; })();
+
+inline void has_extern_local() {
+  extern int allowedAddressExtern; // Not a definition
+}
+
+inline void has_regular_local() {
+  int allowedAddressLocal = 0;
+}
+
+inline void has_thread_local() {
+  // thread_local variables are static by default
+  thread_local int disallowedThreadLocal = 0; // hidden-warning {{'disallowedThreadLocal' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+}
+
+} // namespace StaticLocalTest
+
+/******************************************************************************
+ * Case two: Globals with external linkage
+ ******************************************************************************/
+namespace GlobalTest {
+  // Mutable
+  inline float disallowedGlobal1 = 3.14; // hidden-warning {{'disallowedGlobal1' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+  // Same as above, but explicitly marked inline
+  inline float disallowedGlobal4 = 3.14; // hidden-warning {{'disallowedGlobal4' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+  
+  // Initialization might run more than once
+  inline const double disallowedGlobal5 = disallowedGlobal1++; // hidden-warning {{'disallowedGlobal5' has hidden visibility, and external linkage; its initialization may run more than once when built into a shared library}}
+
+  // OK because internal linkage, so duplication is intended
+  static float allowedGlobal1 = 3.14;
+  const double allowedGlobal2 = init_dynamic(2);
+  static const char allowedGlobal3 = []() { return disallowedGlobal1++; }();
+  static inline double allowedGlobal4 = init_dynamic(2);
+
+  // OK, because immutable and compile-time-initialized
+  constexpr int allowedGlobal5 = 0;
+  const float allowedGlobal6 = 1;
+  constexpr int allowedGlobal7 = init_constexpr(2);
+  const int allowedGlobal8 = init_constexpr(3);
+
+  // We don't warn on this because non-inline variables can't (legally) appear
+  // in more than one TU.
+  float allowedGlobal9 = 3.14;
+  
+  // Pointers need to be double-const-qualified
+  inline float& nonConstReference = disallowedGlobal1; // hidden-warning {{'nonConstReference' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+  const inline int& constReference = allowedGlobal5;
+
+  inline int* nonConstPointerToNonConst = nullptr; // hidden-warning {{'nonConstPointerToNonConst' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+  inline int const* nonConstPointerToConst = nullptr; // hidden-warning {{'nonConstPointerToConst' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+  inline int* const constPointerToNonConst = nullptr; // hidden-warning {{'constPointerToNonConst' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+  inline int const* const constPointerToConst = nullptr;
+  // Don't warn on new because it tends to generate false positives
+  inline int const* const constPointerToConstNew = new int(7);
+
+  inline int const * const * const * const nestedConstPointer = nullptr;
+  inline int const * const ** const * const nestedNonConstPointer = nullptr; // hidden-warning {{'nestedNonConstPointer' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+
+  struct Test {
+    static inline float disallowedStaticMember1; // hidden-warning {{'disallowedStaticMember1' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}       
+    // Defined below, in the header file
+    static float disallowedStaticMember2;                                       
+    // Defined in the cpp file, so won't get duplicated
+    static float allowedStaticMember1;
+
+    // Tests here are sparse because the AddrTest case below will define plenty
+    // more, which aren't problematic to define (because they're immutable), but
+    // may still cause problems if their address is taken.
+  };
+
+  inline float Test::disallowedStaticMember2 = 2.3; // hidden-warning {{'disallowedStaticMember2' is mutable, has hidden visibility, and external linkage; it may be duplicated when built into a shared library}}
+} // namespace GlobalTest
\ No newline at end of file

>From 2d6b13cc65d6b8c84e36ce50ff23bd31d671dbcd Mon Sep 17 00:00:00 2001
From: Devon Loehr <dloehr at google.com>
Date: Mon, 2 Dec 2024 20:39:15 +0000
Subject: [PATCH 2/2] Silence object duplication warning

The newly-added unique-object-duplication warning points out that
the libc++ debug randomization seed will be duplicated if built
into a shared library. To completely avoid the problem, the best
solution would be to make it externally visible, but that would
change the ABI. Instead, since it's only used for testing, we
silence the warning by making it const.
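
For context, a simplified sketch of the pattern being changed (the names
below are hypothetical stand-ins, not the libc++ identifiers): the seed
is derived from the address of a static local, so the object's value is
never read and making it const should not change how the seed is
computed. Each duplicated copy would still yield its own address, which
is the "value of the address" case that the warning does not diagnose.

```cpp
#include <cstdint>

inline std::uintptr_t address_seed() {
  // Const and constant-initialized, so the object is no longer mutable.
  static const char anchor = '\0';
  // Only the address is used; it is stable within one copy of the function.
  return reinterpret_cast<std::uintptr_t>(&anchor);
}
```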
---
 libcxx/include/__algorithm/shuffle.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libcxx/include/__algorithm/shuffle.h b/libcxx/include/__algorithm/shuffle.h
index 7177fbb469ba7c..6f870d636c5682 100644
--- a/libcxx/include/__algorithm/shuffle.h
+++ b/libcxx/include/__algorithm/shuffle.h
@@ -56,7 +56,7 @@ class _LIBCPP_EXPORTED_FROM_ABI __libcpp_debug_randomizer {
 #ifdef _LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY_SEED
     return _LIBCPP_DEBUG_RANDOMIZE_UNSPECIFIED_STABILITY_SEED;
 #else
-    static char __x;
+    static const char __x = '\0';
     return reinterpret_cast<uintptr_t>(&__x);
 #endif
   }


