[clang] [analyzer] Revert #115918, so empty base class optimization works again (PR #157480)
Balazs Benics via cfe-commits
cfe-commits at lists.llvm.org
Mon Sep 8 07:49:38 PDT 2025
https://github.com/steakhal created https://github.com/llvm/llvm-project/pull/157480
Tldr;
We can't unconditionally trivially copy empty classes because that would clobber the stored entries in the object that the optimized empty class overlaps with.
This regression was introduced by #115918, which introduced other clobbering issues, like the handling of `[[no_unique_address]]` fields in #137252.
Read issue #157467 for the detailed explanation, but in short, I'd propose reverting the original patch because these was a lot of problems with it for arguably not much gain.
In particular, that patch was motivated by unifying the handling of classes so that copy events would be triggered for a class no matter if it had data members or not.
So in hindsight, it was not worth it.
I plan to backport this to clang-21 as well, and mention in the release notes that this should fix the regression from clang-20.
Fixes #157467
>From 63b1eea84ebd788a235b9e2ad87b308ded5b8254 Mon Sep 17 00:00:00 2001
From: Balazs Benics <benicsbalazs at gmail.com>
Date: Mon, 8 Sep 2025 16:47:33 +0200
Subject: [PATCH] [analyzer] Revert #115918, so empty base class optimization
works again
Tldr;
We can't unconditionally trivially copy empty classes because that would
clobber the stored entries in the object that the optimized empty class
overlaps with.
This regression was introduced by #115918, which introduced other
clobbering issues, like the handling of `[[no_unique_address]]` fields
in #137252.
Read issue #157467 for the detailed explanation, but in short, I'd
propose reverting the original patch because these was a lot of problems
with it for arguably not much gain.
In particular, that patch was motivated by unifying the handling of
classes so that copy events would be triggered for a class no matter if
it had data members or not.
So in hindsight, it was not worth it.
I plan to backport this to clang-21 as well, and mention in the release
notes that this should fix the regression from clang-20.
Fixes #157467
---
.../lib/StaticAnalyzer/Core/ExprEngineCXX.cpp | 35 ++++++++++-------
clang/test/Analysis/ctor-trivial-copy.cpp | 17 ++------
clang/test/Analysis/issue-157467.cpp | 39 +++++++++++++++++++
clang/test/Analysis/taint-generic.cpp | 3 +-
4 files changed, 66 insertions(+), 28 deletions(-)
create mode 100644 clang/test/Analysis/issue-157467.cpp
diff --git a/clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp b/clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
index c0b28d2ebb212..dee34e3e9d6a5 100644
--- a/clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
+++ b/clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp
@@ -71,21 +71,30 @@ void ExprEngine::performTrivialCopy(NodeBuilder &Bldr, ExplodedNode *Pred,
Bldr.takeNodes(Pred);
assert(ThisRD);
- SVal V = Call.getArgSVal(0);
- const Expr *VExpr = Call.getArgExpr(0);
- // If the value being copied is not unknown, load from its location to get
- // an aggregate rvalue.
- if (std::optional<Loc> L = V.getAs<Loc>())
- V = Pred->getState()->getSVal(*L);
- else
- assert(V.isUnknownOrUndef());
+ if (!ThisRD->isEmpty()) {
+ SVal V = Call.getArgSVal(0);
+ const Expr *VExpr = Call.getArgExpr(0);
- ExplodedNodeSet Tmp;
- evalLocation(Tmp, CallExpr, VExpr, Pred, Pred->getState(), V,
- /*isLoad=*/true);
- for (ExplodedNode *N : Tmp)
- evalBind(Dst, CallExpr, N, ThisVal, V, !AlwaysReturnsLValue);
+ // If the value being copied is not unknown, load from its location to get
+ // an aggregate rvalue.
+ if (std::optional<Loc> L = V.getAs<Loc>())
+ V = Pred->getState()->getSVal(*L);
+ else
+ assert(V.isUnknownOrUndef());
+
+ ExplodedNodeSet Tmp;
+ evalLocation(Tmp, CallExpr, VExpr, Pred, Pred->getState(), V,
+ /*isLoad=*/true);
+ for (ExplodedNode *N : Tmp)
+ evalBind(Dst, CallExpr, N, ThisVal, V, !AlwaysReturnsLValue);
+ } else {
+ // We can't copy empty classes because of empty base class optimization.
+ // In that case, copying the empty base class subobject would overwrite the
+ // object that it overlaps with - so let's not do that.
+ // See issue-157467.cpp for an example.
+ Dst.Add(Pred);
+ }
PostStmt PS(CallExpr, LCtx);
for (ExplodedNode *N : Dst) {
diff --git a/clang/test/Analysis/ctor-trivial-copy.cpp b/clang/test/Analysis/ctor-trivial-copy.cpp
index 45c8ca4c51776..940ff9ba3ed9c 100644
--- a/clang/test/Analysis/ctor-trivial-copy.cpp
+++ b/clang/test/Analysis/ctor-trivial-copy.cpp
@@ -46,15 +46,10 @@ void _01_empty_structs() {
empty Empty = conjure<empty>();
empty Empty2 = Empty;
empty Empty3 = Empty2;
- // All of these should refer to the exact same symbol, because all of
- // these trivial copies refer to the original conjured value.
- // There were Unknown before:
- clang_analyzer_denote(Empty, "$Empty");
- clang_analyzer_express(Empty); // expected-warning {{$Empty}}
- clang_analyzer_express(Empty2); // expected-warning {{$Empty}}
- clang_analyzer_express(Empty3); // expected-warning {{$Empty}}
- // We should have the same Conjured symbol for "Empty", "Empty2" and "Empty3".
+ // We only have binding for the original Empty object, because copying empty
+ // objects is a no-op in the performTrivialCopy. This is fine, because empty
+ // objects don't have any data members that could be accessed anyway.
clang_analyzer_printState();
// CHECK: "store": { "pointer": "0x{{[0-9a-f]+}}", "items": [
// CHECK-NEXT: { "cluster": "GlobalInternalSpaceRegion", "pointer": "0x{{[0-9a-f]+}}", "items": [
@@ -65,12 +60,6 @@ void _01_empty_structs() {
// CHECK-NEXT: ]},
// CHECK-NEXT: { "cluster": "Empty", "pointer": "0x{{[0-9a-f]+}}", "items": [
// CHECK-NEXT: { "kind": "Default", "offset": 0, "value": "[[EMPTY_CONJ:conj_\$[0-9]+{int, LC[0-9]+, S[0-9]+, #[0-9]+}]]" }
- // CHECK-NEXT: ]},
- // CHECK-NEXT: { "cluster": "Empty2", "pointer": "0x{{[0-9a-f]+}}", "items": [
- // CHECK-NEXT: { "kind": "Default", "offset": 0, "value": "[[EMPTY_CONJ]]" }
- // CHECK-NEXT: ]},
- // CHECK-NEXT: { "cluster": "Empty3", "pointer": "0x{{[0-9a-f]+}}", "items": [
- // CHECK-NEXT: { "kind": "Default", "offset": 0, "value": "[[EMPTY_CONJ]]" }
// CHECK-NEXT: ]}
// CHECK-NEXT: ]},
diff --git a/clang/test/Analysis/issue-157467.cpp b/clang/test/Analysis/issue-157467.cpp
new file mode 100644
index 0000000000000..8281ea1ee1aed
--- /dev/null
+++ b/clang/test/Analysis/issue-157467.cpp
@@ -0,0 +1,39 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core -verify %s
+// expected-no-diagnostics
+
+template <class T, int Idx, bool CanBeEmptyBase = __is_empty(T) && (!__is_final(T))>
+struct compressed_pair_elem {
+ explicit compressed_pair_elem(T u) : value(u) {}
+ T value;
+};
+
+template <class T, int Idx>
+struct compressed_pair_elem<T, Idx, /*CanBeEmptyBase=*/true> : T {
+ explicit compressed_pair_elem(T u) : T(u) {}
+};
+
+template <class T1, class T2, class Base1 = compressed_pair_elem<T1, 0>, class Base2 = compressed_pair_elem<T2, 1>>
+struct compressed_pair : Base1, Base2 {
+ explicit compressed_pair(T1 t1, T2 t2) : Base1(t1), Base2(t2) {}
+};
+
+// empty deleter object
+template <class T>
+struct default_delete {
+ void operator()(T* p) {
+ delete p;
+ }
+};
+
+template <class T, class Deleter = default_delete<T> >
+struct some_unique_ptr {
+ // compressed_pair will employ the empty base class optimization, thus overlapping
+ // the `int*` and the empty `Deleter` object, clobbering the pointer.
+ compressed_pair<int*, Deleter> ptr;
+ some_unique_ptr(int* p, Deleter d) : ptr(p, d) {}
+ ~some_unique_ptr();
+};
+
+void entry_point() {
+ some_unique_ptr<int, default_delete<int> > u3(new int(12), default_delete<int>());
+}
diff --git a/clang/test/Analysis/taint-generic.cpp b/clang/test/Analysis/taint-generic.cpp
index c080313e4d172..fc7c37300d3fc 100644
--- a/clang/test/Analysis/taint-generic.cpp
+++ b/clang/test/Analysis/taint-generic.cpp
@@ -153,8 +153,9 @@ void top() {
int Int = mySource1<int>();
clang_analyzer_isTainted(Int); // expected-warning {{YES}}
+ // It's fine to not propagate taint to empty classes, since they don't have any data members.
Empty E = mySource1<Empty>();
- clang_analyzer_isTainted(E); // expected-warning {{YES}}
+ clang_analyzer_isTainted(E); // expected-warning {{NO}}
Aggr A = mySource1<Aggr>();
clang_analyzer_isTainted(A); // expected-warning {{YES}}
More information about the cfe-commits
mailing list