[clang] [Clang][Sema] Fix crash in RemoveNestedImmediateInvocation with __builtin_dump_struct (PR #192880)
Vladislav Semykin via cfe-commits
cfe-commits at lists.llvm.org
Sun Apr 19 15:52:25 PDT 2026
https://github.com/ViNN280801 created https://github.com/llvm/llvm-project/pull/192880
## Motivation
`ComplexRemove` - the `TreeTransform` specialization used by `Sema::PopExpressionEvaluationContext` to strip nested `ConstantExpr` wrappers around immediate invocations - inherits the default `TreeTransform::TransformOpaqueValueExpr`, which asserts on any `OpaqueValueExpr` whose `SourceExpr` is non-null unless the derived class has already set up its own binding.
For `__builtin_dump_struct`, `BuiltinDumpStructGenerator` (in `clang/lib/Sema/SemaChecking.cpp`) binds the record pointer to an `OpaqueValueExpr` so that synthesized `MemberExpr(OVE, field_i)` references don't re-evaluate the base, and emits a `PseudoObjectExpr` (syntactic form = the original call; semantic form = the OVE binding followed by the generated printing calls).
When the callable argument is immediate-escalated - for example a function template whose body uses `__builtin_is_within_lifetime` - one of the synthesized `CallExpr`s inside the PSE's semantic form becomes an immediate-invocation candidate in the enclosing evaluation context.
Background: [P2641R4](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2641r4.html) (*Checking if a union alternative is active*) proposes the standard **`consteval`** library function `std::is_within_lifetime(const T*)` for C++26. The paper generalizes the original “is this the active member of a `union`?" question to “does this pointer refer to a subobject that is within its lifetime?" — the motivating use cases include constexpr-safe optional-like storage and avoiding undefined behavior from reading inactive union members during constant evaluation. Clang exposes the same semantics as the immediate builtin `__builtin_is_within_lifetime`, which participates in immediate-function / escalation rules like any other `consteval` call.
`RemoveNestedImmediateInvocation` roots `ComplexRemove` at that inner `CallExpr`, so the transform never enters the outer `PseudoObjectExpr`, and the default `TransformOpaqueValueExpr` is reached with an un-mapped OVE. Assertion at `clang/lib/Sema/TreeTransform.h:13445`:
```
Assertion `(!E->getSourceExpr() || getDerived().AlreadyTransformed(E->getType()))
&& "opaque value expression requires transformation"' failed.
```
Minimal reproducer (from #192846):
```cpp
struct S {};
template <typename... T>
constexpr void F(S &out, const char *fmt, T... args) {
while (__builtin_is_within_lifetime(&foo)) {} // foo undeclared
}
template <class T>
class C { T value = {}; };
void bar() {
S s;
__builtin_dump_struct((C<int> *)nullptr, F, s);
}
```
## Related issues
- Closes #192846
- Same class of crash, different trigger paths: #55597, clangd#1311 (this patch does not claim to close those; no reproducers from them are included here).
## Notes for reviewers
**Why not `return E` (or a fake `AlreadyTransformed`)** - Skipping `TransformExpr(SourceExpr)` would leave nested `ConstantExpr` immediate wrappers under the OVE binding in place whenever they exist, which undermines what `ComplexRemove` is supposed to do for nested immediates; the inline comment above `TransformOpaqueValueExpr` spells out the full comparison with the other rejected approaches.
**OVE / `PseudoObjectExpr` / `setSubExpr`** - `RemoveNestedImmediateInvocation` only updates the operand of the nested immediate: on success it assigns `ConstantExpr::setSubExpr(Res.get())` where `Res` is `TransformExpr` of that operand. Within that walk, the `DenseMap` memoizes each OVE so all uses **reachable from this transform root** share one mapped node. A new `OpaqueValueExpr` is allocated only when the transformed `SourceExpr` is a different `Expr*`; the motivating #192846 / `dump_struct` case usually returns the original OVE (`NewSrc == Src`). The provisional `TransformedOpaqueValues[E] = E` is inserted **before** recursing into `SourceExpr`; after recursion we only write back **by key** (`TransformedOpaqueValues[E] = Result`), never via an iterator held across `TransformExpr`, so `DenseMap` rehash cannot invalidate the update.
## What changed
* `clang/lib/Sema/SemaExpr.cpp` - added a `TransformOpaqueValueExpr` override in `ComplexRemove`. It recurses into `SourceExpr`, allocates a fresh `OpaqueValueExpr` only when the transformed source differs from the original, and memoizes via a local `DenseMap<OpaqueValueExpr*, OpaqueValueExpr*>` with a provisional self-mapping inserted before recursion to break cycles. A detailed rationale - alternatives considered and rejected (return `E` unchanged; override `AlreadyTransformed` to always return true; in-place `setSourceExpr` mutation; restructuring `__builtin_dump_struct`'s AST), the reasoning for recursing into `SourceExpr`, the known risk of replacing the OVE node, and the iterator-invalidation note for `DenseMap` - is kept in the inline comment block immediately above the new method so it stays discoverable at `git blame` / `git log -p` time instead of in this PR description.
* `clang/docs/ReleaseNotes.rst` - entry added under *Miscellaneous Clang Crashes Fixed*.
* No changes to the AST, to `OpaqueValueExpr`'s public API, or to `BuiltinDumpStructGenerator` - the fix is confined to `ComplexRemove`.
## Testing
Build (Linux x86-64, Ubuntu 24.04, Clang built from this branch with assertions enabled):
```bash
cmake -S llvm -B build -G Ninja \
-DCMAKE_BUILD_TYPE=Release \
-DLLVM_ENABLE_PROJECTS=clang \
-DLLVM_ENABLE_ASSERTIONS=ON
ninja -C build
```
### Regression testing
```bash
ninja -C build check-clang-sema
ninja -C build check-clang-sematemplate
ninja -C build check-clang-codegen
```
Output:
```log
ninja: Entering directory `build'
[0/1] Running lit suite /home/loveit0/Documents/Work/llvm-project/clang/test/Sema
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin
Testing Time: 8.86s
Total Discovered Tests: 1348
Unsupported: 2 (0.15%)
Passed : 1346 (99.85%)
ninja: Entering directory `build'
[0/1] Running lit suite /home/loveit0/Documents/Work/llvm-project/clang/test/SemaTemplate
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin
Testing Time: 1.35s
Total Discovered Tests: 323
Passed : 322 (99.69%)
Expectedly Failed: 1 (0.31%)
ninja: Entering directory `build'
[0/1] Running lit suite /home/loveit0/Documents/Work/llvm-project/clang/test/CodeGen
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin
Testing Time: 182.36s
Total Discovered Tests: 6022
Unsupported : 5 (0.08%)
Passed : 6012 (99.83%)
Expectedly Failed: 5 (0.08%)
```
### New own tests
```bash
./build/bin/llvm-lit -v ./clang/test/SemaCXX/builtin-dump-struct-immediate.cpp && ./build/bin/llvm-lit -v ./clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp
```
Output:
```log
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin
-- Testing: 1 tests, 1 workers --
PASS: Clang :: SemaCXX/builtin-dump-struct-immediate.cpp (1 of 1)
Testing Time: 0.11s
Total Discovered Tests: 1
Passed: 1 (100.00%)
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin
-- Testing: 1 tests, 1 workers --
PASS: Clang :: SemaCXX/builtin-dump-struct-immediate-clean.cpp (1 of 1)
Testing Time: 0.07s
Total Discovered Tests: 1
Passed: 1 (100.00%)
```
The following existing tests are specifically verified to pass unchanged:
* `clang/test/SemaCXX/builtin-dump-struct.cpp`
* `clang/test/SemaCXX/consteval*.cpp`
* `clang/test/CodeGen/dump-struct-builtin.c`
Regression tests added (names match the `llvm-lit` commands above - an earlier draft incorrectly said `builtin-dump-struct-consteval.cpp`):
* `clang/test/SemaCXX/builtin-dump-struct-immediate.cpp` - #192846-style path with `__builtin_is_within_lifetime` (well-formed `VALID_CASE` plus `-verify` invalid / undeclared `foo`; assertion must not fire).
* `clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp` - minimal well-formed case, `expected-no-diagnostics`.
Together they cover:
1. The #192846 immediate-escalated callable + `__builtin_dump_struct` scenario (invalid code must diagnose cleanly, no assert).
2. Repeated handling of the same dump-struct binding OVE along the expansion (memoization / `TransformOpaqueValueExpr` without tripping the default assert).
3. A non-regression-style well-formed case (clean variant) so the ordinary path stays healthy.
>From 122f1193a0a0c5b8d25b088ecc5499521585ae81 Mon Sep 17 00:00:00 2001
From: ViNN280801 <vladislav.semykin at gmail.com>
Date: Mon, 20 Apr 2026 01:46:40 +0300
Subject: [PATCH] [Clang][Sema] Fix crash in RemoveNestedImmediateInvocation
with __builtin_dump_struct
Signed-off-by: ViNN280801 <vladislav.semykin at gmail.com>
---
clang/docs/ReleaseNotes.rst | 3 +
clang/lib/Sema/SemaExpr.cpp | 66 +++++++++++++++++++
.../builtin-dump-struct-immediate-clean.cpp | 29 ++++++++
.../SemaCXX/builtin-dump-struct-immediate.cpp | 63 ++++++++++++++++++
4 files changed, 161 insertions(+)
create mode 100644 clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp
create mode 100644 clang/test/SemaCXX/builtin-dump-struct-immediate.cpp
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 638a813ca105b..f4d2d7c63c902 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -527,6 +527,9 @@ Miscellaneous Clang Crashes Fixed
- Fixed an assertion failure when parsing an invalid out-of-line enum definition with template parameters. (#GH187909)
- Fixed an assertion failure on invalid template template parameter during typo correction. (#GH183983)
- Fixed an assertion failure in ``isAtEndOfMacroExpansion`` on macro expansions crossing the boundary of two fileIDs. (#GH115007), (#GH21755)
+- Fixed an assertion failure when ``__builtin_dump_struct`` is called with an
+ immediate-escalated callable (for example a function whose body contains
+ ``__builtin_is_within_lifetime``). (#GH192846)
OpenACC Specific Changes
------------------------
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index cf235095d489d..204888e8503e5 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -18463,6 +18463,72 @@ static void RemoveNestedImmediateInvocation(
// Lambdas have already been processed inside their eval contexts.
return E;
}
+
+ // Default TreeTransform::TransformOpaqueValueExpr requires either no
+ // SourceExpr or AlreadyTransformed(getType()). ComplexRemove inherits the
+ // base AlreadyTransformed implementation (true only for the null type), so
+ // any OpaqueValueExpr that carries a SourceExpr would assert.
+ //
+ // That arises for __builtin_dump_struct: the record pointer is bound in an
+ // OpaqueValueExpr and reused as the base of synthesized MemberExprs. When
+ // peeling nested immediate invocations we must recurse through that source
+ // as well; otherwise ConstantExpr wrappers inside the source would survive
+ // incorrectly.
+ //
+ // Alternatives considered:
+ // - Returning the original OVE without traversing SourceExpr silences the
+ // assert but violates this pass's purpose for nested immediates under the
+ // binding expression.
+ // - Overriding AlreadyTransformed to always return true skips traversal for
+ // the same reason.
+ // - Restructuring __builtin_dump_struct's AST would touch many consumers.
+ // - In-place mutation of SourceExpr would preserve node identity, but
+ // OpaqueValueExpr exposes no mutator for SourceExpr; widening the AST API
+ // for this localized issue is disproportionate.
+ //
+ // Returning E without traversing SourceExpr could also rely on the
+ // enclosing PseudoObjectExpr to evaluate nested immediates correctly, but
+ // ComplexRemove is intended to strip nested immediate-invocation wrappers
+ // throughout the transformed subtree; recursing through SourceExpr matches
+ // that contract.
+ //
+ // Risk: code that keyed on the exact OpaqueValueExpr* for a dump-struct
+ // binding could theoretically desynchronize if we replace the node. No
+ // such registry is known; we only allocate a replacement when the
+ // transformed source is a different Expr*.
+ //
+ // Note: DenseMap may rehash on insert; do not hold iterators/pointers
+ // across TransformExpr(Src), since nested OVEs may insert into this map.
+ llvm::DenseMap<OpaqueValueExpr *, OpaqueValueExpr *> TransformedOpaqueValues;
+
+ ExprResult TransformOpaqueValueExpr(OpaqueValueExpr *E) {
+ if (auto It = TransformedOpaqueValues.find(E);
+ It != TransformedOpaqueValues.end())
+ return It->second;
+
+ // Provisional self-mapping breaks cycles if SourceExpr reaches this OVE.
+ TransformedOpaqueValues[E] = E;
+
+ Expr *Src = E->getSourceExpr();
+ if (!Src)
+ return E;
+
+ ExprResult NewSrc = getDerived().TransformExpr(Src);
+ if (NewSrc.isInvalid()) {
+ TransformedOpaqueValues.erase(E);
+ return ExprError();
+ }
+
+ if (NewSrc.get() == Src)
+ return E;
+
+ auto *Result = new (SemaRef.getASTContext())
+ OpaqueValueExpr(E->getLocation(), E->getType(), E->getValueKind(),
+ E->getObjectKind(), NewSrc.get());
+ TransformedOpaqueValues[E] = Result;
+ return Result;
+ }
+
bool AlwaysRebuild() { return false; }
bool ReplacingOriginal() { return true; }
bool AllowSkippingCXXConstructExpr() {
diff --git a/clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp b/clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp
new file mode 100644
index 0000000000000..bda7cb1a12dd1
--- /dev/null
+++ b/clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp
@@ -0,0 +1,29 @@
+// RUN: %clang_cc1 -std=c++2c -fsyntax-only -Wno-unused -verify %s
+// RUN: %clang_cc1 -std=c++23 -fsyntax-only -Wno-unused -verify %s
+
+// expected-no-diagnostics
+
+// GH192846: minimal well-formed case - immediate-escalated printing function
+// with __builtin_dump_struct must not crash in ComplexRemove. See sibling
+// builtin-dump-struct-immediate.cpp for C++26 vs language-mode notes.
+
+#if !__has_builtin(__builtin_is_within_lifetime)
+#error "test requires __builtin_is_within_lifetime"
+#endif
+
+struct S {};
+
+template <typename... T>
+constexpr void F(S &out, const char *fmt, T... args) {
+ (void)__builtin_is_within_lifetime(&out);
+}
+
+template <class T>
+class C {
+ T value = {};
+};
+
+void bar() {
+ S s;
+ __builtin_dump_struct((C<int> *)nullptr, F, s);
+}
diff --git a/clang/test/SemaCXX/builtin-dump-struct-immediate.cpp b/clang/test/SemaCXX/builtin-dump-struct-immediate.cpp
new file mode 100644
index 0000000000000..f094e184178c1
--- /dev/null
+++ b/clang/test/SemaCXX/builtin-dump-struct-immediate.cpp
@@ -0,0 +1,63 @@
+// RUN: %clang_cc1 -std=c++23 -fsyntax-only -Wno-unused %s -DVALID_CASE
+// RUN: %clang_cc1 -std=c++2c -fsyntax-only -Wno-unused %s -DVALID_CASE
+// RUN: %clang_cc1 -std=c++23 -fsyntax-only -Wno-unused -verify %s
+// RUN: %clang_cc1 -std=c++2c -fsyntax-only -Wno-unused -verify %s
+
+// Source: GH192846 reported by @k-arrows with reproducer: https://godbolt.org/z/9aa43GE1a
+//
+// P2641R4 proposes std::is_within_lifetime(const T*) (consteval) for C++26 -
+// see https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2641r4.html
+// ("active union member" generalized to "within lifetime" for constexpr).
+// Clang: __builtin_is_within_lifetime (immediate builtin). The crash fixed here
+// is in Sema immediate-invocation handling, not in a particular -std mode, so
+// we run under both C++23 and C++2c.
+
+// Regression test for GH192846: ComplexRemove must traverse OpaqueValueExpr
+// sources (e.g. from __builtin_dump_struct) when stripping nested immediate
+// invocations; the default TreeTransform hook would assert.
+
+#if !__has_builtin(__builtin_is_within_lifetime)
+#error "test requires __builtin_is_within_lifetime"
+#endif
+
+#ifdef VALID_CASE
+
+// dump_struct may manifestly-constant-evaluate the printing callback; the
+// consteval builtin must only read objects usable in constant expressions.
+constexpr int foo = 0;
+
+template <typename T>
+struct C {
+ int n;
+};
+
+// The dump_struct machinery invokes this with a synthesized argument list; it
+// must not be a template (template arguments are not deduced from that call).
+void F(const char *fmt, ...) {
+ (void)__builtin_is_within_lifetime(&foo);
+}
+
+void bar() {
+ constexpr char s[] = "";
+ __builtin_dump_struct((C<int> *)nullptr, F, s);
+}
+
+#else
+
+template <typename T>
+struct C {
+ int n;
+};
+
+void F(const char *fmt, ...) {
+ // Error recovery can report both the bad identifier and an invalid use of the
+ // consteval builtin on the same expression.
+ (void)__builtin_is_within_lifetime(&foo); // expected-error {{use of undeclared identifier 'foo'}} expected-error {{cannot take address of consteval function '__builtin_is_within_lifetime' outside of an immediate invocation}}
+}
+
+void bar() {
+ constexpr char s[] = "";
+ __builtin_dump_struct((C<int> *)nullptr, F, s);
+}
+
+#endif
More information about the cfe-commits
mailing list