[clang] [Clang][Sema] Fix crash in RemoveNestedImmediateInvocation with __builtin_dump_struct (PR #192880)

Vladislav Semykin via cfe-commits cfe-commits at lists.llvm.org
Sun Apr 19 15:52:25 PDT 2026


https://github.com/ViNN280801 created https://github.com/llvm/llvm-project/pull/192880

## Motivation

`ComplexRemove` - the `TreeTransform` specialization used by `Sema::PopExpressionEvaluationContext` to strip nested `ConstantExpr` wrappers around immediate invocations - inherits the default `TreeTransform::TransformOpaqueValueExpr`, which asserts on any `OpaqueValueExpr` whose `SourceExpr` is non-null unless the derived class has already set up its own binding.

For `__builtin_dump_struct`, `BuiltinDumpStructGenerator` (in `clang/lib/Sema/SemaChecking.cpp`) binds the record pointer to an `OpaqueValueExpr` so that synthesized `MemberExpr(OVE, field_i)` references don't re-evaluate the base, and emits a `PseudoObjectExpr` (syntactic form = the original call; semantic form = the OVE binding followed by the generated printing calls).

When the callable argument is immediate-escalated - for example a function template whose body uses `__builtin_is_within_lifetime` - one of the synthesized `CallExpr`s inside the PSE's semantic form becomes an immediate-invocation candidate in the enclosing evaluation context.

Background: [P2641R4](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2641r4.html) (*Checking if a union alternative is active*) proposes the standard **`consteval`** library function `std::is_within_lifetime(const T*)` for C++26. The paper generalizes the original “is this the active member of a `union`?" question to “does this pointer refer to a subobject that is within its lifetime?" — the motivating use cases include constexpr-safe optional-like storage and avoiding undefined behavior from reading inactive union members during constant evaluation. Clang exposes the same semantics as the immediate builtin `__builtin_is_within_lifetime`, which participates in immediate-function / escalation rules like any other `consteval` call.

`RemoveNestedImmediateInvocation` roots `ComplexRemove` at that inner `CallExpr`, so the transform never enters the outer `PseudoObjectExpr`, and the default `TransformOpaqueValueExpr` is reached with an un-mapped OVE. Assertion at `clang/lib/Sema/TreeTransform.h:13445`:

```
Assertion `(!E->getSourceExpr() || getDerived().AlreadyTransformed(E->getType()))
        && "opaque value expression requires transformation"' failed.
```

Minimal reproducer (from #192846):

```cpp
struct S {};

template <typename... T>
constexpr void F(S &out, const char *fmt, T... args) {
  while (__builtin_is_within_lifetime(&foo)) {}   // foo undeclared
}

template <class T>
class C { T value = {}; };

void bar() {
  S s;
  __builtin_dump_struct((C<int> *)nullptr, F, s);
}
```

## Related issues

- Closes #192846
- Same class of crash, different trigger paths: #55597, clangd#1311 (this patch does not claim to close those; no reproducers from them are included here).

## Notes for reviewers

**Why not `return E` (or a fake `AlreadyTransformed`)** - Skipping `TransformExpr(SourceExpr)` would leave nested `ConstantExpr` immediate wrappers under the OVE binding in place whenever they exist, which undermines what `ComplexRemove` is supposed to do for nested immediates; the inline comment above `TransformOpaqueValueExpr` spells out the full comparison with the other rejected approaches.

**OVE / `PseudoObjectExpr` / `setSubExpr`** - `RemoveNestedImmediateInvocation` only updates the operand of the nested immediate: on success it assigns `ConstantExpr::setSubExpr(Res.get())` where `Res` is `TransformExpr` of that operand. Within that walk, the `DenseMap` memoizes each OVE so all uses **reachable from this transform root** share one mapped node. A new `OpaqueValueExpr` is allocated only when the transformed `SourceExpr` is a different `Expr*`; the motivating #192846 / `dump_struct` case usually returns the original OVE (`NewSrc == Src`). The provisional `TransformedOpaqueValues[E] = E` is inserted **before** recursing into `SourceExpr`; after recursion we only write back **by key** (`TransformedOpaqueValues[E] = Result`), never via an iterator held across `TransformExpr`, so `DenseMap` rehash cannot invalidate the update.

## What changed

* `clang/lib/Sema/SemaExpr.cpp` - added a `TransformOpaqueValueExpr` override in `ComplexRemove`. It recurses into `SourceExpr`, allocates a fresh `OpaqueValueExpr` only when the transformed source differs from the original, and memoizes via a local `DenseMap<OpaqueValueExpr*, OpaqueValueExpr*>` with a provisional self-mapping inserted before recursion to break cycles. A detailed rationale - alternatives considered and rejected (return `E` unchanged; override `AlreadyTransformed` to always return true; in-place `setSourceExpr` mutation; restructuring `__builtin_dump_struct`'s AST), the reasoning for recursing into `SourceExpr`, the known risk of replacing the OVE node, and the iterator-invalidation note for `DenseMap` - is kept in the inline comment block immediately above the new method so it stays discoverable at `git blame` / `git log -p` time instead of in this PR description.
* `clang/docs/ReleaseNotes.rst` - entry added under *Miscellaneous Clang Crashes Fixed*.
* No changes to the AST, to `OpaqueValueExpr`'s public API, or to `BuiltinDumpStructGenerator` - the fix is confined to `ComplexRemove`.

## Testing

Build (Linux x86-64, Ubuntu 24.04, Clang built from this branch with assertions enabled):

```bash
cmake -S llvm -B build -G Ninja  \
  -DCMAKE_BUILD_TYPE=Release     \
  -DLLVM_ENABLE_PROJECTS=clang   \
  -DLLVM_ENABLE_ASSERTIONS=ON
ninja -C build
```

### Regression testing

```bash
ninja -C build check-clang-sema
ninja -C build check-clang-sematemplate
ninja -C build check-clang-codegen
```

Output:

```log
ninja: Entering directory `build'
[0/1] Running lit suite /home/loveit0/Documents/Work/llvm-project/clang/test/Sema
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin

Testing Time: 8.86s

Total Discovered Tests: 1348
  Unsupported:    2 (0.15%)
  Passed     : 1346 (99.85%)
ninja: Entering directory `build'
[0/1] Running lit suite /home/loveit0/Documents/Work/llvm-project/clang/test/SemaTemplate
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin

Testing Time: 1.35s

Total Discovered Tests: 323
  Passed           : 322 (99.69%)
  Expectedly Failed:   1 (0.31%)
ninja: Entering directory `build'
[0/1] Running lit suite /home/loveit0/Documents/Work/llvm-project/clang/test/CodeGen
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin

Testing Time: 182.36s

Total Discovered Tests: 6022
  Unsupported      :    5 (0.08%)
  Passed           : 6012 (99.83%)
  Expectedly Failed:    5 (0.08%)
```

### New own tests

```bash
./build/bin/llvm-lit -v ./clang/test/SemaCXX/builtin-dump-struct-immediate.cpp && ./build/bin/llvm-lit -v ./clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp
```

Output:

```log
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin
-- Testing: 1 tests, 1 workers --
PASS: Clang :: SemaCXX/builtin-dump-struct-immediate.cpp (1 of 1)

Testing Time: 0.11s

Total Discovered Tests: 1
  Passed: 1 (100.00%)
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/config.py:569: note: using clang: /home/loveit0/Documents/Work/llvm-project/build/bin/clang
llvm-lit: /home/loveit0/Documents/Work/llvm-project/llvm/utils/lit/lit/llvm/subst.py:130: note: Did not find cir-opt in /home/loveit0/Documents/Work/llvm-project/build/bin:/home/loveit0/Documents/Work/llvm-project/build/bin
-- Testing: 1 tests, 1 workers --
PASS: Clang :: SemaCXX/builtin-dump-struct-immediate-clean.cpp (1 of 1)

Testing Time: 0.07s

Total Discovered Tests: 1
  Passed: 1 (100.00%)
```

The following existing tests are specifically verified to pass unchanged:

* `clang/test/SemaCXX/builtin-dump-struct.cpp`
* `clang/test/SemaCXX/consteval*.cpp`
* `clang/test/CodeGen/dump-struct-builtin.c`

Regression tests added (names match the `llvm-lit` commands above - an earlier draft incorrectly said `builtin-dump-struct-consteval.cpp`):
* `clang/test/SemaCXX/builtin-dump-struct-immediate.cpp` - #192846-style path with `__builtin_is_within_lifetime` (well-formed `VALID_CASE` plus `-verify` invalid / undeclared `foo`; assertion must not fire).
* `clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp` - minimal well-formed case, `expected-no-diagnostics`.

Together they cover:
1. The #192846 immediate-escalated callable + `__builtin_dump_struct` scenario (invalid code must diagnose cleanly, no assert).
2. Repeated handling of the same dump-struct binding OVE along the expansion (memoization / `TransformOpaqueValueExpr` without tripping the default assert).
3. A non-regression-style well-formed case (clean variant) so the ordinary path stays healthy.


>From 122f1193a0a0c5b8d25b088ecc5499521585ae81 Mon Sep 17 00:00:00 2001
From: ViNN280801 <vladislav.semykin at gmail.com>
Date: Mon, 20 Apr 2026 01:46:40 +0300
Subject: [PATCH] [Clang][Sema] Fix crash in RemoveNestedImmediateInvocation
 with __builtin_dump_struct

Signed-off-by: ViNN280801 <vladislav.semykin at gmail.com>
---
 clang/docs/ReleaseNotes.rst                   |  3 +
 clang/lib/Sema/SemaExpr.cpp                   | 66 +++++++++++++++++++
 .../builtin-dump-struct-immediate-clean.cpp   | 29 ++++++++
 .../SemaCXX/builtin-dump-struct-immediate.cpp | 63 ++++++++++++++++++
 4 files changed, 161 insertions(+)
 create mode 100644 clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp
 create mode 100644 clang/test/SemaCXX/builtin-dump-struct-immediate.cpp

diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 638a813ca105b..f4d2d7c63c902 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -527,6 +527,9 @@ Miscellaneous Clang Crashes Fixed
 - Fixed an assertion failure when parsing an invalid out-of-line enum definition with template parameters. (#GH187909)
 - Fixed an assertion failure on invalid template template parameter during typo correction. (#GH183983)
 - Fixed an assertion failure in ``isAtEndOfMacroExpansion`` on macro expansions crossing the boundary of two fileIDs. (#GH115007), (#GH21755)
+- Fixed an assertion failure when ``__builtin_dump_struct`` is called with an
+  immediate-escalated callable (for example a function whose body contains
+  ``__builtin_is_within_lifetime``). (#GH192846)
 
 OpenACC Specific Changes
 ------------------------
diff --git a/clang/lib/Sema/SemaExpr.cpp b/clang/lib/Sema/SemaExpr.cpp
index cf235095d489d..204888e8503e5 100644
--- a/clang/lib/Sema/SemaExpr.cpp
+++ b/clang/lib/Sema/SemaExpr.cpp
@@ -18463,6 +18463,72 @@ static void RemoveNestedImmediateInvocation(
       // Lambdas have already been processed inside their eval contexts.
       return E;
     }
+
+    // Default TreeTransform::TransformOpaqueValueExpr requires either no
+    // SourceExpr or AlreadyTransformed(getType()). ComplexRemove inherits the
+    // base AlreadyTransformed implementation (true only for the null type), so
+    // any OpaqueValueExpr that carries a SourceExpr would assert.
+    //
+    // That arises for __builtin_dump_struct: the record pointer is bound in an
+    // OpaqueValueExpr and reused as the base of synthesized MemberExprs. When
+    // peeling nested immediate invocations we must recurse through that source
+    // as well; otherwise ConstantExpr wrappers inside the source would survive
+    // incorrectly.
+    //
+    // Alternatives considered:
+    // - Returning the original OVE without traversing SourceExpr silences the
+    //   assert but violates this pass's purpose for nested immediates under the
+    //   binding expression.
+    // - Overriding AlreadyTransformed to always return true skips traversal for
+    //   the same reason.
+    // - Restructuring __builtin_dump_struct's AST would touch many consumers.
+    // - In-place mutation of SourceExpr would preserve node identity, but
+    //   OpaqueValueExpr exposes no mutator for SourceExpr; widening the AST API
+    //   for this localized issue is disproportionate.
+    //
+    // Returning E without traversing SourceExpr could also rely on the
+    // enclosing PseudoObjectExpr to evaluate nested immediates correctly, but
+    // ComplexRemove is intended to strip nested immediate-invocation wrappers
+    // throughout the transformed subtree; recursing through SourceExpr matches
+    // that contract.
+    //
+    // Risk: code that keyed on the exact OpaqueValueExpr* for a dump-struct
+    // binding could theoretically desynchronize if we replace the node. No
+    // such registry is known; we only allocate a replacement when the
+    // transformed source is a different Expr*.
+    //
+    // Note: DenseMap may rehash on insert; do not hold iterators/pointers
+    // across TransformExpr(Src), since nested OVEs may insert into this map.
+    llvm::DenseMap<OpaqueValueExpr *, OpaqueValueExpr *> TransformedOpaqueValues;
+
+    ExprResult TransformOpaqueValueExpr(OpaqueValueExpr *E) {
+      if (auto It = TransformedOpaqueValues.find(E);
+          It != TransformedOpaqueValues.end())
+        return It->second;
+
+      // Provisional self-mapping breaks cycles if SourceExpr reaches this OVE.
+      TransformedOpaqueValues[E] = E;
+
+      Expr *Src = E->getSourceExpr();
+      if (!Src)
+        return E;
+
+      ExprResult NewSrc = getDerived().TransformExpr(Src);
+      if (NewSrc.isInvalid()) {
+        TransformedOpaqueValues.erase(E);
+        return ExprError();
+      }
+
+      if (NewSrc.get() == Src)
+        return E;
+
+      auto *Result = new (SemaRef.getASTContext())
+          OpaqueValueExpr(E->getLocation(), E->getType(), E->getValueKind(),
+                          E->getObjectKind(), NewSrc.get());
+      TransformedOpaqueValues[E] = Result;
+      return Result;
+    }
+
     bool AlwaysRebuild() { return false; }
     bool ReplacingOriginal() { return true; }
     bool AllowSkippingCXXConstructExpr() {
diff --git a/clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp b/clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp
new file mode 100644
index 0000000000000..bda7cb1a12dd1
--- /dev/null
+++ b/clang/test/SemaCXX/builtin-dump-struct-immediate-clean.cpp
@@ -0,0 +1,29 @@
+// RUN: %clang_cc1 -std=c++2c -fsyntax-only -Wno-unused -verify %s
+// RUN: %clang_cc1 -std=c++23 -fsyntax-only -Wno-unused -verify %s
+
+// expected-no-diagnostics
+
+// GH192846: minimal well-formed case - immediate-escalated printing function
+// with __builtin_dump_struct must not crash in ComplexRemove. See sibling
+// builtin-dump-struct-immediate.cpp for C++26 vs language-mode notes.
+
+#if !__has_builtin(__builtin_is_within_lifetime)
+#error "test requires __builtin_is_within_lifetime"
+#endif
+
+struct S {};
+
+template <typename... T>
+constexpr void F(S &out, const char *fmt, T... args) {
+  (void)__builtin_is_within_lifetime(&out);
+}
+
+template <class T>
+class C {
+  T value = {};
+};
+
+void bar() {
+  S s;
+  __builtin_dump_struct((C<int> *)nullptr, F, s);
+}
diff --git a/clang/test/SemaCXX/builtin-dump-struct-immediate.cpp b/clang/test/SemaCXX/builtin-dump-struct-immediate.cpp
new file mode 100644
index 0000000000000..f094e184178c1
--- /dev/null
+++ b/clang/test/SemaCXX/builtin-dump-struct-immediate.cpp
@@ -0,0 +1,63 @@
+// RUN: %clang_cc1 -std=c++23 -fsyntax-only -Wno-unused %s -DVALID_CASE
+// RUN: %clang_cc1 -std=c++2c -fsyntax-only -Wno-unused %s -DVALID_CASE
+// RUN: %clang_cc1 -std=c++23 -fsyntax-only -Wno-unused -verify %s
+// RUN: %clang_cc1 -std=c++2c -fsyntax-only -Wno-unused -verify %s
+
+// Source: GH192846 reported by @k-arrows with reproducer: https://godbolt.org/z/9aa43GE1a
+//
+// P2641R4 proposes std::is_within_lifetime(const T*) (consteval) for C++26 -
+// see https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2641r4.html
+// ("active union member" generalized to "within lifetime" for constexpr).
+// Clang: __builtin_is_within_lifetime (immediate builtin). The crash fixed here
+// is in Sema immediate-invocation handling, not in a particular -std mode, so
+// we run under both C++23 and C++2c.
+
+// Regression test for GH192846: ComplexRemove must traverse OpaqueValueExpr
+// sources (e.g. from __builtin_dump_struct) when stripping nested immediate
+// invocations; the default TreeTransform hook would assert.
+
+#if !__has_builtin(__builtin_is_within_lifetime)
+#error "test requires __builtin_is_within_lifetime"
+#endif
+
+#ifdef VALID_CASE
+
+// dump_struct may manifestly-constant-evaluate the printing callback; the
+// consteval builtin must only read objects usable in constant expressions.
+constexpr int foo = 0;
+
+template <typename T>
+struct C {
+  int n;
+};
+
+// The dump_struct machinery invokes this with a synthesized argument list; it
+// must not be a template (template arguments are not deduced from that call).
+void F(const char *fmt, ...) {
+  (void)__builtin_is_within_lifetime(&foo);
+}
+
+void bar() {
+  constexpr char s[] = "";
+  __builtin_dump_struct((C<int> *)nullptr, F, s);
+}
+
+#else
+
+template <typename T>
+struct C {
+  int n;
+};
+
+void F(const char *fmt, ...) {
+  // Error recovery can report both the bad identifier and an invalid use of the
+  // consteval builtin on the same expression.
+  (void)__builtin_is_within_lifetime(&foo); // expected-error {{use of undeclared identifier 'foo'}} expected-error {{cannot take address of consteval function '__builtin_is_within_lifetime' outside of an immediate invocation}}
+}
+
+void bar() {
+  constexpr char s[] = "";
+  __builtin_dump_struct((C<int> *)nullptr, F, s);
+}
+
+#endif



More information about the cfe-commits mailing list