[clang] Fix Lambda Mangling in Namespace-Scope Variable Initializers. (PR #159115)

Tue Sep 23 22:20:29 PDT 2025

================
@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -fsycl-is-device -O0 -triple spirv64-unknown-unknown \
+// RUN: -emit-llvm  %s -o - | FileCheck %s --check-prefix=DEVICE
+
+// RUN: %clang_cc1 -fsycl-is-host -O0 -triple spirv64-unknown-unknown \
+// RUN: -emit-llvm  %s -o - | FileCheck %s --check-prefix=HOST
+
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm \
+// RUN: -aux-triple x86_64-pc-windows-msvc -triple spir-unknown--unknown \
+// RUN: %s -o - | FileCheck %s --check-prefix=MSVC
+
+namespace QL {
+  auto dg1 = [] { return 1; };
+  inline auto dg_inline1 = [] { return 1; };
+}
+
+namespace QL {
+  auto dg2 = [] { return 2; };
+  template<int N>
+  auto dg_template = [] { return N; };
+}
+
+using namespace QL;
+template<typename T>
+[[clang::sycl_kernel_entry_point(T)]] void f(T t) {
+  t();
+}
+
+void g() {
+  f(dg1);
+  f(dg2);
+  f(dg_inline1);
+  f(dg_template<3>);
+}
+
+// HOST: @_ZN2QL3dg1E = internal global %class.anon undef, align 1
+// HOST: @_ZN2QL3dg2E = internal global %class.anon.0 undef, align 1
+// HOST: @_ZN2QL10dg_inline1E = linkonce_odr global %class.anon.2 undef, comdat, align 1
+// HOST: @_ZN2QL11dg_templateILi3EEE = linkonce_odr global %class.anon.4 undef, comdat, align 1
+
+// DEVICE: define spir_kernel void @_ZTSN2QL3dg1MUlvE_E
+// DEVICE: call spir_func noundef i32 @_ZNK2QL3dg1MUlvE_clEv
+// DEVICE: define internal spir_func noundef i32 @_ZNK2QL3dg1MUlvE_clEv
+// DEVICE: define spir_kernel void @_ZTSN2QL3dg2MUlvE_E
+// DEVICE: call spir_func noundef i32 @_ZNK2QL3dg2MUlvE_clEv
+// DEVICE: define internal spir_func noundef i32 @_ZNK2QL3dg2MUlvE_clEv
+// DEVICE: define spir_kernel void @_ZTSN2QL10dg_inline1MUlvE_E
+// DEVICE: call spir_func noundef i32 @_ZNK2QL10dg_inline1MUlvE_clEv
+// DEVICE: define linkonce_odr spir_func noundef i32 @_ZNK2QL10dg_inline1MUlvE_clEv
+// DEVICE: define spir_kernel void @_ZTSN2QL11dg_templateILi3EEMUlvE_E
+// DEVICE: call spir_func noundef i32 @_ZNK2QL11dg_templateILi3EEMUlvE_clEv
+// DEVICE: define linkonce_odr spir_func noundef i32 @_ZNK2QL11dg_templateILi3EEMUlvE_clEv
+
+// HOST: define spir_func void @_Z1gv
+// HOST: call spir_func void @_Z1fIN2QL3dg1MUlvE_EEvT_
+// HOST: call spir_func void @_Z1fIN2QL3dg2MUlvE_EEvT_
+// HOST: call spir_func void @_Z1fIN2QL10dg_inline1MUlvE_EEvT_
+// HOST: call spir_func void @_Z1fIN2QL11dg_templateILi3EEMUlvE_EEvT_
+// HOST: define internal spir_func void @_Z1fIN2QL3dg1MUlvE_EEvT
+// HOST: define internal spir_func void @_Z1fIN2QL3dg2MUlvE_EEvT_
+// HOST: define linkonce_odr spir_func void @_Z1fIN2QL10dg_inline1MUlvE_EEvT_
+// HOST: define linkonce_odr spir_func void @_Z1fIN2QL11dg_templateILi3EEMUlvE_EEvT_
+
+// MSVC: define dso_local spir_kernel void @_ZTSN2QL3dg1MUlvE_E
+// MSVC: call spir_func noundef i32 @_ZNK2QL3dg1MUlvE_clEv
+// MSVC: define internal spir_func noundef i32 @_ZNK2QL3dg1MUlvE_clEv
+// MSVC: define dso_local spir_kernel void @_ZTSN2QL3dg2MUlvE_E
+// MSVC: call spir_func noundef i32 @_ZNK2QL3dg2MUlvE_clEv
+// MSVC: define internal spir_func noundef i32 @_ZNK2QL3dg2MUlvE_clEv
+// MSVC: define dso_local spir_kernel void @_ZTSN2QL10dg_inline1MUlvE_E
+// MSVC: call spir_func noundef i32 @_ZNK2QL10dg_inline1MUlvE_clEv
+// MSVC: define linkonce_odr spir_func noundef i32 @_ZNK2QL10dg_inline1MUlvE_clEv
+// MSVC: define dso_local spir_kernel void @_ZTSN2QL11dg_templateILi3EEMUlvE_E
+// MSVC: call spir_func noundef i32 @_ZNK2QL11dg_templateILi3EEMUlvE_clEv
+// MSVC: define linkonce_odr spir_func noundef i32 @_ZNK2QL11dg_templateILi3EEMUlvE_clEv
----------------
tahonermann wrote:

Thanks @zygoloid!

> That lambda should have internal linkage (the relevant part of the ODR [does not apply to lambdas in default template arguments](https://eel.is/c++draft/basic.def.odr#18)), so that seems fine. Except that Clang fails to actually give the lambda internal linkage, which looks like a bug to me.

I see, ok, we can follow up on that.

> That's correct. Lambdas in default template arguments are not "owned" by the template and so shouldn't allocate mangling numbers within it. Note that their existence can differ across redeclarations in different TUs, so numbering them would mean the same entity could get different manglings in different TUs.

Thank you! This is a very helpful insight that I was missing and failed to pick up from the standard wording.

Assuming I'm reading things correctly, [\[basic.def.odr\]p(16.11)](https://eel.is/c++draft/basic.def.odr#16.11) also makes it clear that uses of such default template arguments result in distinct closure types in the contexts where those uses occur.

The example in [\[basic.def.odr\]p19](https://eel.is/c++draft/basic.def.odr#19) makes it clear that correspondence across TUs is required for default arguments of function parameters in member functions. The preceding text appears to treat default template arguments and function default arguments the same, so I presume the example is applicable to default template arguments as well. I think the example is more subtle than the description suggests though. Where it states, "The definition of `X` can appear in multiple translation units of a valid program; the [lambda-expression](https://eel.is/c++draft/expr.prim.lambda.general#nt:lambda-expression)s defined within the default argument of `X::h` within the definition of `X` denote the same closure type in each translation unit.", I think this is referring only to the use of that lambda within the body of `X::h` as opposed to other hypothetical uses elsewhere outside the class definition. Am I reading that correctly?

Back to the ABI, my take away from this is that discriminators allocated from namespace and class contexts need not correspond across TUs; when correspondence is required, discriminators are allocated from more local declaration contexts (except in cases like those reported in https://github.com/itanium-cxx-abi/cxx-abi/issues/165). If so, then it seems I need not worry (too much) about discriminators being allocated and unused or about the order in which they are allocated in those contexts.

Another example to check what I'm learning; https://godbolt.org/z/M4fY4e3Px:
```
extern int x;

struct S {
  template<auto F = [] { return x; }>
  static inline auto vt = F;

  // GCC:   _ZNK1SUlvE0_clEv
  // Clang: _ZNK1SUlvE_clEv
  // SYCL:  _ZNK1SUlvE0_clEv
  static inline auto sdm = vt<>;
};

// GCC:   _ZZ6usevt0vENKUlvE1_clEv
// Clang: _ZNK1SUlvE0_clEv
// SYCL:  _ZNK1SUlvE1_clEv
inline int usevt0() { return S::vt<>(); }

// GCC:   _ZZ6usevt1vENKUlvE2_clEv
// Clang: _ZNK1SUlvE1_clEv
// SYCL:  _ZNK1SUlvE2_clEv
inline int usevt1() { return S::vt<>(); }

int f() {
  return usevt0() + usevt1() + S::sdm();
}
```

Per your reference to [\[basic.def.odr\]p18](https://eel.is/c++draft/basic.def.odr#18), if `usevt0()` and/or `usevt1()` are defined in multiple TUs, the program is IFNDR because the uses of the lambda in the default template argument do not correspond. However, the same use in the initializer of `S::sdm` does correspond because `S::sdm` is defined within the definition of `S`. From an ABI perspective, the `sdm` case is the only one that matters, but a stable name isn't produced by Clang or GCC.

GCC includes a `<closure-prefix>` corresponding to the point of use of the lambda for `usevt0()` and `usevt1()`, which is good. However, I would have expected `_ZZ6usevt0vENKUlvE_clEv` and `_ZZ6usevt1vENKUlvE_clEv` for those cases; it looks like discriminators might be being allocated from the class context, but the function context is used for naming. I think the use in the initializer of `S::sdm` should result in a name with a `<closure-prefix>` that incorporates `sdm` too.

For Clang, the situation looks much like the last case. Hmm, I think I'm rediscovering #143218.

> We'd previously agreed (on a different github issue, I could probably dig it up if needed) that we want to stop using the `$n` mangling for lambdas in Clang entirely, and always produce manglings that actually match the ABI, can be demangled, etc. We should do that -- which hopefully would mean that there's no divergence between the SYCL and non-SYCL manglings after this PR, at least for ABI-mandated manglings.

That would be great.

> But separately from that, in the` v2` case, the lambda should be mangled with a `<closure-prefix>` naming the inline variable `v2`. Looks like GCC gets this right, and Clang gets it wrong in both modes.

Ok, we can follow up on that.

> What new requirements does SYCL have compared to C++ here? It looks like at a minimum we effectively want to treat all entities as visible-across-TUs for lambda mangling purposes, but as I noted above, we want to do that anyway. And those manglings should presumably exactly match the Itanium ones.

That is correct. The extra requirements for CUDA/HIP/SYCL stem from the need for types used in kernel invocations to correspond across the host target TU and each of the device target TUs, even for types that don't have external linkage. [SYCL issue 454](https://github.com/KhronosGroup/SYCL-Docs/issues/454) seeks to clarify what kinds of heroics are required; at least in the face of preprocessor shenanigans.  For SYCL, the correspondence concerns are limited to the type used to name the kernel, the lambda/class type used to define the kernel, and the types of its captures/data members. For CUDA/HIP, I think the correspondence concerns are limited to the types used as kernel arguments. I can provide examples if it would be helpful.

https://github.com/llvm/llvm-project/pull/159115