[libcxx-commits] [libcxx] [libc++] Protect the libc++ implementation from CUDA SDK's `__noinline__` macro (PR #73838)

Artem Belevich via libcxx-commits libcxx-commits at lists.llvm.org
Fri Dec 1 10:34:51 PST 2023


Artem-B wrote:

It looks like `__noinline__`, as a macro the way it's defined right now, is going to be problematic.
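To make the conflict concrete, here is a minimal sketch. It assumes the CUDA SDK defines the macro roughly as `__attribute__((noinline))` for GCC-compatible compilers; the `STRINGIFY` helpers and `expanded_attribute` are made-up names used only to make the expansion visible.

```cpp
#include <cassert>
#include <cstring>

// Assumption: this approximates the CUDA SDK's definition for
// GCC-compatible compilers. Standard headers are included above it
// on purpose, so that the macro cannot leak into them.
#define __noinline__ __attribute__((noinline))

// Hypothetical helpers: stringify an argument after macro expansion.
#define STRINGIFY_IMPL(...) #__VA_ARGS__
#define STRINGIFY(...) STRINGIFY_IMPL(__VA_ARGS__)

// A library that spells the attribute with the reserved name writes
//   __attribute__((__noinline__))
// With the macro above in scope, the inner name is itself expanded,
// which nests one __attribute__ inside another. That is not a valid
// attribute list, so such a declaration would fail to compile.
inline const char* expanded_attribute() {
  return STRINGIFY(__attribute__((__noinline__)));
}

// Do not let the sketch's macro affect anything that follows.
#undef __noinline__
```

The returned string shows that the library's own spelling no longer survives preprocessing, which is why every header that uses the attribute needs protection.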

Perhaps the best choice is to permanently undef CUDA's `__noinline__`, or redefine it as `__cuda_noinline__`, and let users deal with that if they want/need to. Otherwise we'll be playing this game of whack-a-`__noinline__`-conflict forever.

`__noinline__` appears to be relatively rarely used. I see a single reference in [pytorch](https://github.com/pytorch/pytorch/blob/32b928e582f90978c6e647eb1f27831506dc1d9b/aten/src/ATen/native/cuda/IGammaKernel.cu#L373) and there are a handful in NVIDIA's [core libraries' tests](https://github.com/search?q=repo%3ANVIDIA%2Fcccl%20__noinline__&type=code). I think getting rid of them should be doable.

At the moment, only a few libc++ headers use `_LIBCPP_NOINLINE`, so we can introduce a temporary build knob which would undef `__noinline__` in the wrappers for `__config` and `string`, but which would need to be explicitly enabled by the user.

So, transition would work roughly like this:
* Temporarily undef `__noinline__` in wrappers for `__config` and `string`.
* Introduce a knob to completely undef `__noinline__` and disable the workarounds. Keep it enabled initially.
* Switch the knob default to off for the next major clang release.
* Remove the knob in the major release after that.
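A rough single-file sketch of how such a wrapper could look, under one reading of the steps above. `HYPOTHETICAL_CUDA_UNDEF_NOINLINE` is a made-up knob name, not an existing Clang or libc++ macro, and `LIB_NOINLINE` and `wrapped_library_function` merely stand in for `_LIBCPP_NOINLINE` and the contents of the wrapped header. The CUDA definition shown is an assumption as well.

```cpp
#include <cassert>

// Assumption: approximates the CUDA SDK's definition.
#define __noinline__ __attribute__((noinline))

// Hypothetical knob, enabled by default during the transition.
#ifndef HYPOTHETICAL_CUDA_UNDEF_NOINLINE
#define HYPOTHETICAL_CUDA_UNDEF_NOINLINE 1
#endif

#if HYPOTHETICAL_CUDA_UNDEF_NOINLINE
// Knob on: drop the CUDA macro for good; no per-header workaround.
#undef __noinline__
#else
// Knob off: hide the macro only around the wrapped header.
#pragma push_macro("__noinline__")
#undef __noinline__
#endif

// Stand-in for `#include_next <__config>` and the headers using it.
#define LIB_NOINLINE __attribute__((__noinline__))
LIB_NOINLINE int wrapped_library_function(int x) { return x + 1; }

#if !HYPOTHETICAL_CUDA_UNDEF_NOINLINE
// Restore the CUDA macro for user code that still relies on it.
#pragma pop_macro("__noinline__")
#endif

// Records whether the CUDA macro is visible after the wrapper.
#ifdef __noinline__
constexpr bool cuda_macro_still_defined = true;
#else
constexpr bool cuda_macro_still_defined = false;
#endif
```

Note that the scoped variant only protects uses of `LIB_NOINLINE` that are expanded between the push and the pop. A header that uses the library macro later, outside a wrapper, would hit the conflict again, which is the whack-a-mole concern.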

If that's too much hassle, we can keep adding the wrappers. That is tedious too, but it would probably work OK, considering that `__noinline__` is not used all that often.

WDYT, all?

https://github.com/llvm/llvm-project/pull/73838

