[PATCH] D146973: [Clang] Implicitly include LLVM libc headers for the GPU

Mon Mar 27 17:52:59 PDT 2023

jhuber6 added a comment.

In D146973#4225983 <https://reviews.llvm.org/D146973#4225983>, @jdoerfert wrote:

> I said this before, many times:
>
> We don't want to have different host and device libraries that are incompatible.
> Effectively, what we really want, is the host environment to just work on the GPU.
> That includes extensions in the host headers, macros, taking the address of stuff, etc.
> This became clear when we made (c)math.h available on the GPU (for OpenMP).

The problem is that we cannot control the system headers, and they are not expected to work with `llvm-libc`. For example, the GNU `ctype.h` includes `features.h`, which will attempt to include the 32-bit stubs file because the GPU is not a recognized target on the host. If you work around that, like we do in OpenMP, then you will realize that `isalnum` is actually a macro that expands to `__isctype`, which references an external table called `__ctype_b_loc` that isn't defined in the C standard. So now we have a header that causes `isalnum` to no longer call the implementation in LLVM's `libc`, and it also fails at link time because there is no definition of `__ctype_b_loc` in LLVM's `libc`. What is the solution here? Do we implement `libc` in LLVM with a workaround for every internal implementation detail in the GNU `libc`?
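
To make the `isalnum` failure mode concrete, here is a minimal sketch, assuming a table-driven classification macro similar in shape to the GNU one. Every `model_` / `MODEL_` name is hypothetical and is not a glibc symbol; the real header differs in detail.

```c
#include <assert.h>

/* Hypothetical classification bit; not the value any real libc uses. */
enum { MODEL_ALNUM = 1 };

/* Stand-in for the accessor a table-driven libc defines inside the
   library. A libc that implements isalnum as a plain function has no
   such symbol, so code compiled against this header cannot link to it.
   The ranges below assume an ASCII-compatible character set. */
static const unsigned short *model_ctype_table(void) {
  static unsigned short table[256];
  static int initialized = 0;
  if (!initialized) {
    for (int c = '0'; c <= '9'; ++c) table[c] = MODEL_ALNUM;
    for (int c = 'a'; c <= 'z'; ++c) table[c] = MODEL_ALNUM;
    for (int c = 'A'; c <= 'Z'; ++c) table[c] = MODEL_ALNUM;
    initialized = 1;
  }
  return table;
}

/* The header-level macro: every use expands to a table lookup, so no
   call to a function named model_isalnum is ever emitted. */
#define model_isalnum(c) \
  (model_ctype_table()[(unsigned char)(c)] & MODEL_ALNUM)

/* Small self-check of the sketch. */
static void model_selfcheck(void) {
  assert(model_isalnum('a'));
  assert(model_isalnum('7'));
  assert(!model_isalnum('!'));
}
```

The only symbol the user's object file needs here is the table accessor, which is an implementation detail of whichever library shipped the header.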

> For most of libc, we might get away with custom GPU headers but eventually it will break "expected to work" user code, at the latest when we arrive at libc++.
> A user can, right now, map a std::vector from the host to the device, and, assuming they properly did the deep copy, it will work.
> If we end up with different sizes, alignments, layouts, this will not only break, but we will also break any structure that depends on those sizes, e.g., mapping an object with a std::map inside even if it is not accessed will cause problems.
>
> In addition, systems are never "vanilla". We want to include the system headers to get the extensions users might rely on. Providing only alternative headers even breaks working code (in the OpenMP case), e.g., when we auto-translate definitions in the header to the device (not a CUDA thing, I think).

Using custom generated headers is the only approach that is guaranteed to actually work when we compile this. We cannot sanely implement a library using headers unique to another implementation targeting an entirely different machine; we will endlessly be chasing implementation details like the one above. This works in OpenMP currently only because we've chosen a handful of headers for which this doesn't completely break.

> I strongly suggest to include our GPU headers first, in them we setup the overlays for the system headers, and then we include the system versions.
> This works for (c)math.h, complex, and other parts of libc and libc++ already, even though we don't ship them as libraries.

The wrapper approach works fine for the ones we've selected. In the GPU `libc` we could also generate our own headers that have `#include_next` in them, provided we verify that it works for that header. In general, though, I think we need to start with custom headers and implement a set of features we know to work.
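
As a rough illustration of what such a wrapper could look like for a header that has been verified, here is a sketch, assuming the wrapper directory is searched before the system include directory. The guard name and `wrapper_selfcheck` are hypothetical, and this is not what the patch generates. `#include_next` is a GNU extension that Clang also supports, and the `#undef` relies on the C standard's guarantee that a library function masked by a macro is still available as a real function.

```c
/* ---- wrapper <ctype.h>, placed ahead of the system directory ---- */
#ifndef MODEL_GPU_WRAPPER_CTYPE_H
#define MODEL_GPU_WRAPPER_CTYPE_H

/* Forward to the next <ctype.h> on the include search path. */
#include_next <ctype.h>

/* Drop the function-like macro, if any, so that calls bind to the
   isalnum function symbol instead of an implementation-specific table. */
#undef isalnum

#endif /* MODEL_GPU_WRAPPER_CTYPE_H */

/* ---- a translation unit that includes the wrapper ---- */
#include <assert.h>

static void wrapper_selfcheck(void) {
  /* These are now ordinary function calls. */
  assert(isalnum('a'));
  assert(!isalnum('!'));
}
```

Compiling both halves as a single file makes the compiler warn that `#include_next` appears in the primary source file. In a real layout the first half would be its own header. This only helps for headers that preprocess cleanly for the GPU target in the first place, which is exactly what has to be verified per header.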

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D146973