[PATCH] D146973: [Clang] Implicitly include LLVM libc headers for the GPU

Joseph Huber via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Mar 27 13:28:47 PDT 2023


jhuber6 added inline comments.


================
Comment at: clang/lib/Driver/ToolChains/Clang.cpp:1230
+      llvm::sys::path::append(P, "llvm-libc");
+      CmdArgs.push_back("-c-isystem");
+      CmdArgs.push_back(Args.MakeArgString(P));
----------------
sivachandra wrote:
> tra wrote:
> > sivachandra wrote:
> > > jhuber6 wrote:
> > > > tra wrote:
> > > > > jhuber6 wrote:
> > > > > > tra wrote:
> > > > > > > Ensuring the right include order will be tricky. The interaction between the headers provided by llvm-libc and the system headers will be interesting if we end up mixing them. It may be less problematic than for the C++ standard library, but I doubt that mixing the two implementations would work well here, either.
> > > > > > > 
> > > > > > > So, the major question I have -- does adding the include path here guarantee that we're not picking up the host headers? I do not see any changes that would exclude system headers from the include paths.
> > > > > > > If we do have include paths leading to both llvm-libc and the host headers, what's expected to happen if user code ends up including a header that's not present in llvm-libc? Is that possible?
> > > > > > > 
> > > > > > Right now I'm just relying on the expectation that, since this will be the first `c-isystem` path set, it will pull in these headers first if they exist. It's not captured by the tests, but compiling with `-v` shows this path being searched first in my experience. So, theoretically, if there is an implementation of a given header in this location, it will be picked up before anything else. Otherwise the search just falls through to the other standard locations.
> > > > > I think this will be a problem. We're cross-compiling here, and for that to work reliably we need to make sure that only target headers are in effect. The problem is that we likely just do not have a sufficiently complete set of headers for the GPU. Do we? I have no idea what exactly llvm-libc provides and whether it is sufficient for normal user code to cross-compile for a GPU.
> > > > > 
> > > > > It would be interesting to try to compile some C++ code which includes commonly used, but generally target-agnostic, headers like <vector>, <complex>, <algorithm>, etc., and check whether we end up pulling in any system headers. Then check what happens if we do not have system headers available at all.
> > > > No, it's definitely not complete. Some headers work on the GPU, most break in some way or another. The only ones `llvm-libc` currently provides are `string.h` and `ctype.h`. But I figured this wouldn't be a problem since it would just go to the system headers anyway if we didn't provide them. So we are merely replacing "maybe broken" with "probably works".
> > > > 
> > > > I was talking with Johannes and he brought up other issues about the idea of host-device compatibility between these headers. Fundamentally, right now `libc` generates its own headers and needs to generate its own headers to function. But there can be a problem when migrating data between the host and the device if the headers on the host differ somewhat from those on the device. I'm not sure what a good overall solution to that problem is.
> > > Normally, one should not expect target and host headers to be compatible. So, if you are building for the host, you should use host headers, and if you are building for the target, you should use target headers. Does the general GPU build not follow this approach? Maybe there are some common headers, but I do not expect them to be from the standard libraries.
> > We can generally assume that the GPU and the host do have largely identical types. At least the subset of the types we expect to exchange between host and GPU.
> > CUDA compilation cheats, and allows the host to provide most of the headers, with clang and the CUDA SDK providing a subset of GPU-side overloads. This way, GPU-side functions may implicitly rely on the code nominally provided for the host by the host headers, but when we need to codegen it, we can only do so for a subset that's actually compilable for the GPU -- either constexpr functions, lambdas, or `__device__` overloads provided by us.
> > 
> > Standalone compilation does not have the benefit of having the cake and being able to eat it. It has to be all or nothing, as we do not have the ability to separate the host and GPU code we pull in via headers, nor can we provide GPU-side overloads. In a way, injecting the llvm-libc path is a crude attempt to do that by providing GPU-side implementations before we get to include host-side headers that would provide the host versions. While it may sometimes work, I do not think it's a robust solution.
> > 
> > The only sound(-ish) ways out I see are:
> > - implement a sufficiently complete set of headers for the GPU.
> > - provide GPU-side overloads which would allow us to pull in system headers and augment them with GPU-specific implementations when necessary. 
> > 
> > The former is quite a bit of work. The latter is easier, but gets us away from "it's just a plain C compilation, just for a GPU", as we'll grow yet another C dialect and the libc headers will need to have additional glue to mark functions as "for GPU".
> > 
> > 
> > 
> > We can generally assume that the GPU and the host do have largely identical types. At least the subset of the types we expect to exchange between host and GPU.
> 
> Types used for communication have to be of similar topology, I would think, and putting them in a common header is expected. But would standard library types have to match with the host?
> 
> > CUDA compilation cheats, and allows the host to provide most of the headers, with clang and the CUDA SDK providing a subset of GPU-side overloads. This way, GPU-side functions may implicitly rely on the code nominally provided for the host by the host headers, but when we need to codegen it, we can only do so for a subset that's actually compilable for the GPU -- either constexpr functions, lambdas, or `__device__` overloads provided by us.
> > 
> > In a way, injecting the llvm-libc path is a crude attempt to do that by providing GPU-side implementations before we get to include host-side headers that would provide the host versions. While it may sometimes work, I do not think it's a robust solution.
> 
> Will it still be considered crude if the new path being proposed here contains only standard library headers?
> 
> > The only sound(-ish) ways out I see are:
> > - implement a sufficiently complete set of headers for the GPU.
> 
> What does "sufficiently complete" mean? Are you referring to an incomplete header file or an incomplete set of header files? Assuming it is the latter, is it problematic if we fall back to host headers (or the headers which are currently used for the GPU build)?
> 
> - implement a sufficiently complete set of headers for the GPU.

This is what I was going for: we slowly implement the standard C headers and provide them as a list of functions and interfaces that are expected to work on the GPU. Otherwise we default to the support we already have. Given that LLVM `libc` is already aiming to provide its own complete set of headers, we should be able to reuse a lot of that work to this effect.
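
To make the intended behavior concrete, a minimal sketch from the user's side (the file name, target triple, and compile line are illustrative and not taken from this patch; `strlen` only resolves this way if `string.h` is among the headers we actually ship):

  /* gpu_str.c -- illustrative only.
   * Compiled directly for a GPU target, e.g.:
   *   clang --target=amdgcn-amd-amdhsa -v -c gpu_str.c
   * With this patch, the llvm-libc directory set up above is passed as the
   * first `-c-isystem` path, so <string.h> is expected to resolve to the
   * llvm-libc header when we provide one; a header we do not provide still
   * falls back to the later system search paths. */
  #include <string.h>

  int length(const char *S) {
    return (int)strlen(S); /* assumed to come from the GPU libc */
  }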

> - provide GPU-side overloads which would allow us to pull in system headers and augment them with GPU-specific implementations when necessary.

I don't think this is a robust solution to implementing a `libc`. The problem is that the system headers can pull in implementation details that we don't necessarily want to recreate, e.g. functions or symbols that aren't defined in our `libc` and that we have no way of wrapping if we use the system header. I think the only reasonable way is to keep them completely separate and add some glue to the generated headers for the GPU as needed.
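
For what it's worth, the kind of glue I have in mind looks roughly like the sketch below. None of these macro names exist in the `libc` header generation today; it is purely an illustration of marking declarations so that the same generated header stays plain C but remains usable from CUDA/HIP device compilation:

  /* Hypothetical fragment of a GPU-generated string.h from llvm-libc. */
  #ifndef LLVM_LIBC_GPU_STRING_H
  #define LLVM_LIBC_GPU_STRING_H

  #include <stddef.h>

  /* When this header is pulled into CUDA/HIP device compilation, the
   * declarations must be visible as device functions; for plain C the
   * attribute expands to nothing. */
  #if defined(__CUDA__) || defined(__HIP__)
  #define __LLVM_LIBC_GPU_FN __device__
  #else
  #define __LLVM_LIBC_GPU_FN
  #endif

  __LLVM_LIBC_GPU_FN size_t strlen(const char *__s);
  __LLVM_LIBC_GPU_FN void *memcpy(void *__dst, const void *__src, size_t __n);

  #undef __LLVM_LIBC_GPU_FN
  #endif /* LLVM_LIBC_GPU_STRING_H */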


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146973/new/

https://reviews.llvm.org/D146973


