[all-commits] [llvm/llvm-project] 2bef46: [libc] Add a loader utility for NVPTX architecture...

Fri Mar 24 18:05:00 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 2bef46d2ad872794c83a49f1da12b1b20835f75d
      https://github.com/llvm/llvm-project/commit/2bef46d2ad872794c83a49f1da12b1b20835f75d
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2023-03-24 (Fri, 24 Mar 2023)

  Changed paths:
    M libc/utils/gpu/loader/CMakeLists.txt
    M libc/utils/gpu/loader/Loader.h
    M libc/utils/gpu/loader/amdgpu/Loader.cpp
    A libc/utils/gpu/loader/nvptx/CMakeLists.txt
    A libc/utils/gpu/loader/nvptx/Loader.cpp

  Log Message:
  -----------
  [libc] Add a loader utility for NVPTX architectures for testing

This patch adds `nvptx_loader`, a loader utility that uses the CUDA driver
API to launch NVPTX images. It takes a GPU image on the command line and
launches the `_start` kernel with the appropriate arguments. The `_start`
kernel is provided by the existing `nvptx/start.cpp`, so an application
with a `main` function can be compiled and run as follows.

```
clang++ --target=nvptx64-nvidia-cuda main.cpp crt1.o -march=sm_70 -o image
./nvptx_loader image args to kernel
```

This implementation is not tested and does not yet support RPC. Addressing
that requires further development to work around NVIDIA-specific
limitations in atomics and linking.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D146681

  Commit: 58f5e5e6b00e5dd674d6e37ed651bc996a397cc3
      https://github.com/llvm/llvm-project/commit/58f5e5e6b00e5dd674d6e37ed651bc996a397cc3
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2023-03-24 (Fri, 24 Mar 2023)

  Changed paths:
    M libc/startup/gpu/nvptx/CMakeLists.txt
    M libc/startup/gpu/nvptx/start.cpp
    M libc/utils/gpu/loader/nvptx/Loader.cpp

  Log Message:
  -----------
  [libc] Implement the RPC client / server for NVPTX

This patch adds the code needed to implement the existing RPC client /
server interface when targeting NVPTX GPUs. It closely follows the AMDGPU
implementation. It does not yet enable unit testing because the `nvlink`
linker does not support static libraries, so that will need to be worked
around.

I am ignoring the RPC duplication between the AMDGPU and NVPTX loaders. That
code will be changed completely later, so there's no point unifying it at this
stage. The implementation was tested manually with the following file and
compilation flags.

```
namespace __llvm_libc {
void write_to_stderr(const char *msg);
void quick_exit(int);
} // namespace __llvm_libc

using namespace __llvm_libc;

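int main(int argc, char **argv, char **envp) {
  // Print each argument on its own line.
  for (int i = 0; i < argc; ++i) {
    write_to_stderr(argv[i]);
    write_to_stderr("\n");
  }
  // Exit with a distinctive status so the loader's exit code can be checked.
  quick_exit(255);
}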
```

```
$ clang++ crt1.o rpc_client.o quick_exit.o io.o main.cpp --target=nvptx64-nvidia-cuda -march=sm_70 -o image
$ ./nvptx_loader image 1 2 3
image
1
2
3
$ echo $?
255
```
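
The output above suggests that the kernel's `argv` starts at the image name
rather than at the loader's own name. A minimal host-side sketch of that
mapping follows. `KernelArgs` and `make_kernel_args` are hypothetical names
used only for illustration and are not taken from the patch. The sketch does
not touch the CUDA driver API and does not show how the arguments are copied
to the device.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container (not from the patch): the arguments that the
// device-side `main` is expected to observe.
struct KernelArgs {
  int argc;
  std::vector<std::string> argv;
};

// Hypothetical helper (not from the patch): drop the loader's own name so
// that the GPU image becomes argv[0] for the kernel.
KernelArgs make_kernel_args(int host_argc, const char *const *host_argv) {
  // host_argv[0] is the loader, host_argv[1] is the GPU image.
  assert(host_argc >= 2 && "expected: loader <image> [args...]");
  KernelArgs args;
  args.argc = host_argc - 1;
  args.argv.assign(host_argv + 1, host_argv + host_argc);
  return args;
}
```

For `./nvptx_loader image 1 2 3` this yields an `argc` of 4 and an `argv` of
`image`, `1`, `2`, `3`, which matches the lines printed in the run above.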

Depends on D146681

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D146846

Compare: https://github.com/llvm/llvm-project/compare/583120642694...58f5e5e6b00e