[PATCH] D140433: [Clang] Add `nvptx-arch` tool to query installed NVIDIA GPUs

Tue Jan 3 16:55:23 PST 2023

tra added inline comments.

================
Comment at: clang/tools/nvptx-arch/NVPTXArch.cpp:34
+  const char *ErrStr = nullptr;
+  CUresult Result = cuGetErrorString(Err, &ErrStr);
+  if (Result != CUDA_SUCCESS)
----------------
jhuber6 wrote:
> tra wrote:
> > One problem with this approach is that `nvptx-arch` will fail to run on a machine without NVIDIA drivers installed, because the dynamic linker will not find `libcuda.so.1`.
> > 
> > Ideally we want it to run on any machine and fail the way we want.
> > 
> > A typical way to achieve that is to `dlopen("libcuda.so.1")` and obtain pointers to the functions we need via `dlsym()`.
> > 
> > 
> We do this in the OpenMP runtime. I mostly copied this approach from the existing `amdgpu-arch` but we could change both to use this method.
An alternative would be to enumerate GPUs using the CUDA runtime API and link statically with `libcudart_static.a`.

The CUDA runtime takes care of finding `libcuda.so` and returns an error if it cannot, so you do not need to mess with `dlopen`, etc.

For example, the `deviceQuery` sample could be used as a base:
https://github.com/NVIDIA/cuda-samples/blob/master/Samples/1_Utilities/deviceQuery/deviceQuery.cpp
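
For comparison, the `dlopen`/`dlsym` approach quoted above could look roughly like the sketch below. This is only an illustration, not the patch's code: `countDevices` is a hypothetical helper, and the `CUresult`/`CUdevice` aliases are local stand-ins that I am assuming match the driver ABI so that `cuda.h` is not needed at build time. Only `cuInit` and `cuDeviceGetCount` are looked up.

```cpp
#include <dlfcn.h>
#include <string>

// Local stand-ins for the driver API types (assumption: they match the CUDA
// driver ABI, where CUresult is an enum with CUDA_SUCCESS == 0 and CUdevice
// is an int). This avoids a build-time dependency on cuda.h.
using CUresult = int;
using CUdevice = int;
using cuInit_t = CUresult (*)(unsigned int);
using cuDeviceGetCount_t = CUresult (*)(int *);

// Hypothetical helper: returns the number of CUDA devices, or -1 with a
// message in Err if the library, a symbol, or a driver call fails.
int countDevices(const char *LibName, std::string &Err) {
  void *Handle = dlopen(LibName, RTLD_NOW | RTLD_LOCAL);
  if (!Handle) {
    // No driver installed: report it ourselves instead of letting the
    // dynamic linker abort the program at startup.
    const char *Msg = dlerror();
    Err = Msg ? Msg : "dlopen failed";
    return -1;
  }

  auto Init = reinterpret_cast<cuInit_t>(dlsym(Handle, "cuInit"));
  auto GetCount =
      reinterpret_cast<cuDeviceGetCount_t>(dlsym(Handle, "cuDeviceGetCount"));

  int Count = -1;
  if (!Init || !GetCount) {
    Err = "required driver symbols not found";
  } else if (Init(0) != 0) {
    Err = "cuInit failed";
  } else if (GetCount(&Count) != 0) {
    Err = "cuDeviceGetCount failed";
    Count = -1;
  }

  dlclose(Handle);
  return Count;
}
```

A caller would pass `"libcuda.so.1"` as `LibName` and, on `-1`, print `Err` and exit with whatever status the tool wants. That is the "fail the way we want" behaviour, at the cost of maintaining the function-pointer declarations by hand.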

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D140433