[PATCH] D42800: Let CUDA toolchain support amdgpu target

Artem Belevich via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Feb 1 10:16:01 PST 2018


tra added a comment.

I don't have enough knowledge about compute on AMD's GPUs, so I would appreciate it if you could share your thoughts on how you think CUDA on AMD should work. Is there a good document describing how compute currently works on AMD GPUs (e.g., how would I launch a kernel using a rough equivalent of NVIDIA's driver API <http://docs.nvidia.com/cuda/cuda-driver-api/index.html>)?
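
For reference, the kind of driver-API flow I have in mind on the NVIDIA side looks roughly like this (a minimal sketch, error handling omitted; the module and kernel names are made up for illustration). What would the AMD equivalent of this flow be?

```
#include <cuda.h>

void launch_axpy(void) {
  CUdevice dev;
  CUcontext ctx;
  CUmodule mod;
  CUfunction fn;

  cuInit(0);
  cuDeviceGet(&dev, 0);
  cuCtxCreate(&ctx, 0, dev);
  cuModuleLoad(&mod, "axpy.cubin");        // load the GPU-side object
  cuModuleGetFunction(&fn, mod, "axpy");   // look up the kernel by name

  float a = 2.0f;
  CUdeviceptr x, y;
  cuMemAlloc(&x, 1024 * sizeof(float));
  cuMemAlloc(&y, 1024 * sizeof(float));
  void *args[] = { &a, &x, &y };

  // 4 blocks x 256 threads, no dynamic shared memory, default stream.
  cuLaunchKernel(fn, 4, 1, 1, 256, 1, 1, 0, NULL, args, NULL);
  cuCtxSynchronize();
}
```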

- Headers. Clang pre-includes *a lot* of headers from NVIDIA's CUDA SDK. Some of them may work for AMD, but some certainly will not -- plenty of them contain NVIDIA-specific inline assembly or rely on NVIDIA-specific functionality (see the first sketch after this list). In the end, I think we'll need some sort of CUDA SDK for AMD that implements the existing CUDA APIs (possibly with asserts for unsupported functions). Or perhaps the plan is to support only CUDA **syntax**, without providing complete API compatibility with NVIDIA.

- How will the GPU-side object file be incorporated into the final executable? I believe OpenMP offloading has a fairly generic way of dealing with this in clang. I'm not sure whether that would be suitable for use with AMD's runtime (whatever we need to use to launch the kernels).

- Launching kernels. Will it be similar to the way kernel launches are configured on NVIDIA, i.e., a grid of blocks of threads with per-block shared memory (see the second sketch below)?
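
To make the headers and kernel-launch points above concrete, here are two sketches (illustrative only; the function names and sizes are made up).

The pre-included CUDA headers are full of device functions along these lines, where the inline assembly is NVIDIA PTX and has no meaning for an AMDGPU target:

```
// Typical pattern in CUDA device headers (paraphrased, not an exact excerpt):
// a wrapper around an NVIDIA-specific PTX special register via inline asm.
__device__ inline unsigned int lane_id(void) {
  unsigned int id;
  asm("mov.u32 %0, %%laneid;" : "=r"(id));  // %laneid exists only in PTX
  return id;
}
```

And this is the launch model I'm asking about -- a grid of blocks of threads, with per-block shared memory sized at launch time:

```
// Minimal runtime-API sketch of an NVIDIA-style launch configuration.
// 'in' and 'out' are assumed to be device pointers.
__global__ void reduce(const float *in, float *out) {
  extern __shared__ float tile[];                // per-block shared memory
  int tid = threadIdx.x;
  tile[tid] = in[blockIdx.x * blockDim.x + tid];
  __syncthreads();
  // ... per-block reduction over tile[] ...
  if (tid == 0) out[blockIdx.x] = tile[0];
}

void run(const float *in, float *out, int n) {
  dim3 block(256);
  dim3 grid((n + block.x - 1) / block.x);
  size_t shmem = block.x * sizeof(float);        // dynamic shared memory per block
  reduce<<<grid, block, shmem>>>(in, out);       // grid / block / shared-memory config
}
```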


https://reviews.llvm.org/D42800




