[PATCH] D42800: Let CUDA toolchain support amdgpu target

Thu Feb 1 18:42:18 PST 2018

gregrodgers requested changes to this revision.
gregrodgers added a comment.
This revision now requires changes to proceed.

Thanks to everyone for the reviews. I hope I replied to all inline comments. Since I sent this to Sam to post, we have discovered a major shortcoming. As tra points out, a lot of CUDA headers in the CUDA SDK are processed. We are able to override asm() expansions with #undef and redefine them as an equivalent amdgpu component, so the compiler never sees the asm(). I am sure we will need to add more redefines as we broaden our testing. But that is not the big problem. We would like to be able to run cudaclang for AMD GPUs without an install of CUDA. Of course, you must always install CUDA if any of your targeted GPUs are NVidia GPUs. To run cudaclang without CUDA when only non-NVidia GPUs are specified, we need an open set of headers and we must replace the fatbin tools used in the toolchain. The latter can be addressed by using the libomptarget methods for embedding multiple target GPU objects. The former is going to take a lot of work. I am going to send an updated patch that has the stubs for the open headers noted in __clang_cuda_runtime_wrapper.h. They will be included with the CC1 flag -D__USE_OPEN_HEADERS__. This flag will be generated by the cuda driver when it finds no CUDA installation and none of the target GPUs are NVidia.
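
To make the intended driver behavior concrete, here is a rough sketch of the decision described above. It is not code from the patch: the function names, the "sm_" prefix test for NVidia targets, and the handling of an empty target list are all assumptions made for illustration. Only the flag name -D__USE_OPEN_HEADERS__ comes from the comment above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (assumption): treat any arch name starting with "sm_"
// as an NVidia target. Everything else is considered non-NVidia.
bool isNvidiaArch(const std::string &arch) {
  return arch.rfind("sm_", 0) == 0;
}

// Hypothetical sketch of the driver-side decision: return the extra CC1
// arguments to pass, given whether a CUDA installation was found and the
// list of requested GPU targets.
std::vector<std::string>
openHeaderCC1Args(bool cudaInstallFound,
                  const std::vector<std::string> &gpuArchs) {
  std::vector<std::string> args;

  // With a CUDA installation present, the SDK headers are used as usual.
  if (cudaInstallFound)
    return args;

  // Assumption: with no explicit targets the default target is NVidia,
  // so the open headers are not selected.
  if (gpuArchs.empty())
    return args;

  // Any NVidia target still requires the CUDA SDK, so no open headers.
  for (const std::string &arch : gpuArchs)
    if (isNvidiaArch(arch))
      return args;

  // No CUDA installation and only non-NVidia targets: use the open headers.
  args.push_back("-D__USE_OPEN_HEADERS__");
  return args;
}
```

For example, under these assumptions a compile that targets only gfx803 on a machine without CUDA would get the flag, while adding sm_35 to the target list or installing CUDA would suppress it.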

https://reviews.llvm.org/D42800