[PATCH] D45277: [CUDA] Add amdgpu sub archs
Artem Belevich via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Apr 5 10:10:13 PDT 2018
tra added a comment.
I didn't get a chance to review the patch before it got committed.
================
Comment at: lib/Basic/Targets.cpp:161
+ case CudaArch::GFX902:
+ return "320";
+ case CudaArch::UNKNOWN:
----------------
Unless you're planning to guarantee 1:1 match to functionality provided by nvidia's sm_32, it would be prudent to use some other value for the macro so the source code has a way to tell these GPUs apart.
Another issue with this approach is that typical use pattern for __CUDA_ARCH__ is
`#if __CUDA_ARCH__ >= XXX`. I don't expect that we'll always be able to maintain order across GPU architectures among NVIDIA and AMD GPUs. Perhaps for HIP compilation it would make more sense to define __CUDA_ARCH__ as 1 (this should serve as a legacy indication of device-side compilation) and define __HIP_ARCH__ to indicate which AMD GPU we're compiling for without accidentally enabling something that was intended for NVIDIA's GPUs only.
Repository:
rC Clang
https://reviews.llvm.org/D45277
More information about the cfe-commits
mailing list