[PATCH] D84743: [Clang][AMDGCN] Universal device offloading macros header
Artem Belevich via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Tue Jul 28 11:11:11 PDT 2020
tra added a comment.
I'm not sure it's particularly useful, to be honest. CUDA code still needs to be compatible with NVCC so it can't be used in portable code like TF or other currently used CUDA libraries.
It could be useful internally, though, so I'm fine with it for that purpose.
================
Comment at: clang/lib/Headers/offload_macros.h:28
+#undef _DEVICE_GPU
+#undef _DEVICE_ARCH
+
----------------
The naming seems to conflict with the current notation that GPU `arch` is the specific GPU variant. E.g. `gfx900` or `sm_60`.
Perhaps we should use a higher level term. `kind`, `vendor`?
================
Comment at: clang/lib/Headers/offload_macros.h:33
+// So if either set, this is an OpenMP device pass.
+#if defined(__AMDGCN__) || defined(__NVPTX__)
+#if defined(__AMDGCN__)
----------------
I'd just split it into separate `if` sections for AMDGCN and NVPTX. One less nesting level for preprocessor conditionals would be easier to follow.
================
Comment at: clang/lib/Headers/offload_macros.h:35
+#if defined(__AMDGCN__)
+#define _DEVICE_ARCH amdgcn
+// _DEVICE_GPU set below
----------------
What exactly is `amdgcn` and how can it be used in practice? I.e. I can't use it in preprocessor conditionals.
I think you need to have numberic constants defined for the different `ARCH` variants.
================
Comment at: clang/lib/Headers/offload_macros.h:37
+// _DEVICE_GPU set below
+#else
+#define _DEVICE_ARCH nvptx64
----------------
Please add a comment tracking which conditional this `else` is for. E.g. `// __AMDGCN__`
================
Comment at: clang/lib/Headers/offload_macros.h:38
+#else
+#define _DEVICE_ARCH nvptx64
+#define _DEVICE_GPU __CUDA_ARCH__
----------------
Nit -- there's techically 32-bit nvptx, even though it's getting obsolete.
================
Comment at: clang/lib/Headers/offload_macros.h:71
+
+#if defined(_DEVICE_ARCH) && (_DEVICE_ARCH == amdgcn)
+// AMD uses binary macros only, so create a value for _DEVICE_GPU
----------------
This does not work, does it? https://godbolt.org/z/Kn3r4x
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D84743/new/
https://reviews.llvm.org/D84743
More information about the cfe-commits
mailing list