[Openmp-commits] [PATCH] D88185: [OpenMP] cmake option LIBOMPTARGET_NVPTX_MAX_SM for nvptx device RTL

Ye Luo via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Sep 23 17:11:15 PDT 2020


ye-luo added inline comments.


================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:62
+// GA102 design has a maximum of 84 SMs
+#define MAX_SM 108
+#elif __CUDA_ARCH__ >= 700
----------------
JonChesterfield wrote:
> Can we distinguish between GA100 and GA102? This structure is large so oversizing wastes significant memory.
GA100 is __CUDA_ARCH__ 800; GA102 is 860.
The `>= 700` branch similarly covers 700, 720, and 750.
I don't think finer resolution is necessary, because the LIBOMPTARGET_NVPTX_MAX_SM cmake option can be used to set the value explicitly for a specific GPU.
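
To illustrate the trade-off, here is a minimal host-side sketch of a coarse per-architecture default combined with an explicit override. It is not the code in target_impl.h. `defaultMaxSM` and `effectiveMaxSM` are hypothetical names, the fallback of 16 is a placeholder, and passing the override as a plain integer (0 meaning "not set") is an assumption about how the cmake option might reach the code. Only the 108 for `__CUDA_ARCH__ >= 800` and the 84-SM figure for GA102 come from the patch.

```cpp
// Hypothetical sketch, not the actual device RTL header.

// Coarse default, mirroring the "#if __CUDA_ARCH__ >= 800" bucket in the patch.
// GA100 (800) and GA102 (860) both land here and get 108.
constexpr int defaultMaxSM(int cudaArch) {
  return cudaArch >= 800 ? 108   // value taken from the patch
                         : 16;   // placeholder for older architectures (assumption)
}

// An explicit override (standing in for LIBOMPTARGET_NVPTX_MAX_SM) wins when set.
// overrideMaxSM == 0 means "option not provided" in this sketch.
constexpr int effectiveMaxSM(int cudaArch, int overrideMaxSM) {
  return overrideMaxSM > 0 ? overrideMaxSM : defaultMaxSM(cudaArch);
}

// Without an override, a GA102 build is sized for 108 SMs.
static_assert(effectiveMaxSM(860, 0) == 108, "coarse default applies");
// With the override set to GA102's maximum, the structure is sized for 84 SMs.
static_assert(effectiveMaxSM(860, 84) == 84, "override takes precedence");
```

With this shape, a user building for GA102 only would set the option to 84 to avoid the oversizing, without the header needing a separate `>= 860` branch.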


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88185/new/

https://reviews.llvm.org/D88185


