[Openmp-commits] [PATCH] D88185: [OpenMP] cmake option LIBOMPTARGET_NVPTX_MAX_SM for nvptx device RTL

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Sep 23 18:03:00 PDT 2020


JonChesterfield added inline comments.


================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:62
+// GA102 design has a maximum of 84 SMs
+#define MAX_SM 108
+#elif __CUDA_ARCH__ >= 700
----------------
ye-luo wrote:
> JonChesterfield wrote:
> > Can we distinguish between GA100 and GA102? This structure is large so oversizing wastes significant memory.
> GA100 is __CUDA_ARCH__ 800. GA102 is 860.
> There are also 700, 720, and 750.
> I don't see a need for finer resolution, because LIBOMPTARGET_NVPTX_MAX_SM can be used instead.
It could matter to someone with a GA102 who hasn't read the CMake options. Back-of-envelope math suggests that sizing for 108 SMs instead of 84 leaves a little under a gigabyte of memory allocated but unused.
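
To make the suggestion concrete, here is a minimal sketch of the finer-grained selection, written as a constexpr function so it can be checked on a host compiler. The helper name maxSmForArch and the -1 "not covered" result are hypothetical; the values 84, 108, 800 and 860 are taken from the diff and the discussion above. Branches for older architectures are not visible in the quoted hunk, so they are deliberately left out rather than guessed.

```cpp
// Hypothetical helper mirroring an #if/#elif chain on __CUDA_ARCH__.
// Only the Ampere values mentioned in this review are filled in.
constexpr int maxSmForArch(int cudaArch) {
  return cudaArch == 860 ? 84    // GA102: 84 SMs, per the comment in the diff
       : cudaArch >= 800 ? 108   // GA100, and the default for other 8xx+ parts
       : -1;                     // older branches: outside the quoted hunk
}

// Number of SM slots that would be oversized on a GA102 if 108 were used.
constexpr int extraSmSlotsOnGA102 = maxSmForArch(800) - maxSmForArch(860);

static_assert(maxSmForArch(860) == 84, "GA102 gets the smaller bound");
static_assert(maxSmForArch(800) == 108, "GA100 keeps the larger bound");
static_assert(extraSmSlotsOnGA102 == 24, "24 slots separate the two bounds");
```

Matching 860 exactly, rather than >= 860, keeps the larger bound as the default for any later architecture, which seems the safer failure mode. If the estimate of a little under a gigabyte for those 24 slots holds, each slot would be on the order of 40 MB.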


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88185/new/

https://reviews.llvm.org/D88185


