[PATCH] D101595: [Clang][OpenMP] Allow unified_shared_memory for Pascal-generation GPUs.

Thu Apr 29 21:32:36 PDT 2021

Meinersbur created this revision.
Meinersbur added reviewers: ye-luo, JonChesterfield, ABataev, patricklyster, kkwli0.
Meinersbur added projects: OpenMP, clang.
Herald added subscribers: guansong, yaxunl.
Meinersbur requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added a subscriber: sstefan1.

The Pascal architecture supports the page migration engine required for unified_shared_memory, as indicated by NVIDIA:

- https://developer.nvidia.com/blog/unified-memory-cuda-beginners/
- https://developer.nvidia.com/blog/beyond-gpu-memory-limits-unified-memory-pascal/
- https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements

The limitation was introduced in D54493 <https://reviews.llvm.org/D54493>, which justified the cut-off by the requirement for unified addressing. However, Unified Virtual Addressing (UVA) has been available since sm_20 (Fermi, Kepler, Maxwell) <https://docs.nvidia.com/cuda/gpudirect-rdma/index.html#basics-of-uva-cuda-memory-management>. Unified shared memory might even be possible with these, but with migration of entire allocations on kernel startup.

To be sure, I enabled the tests for a Pascal GPU, and they finish successfully <http://meinersbur.de:8011/#/builders/143/builds/345>.
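
For illustration only, the cut-off after this change can be sketched as a standalone predicate. This is not the code in CGOpenMPRuntimeGPU.cpp: the `Arch` enum and `supportsUnifiedSharedMemory` are hypothetical names, and only the architectures visible in the diff context are listed.

```cpp
// Hypothetical stand-in for clang's CudaArch; only the values that appear
// in the diff context of this patch are included.
enum class Arch { SM_37, SM_50, SM_52, SM_53,
                  SM_60, SM_61, SM_62,
                  SM_70, SM_72, SM_75 };

// Sketch of the classification after this patch: architectures before
// Pascal are still rejected, while sm_60 and newer are accepted.
constexpr bool supportsUnifiedSharedMemory(Arch A) {
  switch (A) {
  case Arch::SM_37:
  case Arch::SM_50:
  case Arch::SM_52:
  case Arch::SM_53:
    return false; // clang emits an error for these targets
  case Arch::SM_60: // Pascal, newly accepted by this patch
  case Arch::SM_61:
  case Arch::SM_62:
  case Arch::SM_70:
  case Arch::SM_72:
  case Arch::SM_75:
    return true;
  }
  return false; // conservative default for values not listed above
}

// Compile-time checks of the boundary between Maxwell and Pascal.
static_assert(!supportsUnifiedSharedMemory(Arch::SM_53), "sm_53 rejected");
static_assert(supportsUnifiedSharedMemory(Arch::SM_60), "sm_60 accepted");
```

In the actual patch the same effect is achieved by moving the three Pascal case labels out of the error branch and into the fall-through group that starts at SM_70.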


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D101595

Files:
  clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp


Index: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
===================================================================

--- clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -4441,10 +4441,7 @@
       case CudaArch::SM_37:
       case CudaArch::SM_50:
       case CudaArch::SM_52:
-      case CudaArch::SM_53:
-      case CudaArch::SM_60:
-      case CudaArch::SM_61:
-      case CudaArch::SM_62: {
+      case CudaArch::SM_53: {
         SmallString<256> Buffer;
         llvm::raw_svector_ostream Out(Buffer);
         Out << "Target architecture " << CudaArchToString(Arch)
@@ -4452,6 +4449,9 @@
         CGM.Error(Clause->getBeginLoc(), Out.str());
         return;
       }
+      case CudaArch::SM_60:
+      case CudaArch::SM_61:
+      case CudaArch::SM_62:
       case CudaArch::SM_70:
       case CudaArch::SM_72:
       case CudaArch::SM_75:

