[Openmp-commits] [PATCH] D65112: [OPENMP][NVPTX]Make the test compatible with CUDA9+, NFC.

Alexey Bataev via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Mon Jul 22 13:50:51 PDT 2019


ABataev created this revision.
ABataev added a reviewer: grokos.
Herald added subscribers: jdoerfert, guansong.
Herald added a project: OpenMP.

Due to architectural changes, the smallest group of threads that can be
executed separately without undefined behavior on CUDA9+ is a warp, i.e.
the whole warp must execute in convergence. Otherwise, the runtime
either produces incorrect results or (in most cases) simply hangs because
of the synchronous nature of the new shuffle, ballot, warpsync, etc.
functions.
Reworked the test to match these limitations so that it works on CUDA9+
(critical regions require blocks of at least warp size). The code of
the critical sections must also be fixed before the test finally works on
CUDA9+ (currently it still does not work correctly; some extra changes to
the runtime and compiler are required).
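
For reference, the arithmetic behind the new expected value can be sketched on the host. The helper below is hypothetical and not part of the patch; the name expectedCount and its parameters are illustrative only. It assumes that, with schedule(static, 1) and at least as many threads as outer iterations, iteration J runs on thread J, so the branch on omp_get_thread_num() splits the iterations at the warp boundary.

```cpp
// Hypothetical host-side model of the Count reduction in the reworked test.
// Assumption: outer iteration j is executed by thread j (schedule(static, 1)
// with at least outerIters threads), so "thread_num > 31" means j >= 32.
int expectedCount(int outerIters, int firstNestedThread, int innerIters) {
  int count = 0;
  for (int j = 0; j < outerIters; ++j) {
    if (j >= firstNestedThread) {
      // Nested parallel region: each inner iteration adds omp_get_level() == 2.
      count += innerIters * 2;
    } else {
      // Outer level only: omp_get_level() == 1.
      count += 1;
    }
  }
  return count;
}
```

With the new bounds, expectedCount(64, 32, 10) gives (64-32)*10*2 + 32*1 = 672; with the old bounds, expectedCount(10, 6, 10) gives (10-6)*10*2 + 6*1 = 86.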


Repository:
  rOMP OpenMP

https://reviews.llvm.org/D65112

Files:
  libomptarget/deviceRTLs/nvptx/test/parallel/spmd_parallel_regions.cpp


Index: libomptarget/deviceRTLs/nvptx/test/parallel/spmd_parallel_regions.cpp
===================================================================
--- libomptarget/deviceRTLs/nvptx/test/parallel/spmd_parallel_regions.cpp
+++ libomptarget/deviceRTLs/nvptx/test/parallel/spmd_parallel_regions.cpp
@@ -10,7 +10,7 @@
 
 #pragma omp target parallel for map(tofrom                                     \
                                     : isHost, ParallelLevel1, ParallelLevel2), reduction(+: Count) schedule(static, 1)
-  for (int J = 0; J < 10; ++J) {
+  for (int J = 0; J < 64; ++J) {
 #pragma omp critical
     {
       isHost = (isHost < 0 || isHost == 0) ? omp_is_initial_device() : isHost;
@@ -18,18 +18,18 @@
                            ? omp_get_level()
                            : ParallelLevel1;
     }
-    if (omp_get_thread_num() > 5) {
+    if (omp_get_thread_num() > 31) {
       int L2;
 #pragma omp parallel for schedule(dynamic) lastprivate(L2) reduction(+: Count)
       for (int I = 0; I < 10; ++I) {
         L2 = omp_get_level();
-        Count += omp_get_level(); // (10-6)*10*2 = 80
+        Count += omp_get_level(); // (64-32)*10*2 = 640
       }
 #pragma omp critical
       ParallelLevel2 =
           (ParallelLevel2 < 0 || ParallelLevel2 == 2) ? L2 : ParallelLevel2;
     } else {
-      Count += omp_get_level(); // 6 * 1 = 6
+      Count += omp_get_level(); // 32 * 1 = 32
     }
   }
 
@@ -42,9 +42,9 @@
   // CHECK: Parallel level in SPMD mode: L1 is 1, L2 is 2
   printf("Parallel level in SPMD mode: L1 is %d, L2 is %d\n", ParallelLevel1,
          ParallelLevel2);
-  // Final result of Count is (10-6)(num of loops)*10(num of iterations)*2(par
-  // level) + 6(num of iterations) * 1(par level)
-  // CHECK: Expected count = 86
+  // Final result of Count is (64-32)(num of loops)*10(num of iterations)*2(par
+  // level) + 32(num of iterations) * 1(par level)
+  // CHECK: Expected count = 672
   printf("Expected count = %d\n", Count);
 
   return isHost;



