[Openmp-commits] [PATCH] D46185: [OpenMP] Allow nvptx sm_30 to be used as an offloading device

Guansong Zhang via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Apr 27 06:02:17 PDT 2018

guansong created this revision.
guansong added a reviewer: grokos.
guansong added a project: OpenMP.

Patch by Gregory Rodgers.

It uses a 64-bit atomicCAS to emulate atomicMax on sm_30, where the 64-bit atomicMax is not available.
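
For readers unfamiliar with the idiom, here is a minimal sketch of how an atomic max can be built from a compare-and-swap retry loop. It is an illustration under stated assumptions, not the code in this revision: the names atomicMaxViaCAS, maxKernel and runDemo are hypothetical, and the loop re-reads the value returned by atomicCAS on every failed attempt, which is the commonly used form of the pattern.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical helper (not part of the patch): 64-bit atomic max emulated
// with 64-bit atomicCAS. Returns the value stored at addr before the update,
// mirroring the return convention of atomicMax.
__device__ unsigned long long atomicMaxViaCAS(unsigned long long *addr,
                                              unsigned long long val) {
  unsigned long long old = *addr; // initial guess; refined by atomicCAS below
  while (old < val) {
    unsigned long long assumed = old;
    // atomicCAS stores val only if *addr == assumed, and always returns
    // the value that was at addr before the call.
    old = atomicCAS(addr, assumed, val);
    if (old == assumed)
      break; // our swap went through
    // Otherwise another thread changed *addr; retry with the fresh value.
  }
  return old;
}

// Hypothetical demo kernel: every thread proposes its global index.
__global__ void maxKernel(unsigned long long *result) {
  unsigned long long tid =
      (unsigned long long)blockIdx.x * blockDim.x + threadIdx.x;
  atomicMaxViaCAS(result, tid);
}

// Hypothetical host driver: launches 4 blocks of 256 threads and returns
// the maximum index seen. Error checking is omitted for brevity.
unsigned long long runDemo() {
  unsigned long long *d_result = nullptr;
  unsigned long long h_result = 0;
  cudaMalloc(&d_result, sizeof(h_result));
  cudaMemcpy(d_result, &h_result, sizeof(h_result), cudaMemcpyHostToDevice);
  maxKernel<<<4, 256>>>(d_result);
  cudaMemcpy(&h_result, d_result, sizeof(h_result), cudaMemcpyDeviceToHost);
  cudaFree(d_result);
  return h_result;
}
```

The loop exits as soon as the stored value is already at least val, so a thread that loses the race to a larger value does no further work. A host main can call runDemo() to exercise the helper on a device.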

Repository:
  rOMP OpenMP



Index: libomptarget/deviceRTLs/nvptx/src/loop.cu
--- libomptarget/deviceRTLs/nvptx/src/loop.cu
+++ libomptarget/deviceRTLs/nvptx/src/loop.cu
@@ -757,8 +757,16 @@
     // Atomic max of iterations.
     uint64_t *varArray = (uint64_t *)array;
     uint64_t elem = varArray[i];
+#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
     (void)atomicMax((unsigned long long int *)Buffer,
                     (unsigned long long int)elem);
+#else
+    uint64_t old_value = *Buffer;
+    while (old_value < elem && !atomicCAS((unsigned long long *)Buffer,
+                                          (unsigned long long)old_value,
+                                          (unsigned long long)elem)) {
+    };
+#endif
     // Barrier.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D46185.144320.patch
Type: text/x-patch
Size: 841 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/openmp-commits/attachments/20180427/8a8cea39/attachment.bin>