[Openmp-commits] [PATCH] D55440: [OPENMP][NVPTX]Enable fast shuffles on 64bit values only if CUDA >= 9.

Alexey Bataev via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Mon Dec 10 06:32:11 PST 2018


This revision was automatically updated to reflect the committed changes.
Closed by commit rOMP348758: [OPENMP][NVPTX]Enable fast shuffles on 64bit values only if CUDA >= 9. (authored by ABataev, committed by ).

Changed prior to commit:
  https://reviews.llvm.org/D55440?vs=177233&id=177500#toc

Repository:
  rOMP OpenMP

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55440/new/

https://reviews.llvm.org/D55440

Files:
  libomptarget/deviceRTLs/nvptx/src/reduction.cu


Index: libomptarget/deviceRTLs/nvptx/src/reduction.cu
===================================================================
--- libomptarget/deviceRTLs/nvptx/src/reduction.cu
+++ libomptarget/deviceRTLs/nvptx/src/reduction.cu
@@ -76,7 +76,17 @@
 }
 
 EXTERN int64_t __kmpc_shuffle_int64(int64_t val, int16_t delta, int16_t size) {
-  return __SHFL_DOWN_SYNC(0xFFFFFFFFFFFFFFFFL, val, delta, size);
+#if defined(CUDART_VERSION) && CUDART_VERSION >= 9000
+  return __SHFL_DOWN_SYNC(0xFFFFFFFFFFFFFFFFLL, (long long)val, (unsigned)delta,
+                          (int)size);
+#else
+   int lo, hi;
+   asm volatile("mov.b64 {%0,%1}, %2;" : "=r"(lo), "=r"(hi) : "l"(val));
+   hi = __SHFL_DOWN_SYNC(0xFFFFFFFF, hi, delta, size);
+   lo = __SHFL_DOWN_SYNC(0xFFFFFFFF, lo, delta, size);
+   asm volatile("mov.b64 %0, {%1,%2};" : "=l"(val) : "r"(lo), "r"(hi));
+   return val;
+#endif
 }
 
 static INLINE void gpu_regular_warp_reduce(void *reduce_data,
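
The pre-CUDA-9 branch of the patch splits the 64-bit value into two 32-bit halves, shuffles each half separately, and reassembles the result. The following is a minimal host-side sketch of that idea in plain C++, so it can be run without a GPU. It is not the runtime's code: the names kWarpSize, split64, join64, shuffle_down32 and shuffle_down64 are hypothetical, the warp is modeled as a 32-element array, and the participation mask is ignored. The shuffle model assumes the documented shuffle-down behaviour, where a lane whose source would fall outside its width-sized segment keeps its own value.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr int kWarpSize = 32;  // assumption: 32 lanes per warp

using Lanes32 = std::array<uint32_t, kWarpSize>;
using Lanes64 = std::array<int64_t, kWarpSize>;

// Split a 64-bit value into low and high 32-bit halves
// (the role of "mov.b64 {lo,hi}, val" in the patch).
void split64(int64_t v, uint32_t &lo, uint32_t &hi) {
  uint64_t bits;
  std::memcpy(&bits, &v, sizeof bits);
  lo = static_cast<uint32_t>(bits);
  hi = static_cast<uint32_t>(bits >> 32);
}

// Reassemble the halves (the role of "mov.b64 val, {lo,hi}").
int64_t join64(uint32_t lo, uint32_t hi) {
  uint64_t bits = (static_cast<uint64_t>(hi) << 32) | lo;
  int64_t v;
  std::memcpy(&v, &bits, sizeof v);
  return v;
}

// Host model of a 32-bit shuffle-down: lane i reads lane i + delta,
// unless that source lies outside lane i's width-sized segment, in
// which case lane i keeps its own value.
Lanes32 shuffle_down32(const Lanes32 &in, unsigned delta, int width) {
  assert(width > 0 && width <= kWarpSize && (width & (width - 1)) == 0);
  Lanes32 out{};
  for (int lane = 0; lane < kWarpSize; ++lane) {
    bool in_segment = (lane % width) + static_cast<int>(delta) < width;
    out[lane] = in_segment ? in[lane + delta] : in[lane];
  }
  return out;
}

// 64-bit shuffle-down built from two 32-bit shuffles, mirroring the
// structure of the #else branch in the patch.
Lanes64 shuffle_down64(const Lanes64 &vals, unsigned delta, int width) {
  Lanes32 lo{}, hi{};
  for (int i = 0; i < kWarpSize; ++i) split64(vals[i], lo[i], hi[i]);
  hi = shuffle_down32(hi, delta, width);
  lo = shuffle_down32(lo, delta, width);
  Lanes64 out{};
  for (int i = 0; i < kWarpSize; ++i) out[i] = join64(lo[i], hi[i]);
  return out;
}
```

Because both halves are moved by the same delta within the same segment, each lane ends up with the two halves of a single source value, so the recombined 64-bit result matches what a native 64-bit shuffle would return under this model.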



