[Openmp-commits] [PATCH] D95752: [OpenMP][DeviceRTL] Extract shuffle idiom and port it to declare variant

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Sat Jan 30 17:43:48 PST 2021


JonChesterfield added a comment.

Comments inline. Not totally sure this is better, code seems longer and more complicated than it was before. Aim is merging the two GPU implementations?



================
Comment at: openmp/libomptarget/deviceRTLs/common/include/shuffle.h:27
+///{
+unsigned GetLaneId();
+unsigned GetWarpSize();
----------------
Extern C? Also, why forward declare instead of include header?


================
Comment at: openmp/libomptarget/deviceRTLs/common/include/shuffle.h:40
+
+inline int32_t __kmpc_impl_shfl_sync(int64_t Mask, int32_t Var,
+                                     int32_t SrcLane) {
----------------
uint64_t for mask. Sign bit is just one of the lanes.


================
Comment at: openmp/libomptarget/deviceRTLs/common/include/shuffle.h:88
+                                     int32_t SrcLane) {
+// In Cuda 9.0, the *_sync() version takes an extra argument 'mask'.
+#if CUDA_VERSION >= 9000
----------------
Seems bad, both because it's a macro instead of variant, and because I thought we'd already got rid of that macro


================
Comment at: openmp/libomptarget/deviceRTLs/common/include/shuffle.h:117
+inline int32_t __kmpc_shuffle_int32(int32_t val, int16_t delta, int16_t size) {
+  return __kmpc_impl_shfl_down_sync(-1, val, delta, size);
+}
----------------
Not sure about int constants near places where the 32/64 bit distinction is important


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95752/new/

https://reviews.llvm.org/D95752



More information about the Openmp-commits mailing list