[Openmp-commits] [PATCH] D95752: [OpenMP][DeviceRTL] Extract shuffle idiom and port it to declare variant
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Sat Jan 30 17:43:48 PST 2021
JonChesterfield added a comment.
Comments inline. Not totally sure this is better, code seems longer and more complicated than it was before. Aim is merging the two GPU implementations?
================
Comment at: openmp/libomptarget/deviceRTLs/common/include/shuffle.h:27
+///{
+unsigned GetLaneId();
+unsigned GetWarpSize();
----------------
Extern C? Also, why forward declare instead of include header?
================
Comment at: openmp/libomptarget/deviceRTLs/common/include/shuffle.h:40
+
+inline int32_t __kmpc_impl_shfl_sync(int64_t Mask, int32_t Var,
+ int32_t SrcLane) {
----------------
uint64_t for mask. Sign bit is just one of the lanes.
================
Comment at: openmp/libomptarget/deviceRTLs/common/include/shuffle.h:88
+ int32_t SrcLane) {
+// In Cuda 9.0, the *_sync() version takes an extra argument 'mask'.
+#if CUDA_VERSION >= 9000
----------------
Seems bad, both because it's a macro instead of variant, and because I thought we'd already got rid of that macro
================
Comment at: openmp/libomptarget/deviceRTLs/common/include/shuffle.h:117
+inline int32_t __kmpc_shuffle_int32(int32_t val, int16_t delta, int16_t size) {
+ return __kmpc_impl_shfl_down_sync(-1, val, delta, size);
+}
----------------
Not sure about int constants near places where the 32/64 bit distinction is important
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D95752/new/
https://reviews.llvm.org/D95752
More information about the Openmp-commits
mailing list