[Openmp-commits] [PATCH] D65836: Factor architecture dependent code out of loop.cu
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Aug 7 13:01:31 PDT 2019
JonChesterfield marked 2 inline comments as done.
JonChesterfield added inline comments.
================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/loop.cu:392
INLINE static uint64_t NextIter() {
- unsigned int active = __ACTIVEMASK();
- int leader = __ffs(active) - 1;
----------------
jdoerfert wrote:
> I would prefer `__SHFL_SYNC` `__ACTIVEMASK` etc. also to be function calls to `__kmpc_XXXX` functions but I won't require it.
Likewise. They're currently defined in `omptarget-nvptx.h`. I'd like to move them into target_impl and replace the macros with inline functions.
That'll raise the question of how to handle implementations for different cuda versions, which I'd like to avoid for this first patch.
================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:39
+
+INLINE int __kmpc_impl_popc(uint32_t x) { return __popc(x); }
+
----------------
jdoerfert wrote:
> `int` -> `(u)int32_t` ?
The cuda functions (plus `__ffsll` etc) return `int`. I'd slightly prefer `uint32_t` (on the basis that these can't return a negative integer). I'm happy either way.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D65836/new/
https://reviews.llvm.org/D65836
More information about the Openmp-commits
mailing list