[Openmp-commits] [PATCH] D65836: Factor architecture dependent code out of loop.cu

Wed Aug 7 13:41:13 PDT 2019

jdoerfert added inline comments.

================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/loop.cu:392
   INLINE static uint64_t NextIter() {
-    unsigned int active = __ACTIVEMASK();
-    int leader = __ffs(active) - 1;
----------------
JonChesterfield wrote:
> jdoerfert wrote:
> > I would prefer `__SHFL_SYNC` `__ACTIVEMASK` etc. also to be function calls to `__kmpc_XXXX` functions but I won't require it.
> Likewise. They're currently defined in `omptarget-nvptx.h`. I'd like to move them into target_impl and replace the macros with inline functions.
> 
> That'll raise the question of how to handle implementations for different cuda versions, which I'd like to avoid for this first patch.
Agreed, let's do it later but eventually.

We can have the #ifdef cascade for cuda versions but that should be in the cuda subfolder (target_impl.h for example) and the "general logic" contains proper calls.

================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:39
+
+INLINE int __kmpc_impl_popc(uint32_t x) { return __popc(x); }
+
----------------
JonChesterfield wrote:
> jdoerfert wrote:
> > `int` -> `(u)int32_t` ?
> The cuda functions (plus `__ffsll` etc) return `int`. I'd slightly prefer `uint32_t` (on the basis that these can't return a negative integer). I'm happy either way.
I think we should make it clear what we expect wrt. bit-width if we have an expectation. Given the `(u)int32` floating around I'd say we do have some expectations/restrictions.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65836/new/

https://reviews.llvm.org/D65836