[Openmp-commits] [PATCH] D65836: Factor architecture dependent code out of loop.cu

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Thu Aug 8 03:16:19 PDT 2019


JonChesterfield marked an inline comment as done.
JonChesterfield added inline comments.


================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:39
+
+INLINE int __kmpc_impl_popc(uint32_t x) { return __popc(x); }
+
----------------
jdoerfert wrote:
> JonChesterfield wrote:
> > jdoerfert wrote:
> > > `int` -> `(u)int32_t` ?
> > The cuda functions (plus `__ffsll` etc) return `int`. I'd slightly prefer `uint32_t` (on the basis that these can't return a negative integer). I'm happy either way.
> I think we should make it clear what we expect wrt. bit-width if we have an expectation. Given the `(u)int32` floating around I'd say we do have some expectations/restrictions.
I think unsigned is clearer for bit manipulation functions. Width is mostly determined by warp/wavefront width, at least for these functions. Return width, inclined to follow IR intrinsics and match the argument.

There's a mix of signed and unsigned in the existing code, with implicit conversions between them. Amdgcn happens to use a slightly different mix. Considering that NextIter returns a uint64_t, what would you think of using uint32/64 for the functions in loop as well as in this header?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D65836/new/

https://reviews.llvm.org/D65836





More information about the Openmp-commits mailing list