[Openmp-commits] [PATCH] D65836: Factor architecture dependent code out of loop.cu
Johannes Doerfert via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Thu Aug 8 07:18:35 PDT 2019
jdoerfert added a comment.
@ABataev, others, any concerns? If not, let's go ahead with this.
================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:39
+
+INLINE int __kmpc_impl_popc(uint32_t x) { return __popc(x); }
+
----------------
JonChesterfield wrote:
> jdoerfert wrote:
> > JonChesterfield wrote:
> > > jdoerfert wrote:
> > > > `int` -> `(u)int32_t` ?
> > > The cuda functions (plus `__ffsll` etc) return `int`. I'd slightly prefer `uint32_t` (on the basis that these can't return a negative integer). I'm happy either way.
> > I think we should make it clear what we expect wrt. bit-width if we have an expectation. Given the `(u)int32` floating around I'd say we do have some expectations/restrictions.
> I think unsigned is clearer for bit manipulation functions. Width is mostly determined by warp/wavefront width, at least for these functions. Return width, inclined to follow IR intrinsics and match the argument.
>
> There's a mix of signed and unsigned in the existing code, with implicit conversions between them. Amdgcn happens to use a slightly different mix. Considering that NextIter returns a uint64_t, what would you think of using uint32/64 for the functions in loop as well as in this header?
> what would you think of using uint32/64 for the functions in loop as well as in this header?
I would prefer that. We might need a device specific type but (unsigned) int does make me worry every time.
Let's table this discussion for now to make progress on this.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D65836/new/
https://reviews.llvm.org/D65836
More information about the Openmp-commits
mailing list