[Openmp-commits] [PATCH] D95060: [libomptarget][devicertl][nfc] Remove some cuda intrinsics, simplify

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Wed Jan 20 09:45:57 PST 2021


JonChesterfield added inline comments.


================
Comment at: openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h:68
 
+enum : __kmpc_impl_lanemask_t {
+  __kmpc_impl_all_lanes = ~(__kmpc_impl_lanemask_t)0
----------------
reordered to closer match nvptx


================
Comment at: openmp/libomptarget/deviceRTLs/common/src/reduction.cu:187
 
+INLINE static uint32_t kmpcMin(uint32_t x, uint32_t y) {
+  return x < y ? x : y;
----------------
Also considered 'min', but expect it to collide with one of the included headers, and 'min_uint32_t' but couldn't see a pretty way to write that inTheNamingConvention.

INLINE static is a slightly strange construct, chosen to match the functions above.


================
Comment at: openmp/libomptarget/deviceRTLs/common/src/reduction.cu:268
   // Check if this is the very last team.
-  unsigned NumRecs = __kmpc_impl_min(NumTeams, uint32_t(num_of_records));
+  unsigned NumRecs = kmpcMin(NumTeams, uint32_t(num_of_records));
   if (ChunkTeamCount == NumTeams - Bound - 1) {
----------------
Three call sites, all uint32_t. Don't need a template.


================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:95
 DEVICE double __kmpc_impl_get_wtime();
 
+INLINE uint32_t __kmpc_impl_ffs(uint32_t x) { return __builtin_ffs(x); }
----------------
These might be better changing typed as `__kmpc_impl_lanemask_t`, implemented in a common header that picks builtin_ffs or builtin_ffs based on sizeof(`__kmpc_impl_lanemask_t`). This keeps the current name mangling dispatch and separate implementations.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95060/new/

https://reviews.llvm.org/D95060



More information about the Openmp-commits mailing list