[Openmp-commits] [PATCH] D95060: [libomptarget][devicertl][nfc] Remove some cuda intrinsics, simplify
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Jan 20 09:45:57 PST 2021
JonChesterfield added inline comments.
================
Comment at: openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.h:68
+enum : __kmpc_impl_lanemask_t {
+ __kmpc_impl_all_lanes = ~(__kmpc_impl_lanemask_t)0
----------------
reordered to closer match nvptx
================
Comment at: openmp/libomptarget/deviceRTLs/common/src/reduction.cu:187
+INLINE static uint32_t kmpcMin(uint32_t x, uint32_t y) {
+ return x < y ? x : y;
----------------
Also considered 'min', but expect it to collide with one of the included headers, and 'min_uint32_t' but couldn't see a pretty way to write that inTheNamingConvention.
INLINE static is a slightly strange construct, chosen to match the functions above.
================
Comment at: openmp/libomptarget/deviceRTLs/common/src/reduction.cu:268
// Check if this is the very last team.
- unsigned NumRecs = __kmpc_impl_min(NumTeams, uint32_t(num_of_records));
+ unsigned NumRecs = kmpcMin(NumTeams, uint32_t(num_of_records));
if (ChunkTeamCount == NumTeams - Bound - 1) {
----------------
Three call sites, all uint32_t. Don't need a template.
================
Comment at: openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h:95
DEVICE double __kmpc_impl_get_wtime();
+INLINE uint32_t __kmpc_impl_ffs(uint32_t x) { return __builtin_ffs(x); }
----------------
These might be better changing typed as `__kmpc_impl_lanemask_t`, implemented in a common header that picks builtin_ffs or builtin_ffs based on sizeof(`__kmpc_impl_lanemask_t`). This keeps the current name mangling dispatch and separate implementations.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D95060/new/
https://reviews.llvm.org/D95060
More information about the Openmp-commits
mailing list