[libc-commits] [PATCH] D157548: [libc] Silence integer shortening warnings on NVPTX masks

Joseph Huber via Phabricator via libc-commits libc-commits at lists.llvm.org
Wed Aug 9 14:02:31 PDT 2023


jhuber6 created this revision.
jhuber6 added a reviewer: JonChesterfield.
Herald added subscribers: libc-commits, mattd, gchakrabarti, asavonic.
Herald added projects: libc-project, All.
jhuber6 requested review of this revision.
Herald added a subscriber: wangpc.

Nvidia uses a 32-bit mask, but we store it in a common 64-bit integer so
that the ABI stays compatible with the AMD implementation, which may use
a 64-bit mask. Silence these warnings by explicitly casting to the
smaller type. This is always legal because a mask generated on NVPTX
always fits in 32 bits.
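
As a host-side illustration of the narrowing described above, here is a minimal sketch. The helper names `narrow_lane_mask` and `first_lane_id` are hypothetical and do not appear in the patch; the `assert` is an added debug check for the invariant the patch relies on, not something the patch itself does.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not in the patch): narrow a lane mask that is stored
// in a 64-bit integer for ABI compatibility down to the 32-bit width used on
// NVPTX. The assert documents the assumption that the upper bits are zero.
inline uint32_t narrow_lane_mask(uint64_t lane_mask) {
  assert((lane_mask >> 32) == 0 && "NVPTX lane mask must fit in 32 bits");
  return static_cast<uint32_t>(lane_mask);
}

// Hypothetical helper (not in the patch): index of the lowest set lane,
// mirroring the `__builtin_ffs(lane_mask) - 1` expression in the diff.
// __builtin_ffs takes an int and returns the 1-based position of the lowest
// set bit, so the 32-bit variant is enough once the mask has been narrowed.
inline uint32_t first_lane_id(uint64_t lane_mask) {
  uint32_t narrow = narrow_lane_mask(lane_mask);
  return static_cast<uint32_t>(__builtin_ffs(static_cast<int>(narrow))) - 1;
}
```

For a mask whose lowest set bit is bit 3 (for example `0x8`), `first_lane_id` yields 3. The explicit `static_cast` is what silences the shortening warning; the value is unchanged as long as the mask was produced for a 32-lane warp.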


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D157548

Files:
  libc/src/__support/GPU/nvptx/utils.h


Index: libc/src/__support/GPU/nvptx/utils.h
===================================================================
--- libc/src/__support/GPU/nvptx/utils.h
+++ libc/src/__support/GPU/nvptx/utils.h
@@ -115,8 +115,8 @@
   // NOTE: This is not sufficient in all cases on Volta hardware or later. The
   // lane mask returned here is not always the true lane mask used by the
   // intrinsics in cases of incedental or enforced divergence by the user.
-  uint64_t lane_mask = get_lane_mask();
-  uint64_t id = __builtin_ffsl(lane_mask) - 1;
+  uint32_t lane_mask = static_cast<uint32_t>(get_lane_mask());
+  uint32_t id = __builtin_ffs(lane_mask) - 1;
 #if __CUDA_ARCH__ >= 600
   return __nvvm_shfl_sync_idx_i32(lane_mask, x, id, get_lane_size() - 1);
 #else
@@ -127,9 +127,9 @@
 /// Returns a bitmask of threads in the current lane for which \p x is true.
 [[clang::convergent]] LIBC_INLINE uint64_t ballot(uint64_t lane_mask, bool x) {
 #if __CUDA_ARCH__ >= 600
-  return __nvvm_vote_ballot_sync(lane_mask, x);
+  return __nvvm_vote_ballot_sync(static_cast<uint32_t>(lane_mask), x);
 #else
-  return lane_mask & __nvvm_vote_ballot(x);
+  return static_cast<uint32_t>(lane_mask) & __nvvm_vote_ballot(x);
 #endif
 }
 /// Waits for all the threads in the block to converge and issues a fence.
@@ -137,7 +137,7 @@
 
 /// Waits for all threads in the warp to reconverge for independent scheduling.
 [[clang::convergent]] LIBC_INLINE void sync_lane(uint64_t mask) {
-  __nvvm_bar_warp_sync(mask);
+  __nvvm_bar_warp_sync(static_cast<uint32_t>(mask));
 }
 
 /// Returns the current value of the GPU's processor clock.

