[libc-commits] [clang] [libc] [Clang] Add width handling for <gpuintrin.h> shuffle helper (PR #125896)

Joseph Huber via libc-commits libc-commits at lists.llvm.org
Wed Feb 5 12:18:51 PST 2025


================
@@ -149,22 +149,23 @@ _DEFAULT_FN_ATTRS static __inline__ void __gpu_sync_lane(uint64_t __lane_mask) {
 
 // Shuffles the the lanes inside the warp according to the given index.
 _DEFAULT_FN_ATTRS static __inline__ uint32_t
-__gpu_shuffle_idx_u32(uint64_t __lane_mask, uint32_t __idx, uint32_t __x) {
+__gpu_shuffle_idx_u32(uint64_t __lane_mask, uint32_t __idx, uint32_t __x,
+                      uint32_t __width) {
   uint32_t __mask = (uint32_t)__lane_mask;
-  return __nvvm_shfl_sync_idx_i32(__mask, __x, __idx, __gpu_num_lanes() - 1u);
+  return __nvvm_shfl_sync_idx_i32(__mask, __x, __idx,
+                                  ((__gpu_num_lanes() - __width) << 8u) | 0x1f);
----------------
jhuber6 wrote:

Do you know if this is okay so I can approve the backport? (Also, below is my punishment for forgetting to commit the fix to the clang test.)
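
For anyone checking the new operand by hand, here is a rough host-side sketch of what `((__gpu_num_lanes() - __width) << 8u) | 0x1f` encodes. It assumes 32 lanes and a power-of-two `__width`, and it models the index-mode source lane selection as I understand it from the PTX `shfl.sync` description. The helper names (`pack_shfl_idx_operand`, `model_shuffle_source_lane`, `shuffle_model_examples`) are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: packs the operand the same way the patch does.
// Bits 0-4 hold the clamp value (0x1f); bits 8-12 hold the segment mask.
static uint32_t pack_shfl_idx_operand(uint32_t num_lanes, uint32_t width) {
  return ((num_lanes - width) << 8u) | 0x1fu;
}

// Hypothetical host-side model of which lane an index shuffle reads from.
// Assumes a 32-lane warp; returns the caller's own lane if the computed
// source falls outside its segment.
static uint32_t model_shuffle_source_lane(uint32_t lane, uint32_t idx,
                                          uint32_t c) {
  uint32_t bval = idx & 0x1fu;
  uint32_t cval = c & 0x1fu;
  uint32_t segmask = (c >> 8u) & 0x1fu;
  uint32_t inv = ~segmask & 0x1fu; // bits that select a lane inside a segment
  uint32_t min_lane = lane & segmask;
  uint32_t max_lane = min_lane | (cval & inv);
  uint32_t src = min_lane | (bval & inv);
  return src <= max_lane ? src : lane;
}

// A few hand-checked cases.
void shuffle_model_examples(void) {
  // width == 32: one segment covering the whole warp, idx is used as-is.
  assert(pack_shfl_idx_operand(32u, 32u) == 0x001fu);
  assert(model_shuffle_source_lane(13u, 2u, 0x001fu) == 2u);

  // width == 8: lane 13 sits in segment [8, 15], so idx 2 maps to lane 10.
  assert(pack_shfl_idx_operand(32u, 8u) == 0x181fu);
  assert(model_shuffle_source_lane(13u, 2u, 0x181fu) == 10u);
}
```

With `__width` equal to the lane count the segment mask is zero, so the shuffle addresses the whole warp as before. Smaller widths confine each lane to its own `__width`-sized segment.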

https://github.com/llvm/llvm-project/pull/125896

