[Openmp-commits] [openmp] [Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL (PR #84928)

Joseph Huber via Openmp-commits openmp-commits at lists.llvm.org
Tue Mar 12 08:06:10 PDT 2024


https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/84928

Summary:
We are currently taking the lower 5 bits of the X thread ID as the lane
ID (the thread's index within its warp). This doesn't work in non-1D
grids and is also slower than just using the dedicated hardware register.
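
To illustrate the non-1D failure, here is a minimal host-side sketch, not DeviceRTL code. The names `BlockDim`, `ThreadIdx`, `laneFromXOnly`, and `laneFromLinearId` are hypothetical. It assumes the usual CUDA convention that threads in a block are linearized with X varying fastest and that consecutive linear IDs are packed into warps of 32, so the hardware lane ID is modeled as the linear ID modulo 32.

```cpp
#include <cstdint>

// Hypothetical host-side model of a thread block; not part of DeviceRTL.
struct BlockDim { uint32_t X, Y, Z; };
struct ThreadIdx { uint32_t X, Y, Z; };

constexpr uint32_t WarpSize = 32; // warp size on NVPTX targets

// Old approach: mask the low 5 bits of the X thread id only.
constexpr uint32_t laneFromXOnly(ThreadIdx T) {
  return T.X & (WarpSize - 1);
}

// Model of the hardware lane id, ASSUMING X-fastest linearization and
// consecutive linear ids being grouped into warps.
constexpr uint32_t laneFromLinearId(ThreadIdx T, BlockDim B) {
  return (T.X + T.Y * B.X + T.Z * B.X * B.Y) % WarpSize;
}

// 1D block of 64 threads: both computations agree.
static_assert(laneFromXOnly({37, 0, 0}) ==
                  laneFromLinearId({37, 0, 0}, {64, 1, 1}),
              "1D blocks agree");

// 16x4 block: thread (0,1) has linear id 16, so it sits in lane 16,
// but masking X alone reports lane 0.
static_assert(laneFromXOnly({0, 1, 0}) == 0, "X-only mask ignores Y");
static_assert(laneFromLinearId({0, 1, 0}, {16, 4, 1}) == 16,
              "modeled lane id accounts for Y");
```

In the 16x4 case, threads (0,0) and (0,1) belong to the same warp, yet the X-only mask assigns both to lane 0. Reading the lane ID register avoids this regardless of the block shape.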


From e0b06ef2cb91cf779dfcaab0d78135569afc49da Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Tue, 12 Mar 2024 10:04:57 -0500
Subject: [PATCH] [Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL

Summary:
We are currently taking the lower 5 bits of the X thread ID as the lane
ID (the thread's index within its warp). This doesn't work in non-1D
grids and is also slower than just using the dedicated hardware register.
---
 openmp/libomptarget/DeviceRTL/src/Mapping.cpp | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/openmp/libomptarget/DeviceRTL/src/Mapping.cpp b/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
index 31dd8054dec33e..b2028a8fb4f506 100644
--- a/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
+++ b/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
@@ -172,10 +172,7 @@ uint32_t getThreadIdInBlock(int32_t Dim) {
   UNREACHABLE("Dim outside range!");
 }
 
-uint32_t getThreadIdInWarp() {
-  return impl::getThreadIdInBlock(mapping::DIM_X) &
-         (mapping::getWarpSize() - 1);
-}
+uint32_t getThreadIdInWarp() { return __nvvm_read_ptx_sreg_laneid(); }
 
 uint32_t getBlockIdInKernel(int32_t Dim) {
   switch (Dim) {


