[Openmp-commits] [openmp] [Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL (PR #84928)
Joseph Huber via Openmp-commits
openmp-commits at lists.llvm.org
Tue Mar 12 08:06:10 PDT 2024
https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/84928
Summary:
We currently take the lower 5 bits of the X-dimension thread ID as the
lane ID (the thread's index within its warp). This does not work in
non-1D grids and is also slower than reading the dedicated hardware
register.
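To illustrate why masking the X index breaks outside 1D launches, here is a minimal host-side sketch. It is not DeviceRTL code: the helper names (`laneFromX`, `laneFromLinearId`, `countMismatches`) are hypothetical, the warp size is assumed to be 32, and threads are assumed to be packed into warps in X-fastest linear order, which is the usual CUDA convention.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a warp holds 32 threads, as on current NVIDIA hardware.
constexpr uint32_t WarpSize = 32;

// Models the old computation: mask the X thread index only.
constexpr uint32_t laneFromX(uint32_t X) { return X & (WarpSize - 1); }

// Models the lane id under the assumed X-fastest linearization of a
// 2D block: consecutive linear ids fill one warp after another.
constexpr uint32_t laneFromLinearId(uint32_t X, uint32_t Y, uint32_t DimX) {
  return (X + Y * DimX) % WarpSize;
}

// Counts the threads of a DimX x DimY block for which the masked X index
// disagrees with the modeled lane id.
uint32_t countMismatches(uint32_t DimX, uint32_t DimY) {
  uint32_t Mismatches = 0;
  for (uint32_t Y = 0; Y < DimY; ++Y)
    for (uint32_t X = 0; X < DimX; ++X)
      if (laneFromX(X) != laneFromLinearId(X, Y, DimX))
        ++Mismatches;
  return Mismatches;
}

// Small self-check: 1D blocks agree, a 16x4 block does not.
void checkExamples() {
  assert(countMismatches(64, 1) == 0);
  assert(countMismatches(16, 4) != 0);
}
```

Under these assumptions the two computations agree whenever the X extent is a multiple of the warp size, and diverge for shapes such as 16x4, where rows with odd Y land in the upper half of a warp. Reading the lane id from the hardware register avoids depending on the block shape at all.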
From e0b06ef2cb91cf779dfcaab0d78135569afc49da Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Tue, 12 Mar 2024 10:04:57 -0500
Subject: [PATCH] [Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL
Summary:
We currently take the lower 5 bits of the X-dimension thread ID as the
lane ID (the thread's index within its warp). This does not work in
non-1D grids and is also slower than reading the dedicated hardware
register.
---
openmp/libomptarget/DeviceRTL/src/Mapping.cpp | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/openmp/libomptarget/DeviceRTL/src/Mapping.cpp b/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
index 31dd8054dec33e..b2028a8fb4f506 100644
--- a/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
+++ b/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
@@ -172,10 +172,7 @@ uint32_t getThreadIdInBlock(int32_t Dim) {
   UNREACHABLE("Dim outside range!");
 }
 
-uint32_t getThreadIdInWarp() {
-  return impl::getThreadIdInBlock(mapping::DIM_X) &
-         (mapping::getWarpSize() - 1);
-}
+uint32_t getThreadIdInWarp() { return __nvvm_read_ptx_sreg_laneid(); }
 
 uint32_t getBlockIdInKernel(int32_t Dim) {
   switch (Dim) {