[Openmp-commits] [openmp] 9f69d3c - [Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL (#84928)
via Openmp-commits
openmp-commits at lists.llvm.org
Tue Mar 12 08:39:44 PDT 2024
Author: Joseph Huber
Date: 2024-03-12T10:39:40-05:00
New Revision: 9f69d3cf88905df5006f93dce536b7e73c0b1735
URL: https://github.com/llvm/llvm-project/commit/9f69d3cf88905df5006f93dce536b7e73c0b1735
DIFF: https://github.com/llvm/llvm-project/commit/9f69d3cf88905df5006f93dce536b7e73c0b1735.diff
LOG: [Libomptarget] Use NVPTX lane id intrinsic in DeviceRTL (#84928)
Summary:
We are currently taking the lower 5 bits of the X thread ID as the lane
ID within the warp. This doesn't work in non-1D grids and is also slower
than just reading the dedicated hardware register.
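As an illustration of why the old masking breaks outside 1D launches, here is a minimal host-side sketch. It is not DeviceRTL code: `Dim3`, `laneFromX`, and `laneFromLinearId` are hypothetical names, the warp size is fixed at 32, and the lane ID is modeled on the assumption that threads are linearized X-fastest and then grouped into warps of consecutive linear IDs.

```cpp
#include <cstdint>

// Hypothetical host-side model; none of these names exist in DeviceRTL.
constexpr uint32_t WarpSize = 32; // assumption: 32 lanes per warp

struct Dim3 {
  uint32_t X, Y, Z;
};

// Old approach: mask the low bits of the X thread index only.
constexpr uint32_t laneFromX(Dim3 Tid) { return Tid.X & (WarpSize - 1); }

// Modeled lane ID (assumption): linearize the thread index X-fastest,
// then take the position inside a warp of consecutive linear IDs.
constexpr uint32_t laneFromLinearId(Dim3 Tid, Dim3 BlockDim) {
  return (Tid.X + Tid.Y * BlockDim.X + Tid.Z * BlockDim.X * BlockDim.Y) %
         WarpSize;
}

// 1D block of 64 threads: both computations agree (37 % 32 == 5).
static_assert(laneFromX({37, 0, 0}) == 5, "masking works in 1D");
static_assert(laneFromLinearId({37, 0, 0}, {64, 1, 1}) == 5,
              "model agrees in 1D");

// 8x8 block: thread (3, 2) has linear ID 3 + 2 * 8 = 19, so lane 19,
// but masking X alone reports 3 and collides with thread (3, 0).
static_assert(laneFromLinearId({3, 2, 0}, {8, 8, 1}) == 19,
              "modeled lane in 2D");
static_assert(laneFromX({3, 2, 0}) == 3, "masking X ignores Y");
```

In the 8x8 case every row of the block reports lanes 0 through 7 under the old scheme, so four threads in each warp would share a lane number. Reading the lane ID register avoids the index arithmetic altogether.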
Added:
Modified:
openmp/libomptarget/DeviceRTL/src/Mapping.cpp
Removed:
################################################################################
diff --git a/openmp/libomptarget/DeviceRTL/src/Mapping.cpp b/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
index 31dd8054dec33e..b2028a8fb4f506 100644
--- a/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
+++ b/openmp/libomptarget/DeviceRTL/src/Mapping.cpp
@@ -172,10 +172,7 @@ uint32_t getThreadIdInBlock(int32_t Dim) {
   UNREACHABLE("Dim outside range!");
 }
 
-uint32_t getThreadIdInWarp() {
-  return impl::getThreadIdInBlock(mapping::DIM_X) &
-         (mapping::getWarpSize() - 1);
-}
+uint32_t getThreadIdInWarp() { return __nvvm_read_ptx_sreg_laneid(); }
 
 uint32_t getBlockIdInKernel(int32_t Dim) {
   switch (Dim) {