[Openmp-commits] [openmp] fd70f70 - [OpenMP][NVPTX] Replaced CUDA builtin vars with LLVM intrinsics
Shilei Tian via Openmp-commits
openmp-commits at lists.llvm.org
Wed Jan 20 09:02:13 PST 2021
Author: Shilei Tian
Date: 2021-01-20T12:02:06-05:00
New Revision: fd70f70d1e02752f411fcf923fddda31cce376ae
URL: https://github.com/llvm/llvm-project/commit/fd70f70d1e02752f411fcf923fddda31cce376ae
DIFF: https://github.com/llvm/llvm-project/commit/fd70f70d1e02752f411fcf923fddda31cce376ae.diff
LOG: [OpenMP][NVPTX] Replaced CUDA builtin vars with LLVM intrinsics
Replaced CUDA builtin vars with LLVM intrinsics so that we no longer need
definitions of those builtin vars.
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D95013
Added:
Modified:
openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
Removed:
################################################################################
diff --git a/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu b/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
index 8052b92a7dee..b5ef549ece57 100644
--- a/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
+++ b/openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
@@ -115,10 +115,12 @@ DEVICE void __kmpc_impl_threadfence_block() { __threadfence_block(); }
DEVICE void __kmpc_impl_threadfence_system() { __threadfence_system(); }
// Calls to the NVPTX layer (assuming 1D layout)
-DEVICE int GetThreadIdInBlock() { return threadIdx.x; }
-DEVICE int GetBlockIdInKernel() { return blockIdx.x; }
-DEVICE int GetNumberOfBlocksInKernel() { return gridDim.x; }
-DEVICE int GetNumberOfThreadsInBlock() { return blockDim.x; }
+DEVICE int GetThreadIdInBlock() { return __nvvm_read_ptx_sreg_tid_x(); }
+DEVICE int GetBlockIdInKernel() { return __nvvm_read_ptx_sreg_ctaid_x(); }
+DEVICE int GetNumberOfBlocksInKernel() {
+  return __nvvm_read_ptx_sreg_nctaid_x();
+}
+DEVICE int GetNumberOfThreadsInBlock() { return __nvvm_read_ptx_sreg_ntid_x(); }
DEVICE unsigned GetWarpId() { return GetThreadIdInBlock() / WARPSIZE; }
DEVICE unsigned GetLaneId() { return GetThreadIdInBlock() & (WARPSIZE - 1); }
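
For readers unfamiliar with the helpers touched above, the following is a minimal host-side sketch of the warp and lane arithmetic, not part of the commit. It assumes WARPSIZE is 32 (its definition is not shown in this diff). The functions WarpIdOf and LaneIdOf are hypothetical; they take the thread id as a parameter because __nvvm_read_ptx_sreg_tid_x() is only available when Clang targets NVPTX.

```cpp
// Mapping applied by the commit (CUDA builtin variable -> Clang NVPTX builtin):
//   threadIdx.x -> __nvvm_read_ptx_sreg_tid_x()     (PTX %tid.x)
//   blockIdx.x  -> __nvvm_read_ptx_sreg_ctaid_x()   (PTX %ctaid.x)
//   gridDim.x   -> __nvvm_read_ptx_sreg_nctaid_x()  (PTX %nctaid.x)
//   blockDim.x  -> __nvvm_read_ptx_sreg_ntid_x()    (PTX %ntid.x)

// Assumption: a warp size of 32; the real value comes from the device
// runtime headers, which this diff does not show.
constexpr unsigned WARPSIZE = 32;

// Hypothetical host-side counterparts of GetWarpId() and GetLaneId().
// The thread id is passed in explicitly instead of being read from %tid.x.
constexpr unsigned WarpIdOf(int ThreadId) { return ThreadId / WARPSIZE; }

// The mask is equivalent to ThreadId % WARPSIZE only because WARPSIZE is a
// power of two.
constexpr unsigned LaneIdOf(int ThreadId) { return ThreadId & (WARPSIZE - 1); }
```

Under these assumptions, thread 70 in a 1D block is lane 6 of warp 2, since 70 = 2 * 32 + 6.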