[Openmp-commits] [openmp] 7634c64 - [OpenMP][AMDGPU] Use DS_Max_Warp_Number instead of WARPSIZE

Pushpinder Singh via Openmp-commits openmp-commits at lists.llvm.org
Mon Sep 7 02:15:35 PDT 2020


Author: Pushpinder Singh
Date: 2020-09-07T05:15:21-04:00
New Revision: 7634c64b6121ba61a6c72c6b45e3561ad8cf345e

URL: https://github.com/llvm/llvm-project/commit/7634c64b6121ba61a6c72c6b45e3561ad8cf345e
DIFF: https://github.com/llvm/llvm-project/commit/7634c64b6121ba61a6c72c6b45e3561ad8cf345e.diff

LOG: [OpenMP][AMDGPU] Use DS_Max_Warp_Number instead of WARPSIZE

The size of worker_rootS should have been DS_Max_Warp_Number.
This reduces memory usage by deviceRTL on AMDGPU from around 2.3GB
to around 770MB.

Reviewed By: JonChesterfield, jdoerfert

Differential Revision: https://reviews.llvm.org/D87084
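
To see why the array bound matters for memory use, the sketch below compares the two sizings. It is only an illustration under stated assumptions: the warp size of 64, the team limit of 1024 threads, and the `WorkerSlot` layout are hypothetical stand-ins, not values or types copied from the deviceRTL headers. The point is that `worker_rootS` needs one slot per warp in a team, not one slot per lane in a warp.

```cpp
#include <cstddef>

// Assumed, illustrative values; the real constants are defined per target
// in the deviceRTL headers and may differ.
constexpr std::size_t kWarpSize = 64;            // lanes per warp (hypothetical)
constexpr std::size_t kMaxThreadsPerTeam = 1024; // threads per team (hypothetical)

// One slot is needed per warp in a team, so the natural bound is the
// maximum number of warps, not the number of lanes in a warp.
constexpr std::size_t kMaxWarpNumber = kMaxThreadsPerTeam / kWarpSize;

// Hypothetical stand-in for __kmpc_data_sharing_worker_slot_static.
struct WorkerSlot {
  void *Next;
  void *Prev;
  void *PrevSlotStackPtr;
  void *DataEnd;
  char Data[256];
};

// Per-team storage for an array of N worker slots.
template <std::size_t N> struct TeamSlots {
  alignas(16) WorkerSlot worker_rootS[N];
};

// Bytes used by the worker slots of one team for a given array bound.
constexpr std::size_t bytesPerTeam(std::size_t NumSlots) {
  return NumSlots * sizeof(WorkerSlot);
}

// Under these assumptions the per-team slot storage shrinks by a factor of
// kWarpSize / kMaxWarpNumber.
constexpr std::size_t kShrinkFactor =
    bytesPerTeam(kWarpSize) / bytesPerTeam(kMaxWarpNumber);
```

With these assumed numbers the slot array shrinks fourfold per team. The overall reduction reported in the log is smaller than that because the team descriptor holds other members whose size is unchanged.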

Added: 
    

Modified: 
    openmp/libomptarget/deviceRTLs/common/omptarget.h
    openmp/libomptarget/deviceRTLs/common/src/data_sharing.cu

Removed: 
    


################################################################################
diff  --git a/openmp/libomptarget/deviceRTLs/common/omptarget.h b/openmp/libomptarget/deviceRTLs/common/omptarget.h
index 88807de4e19c..6d5d6cd19bd6 100644
--- a/openmp/libomptarget/deviceRTLs/common/omptarget.h
+++ b/openmp/libomptarget/deviceRTLs/common/omptarget.h
@@ -252,7 +252,7 @@ class omptarget_nvptx_TeamDescr {
       workDescrForActiveParallel; // one, ONLY for the active par
 
   ALIGN(16)
-  __kmpc_data_sharing_worker_slot_static worker_rootS[WARPSIZE];
+  __kmpc_data_sharing_worker_slot_static worker_rootS[DS_Max_Warp_Number];
   ALIGN(16) __kmpc_data_sharing_master_slot_static master_rootS[1];
 };
 

diff  --git a/openmp/libomptarget/deviceRTLs/common/src/data_sharing.cu b/openmp/libomptarget/deviceRTLs/common/src/data_sharing.cu
index ca2fd1d30754..9b116aba2fc3 100644
--- a/openmp/libomptarget/deviceRTLs/common/src/data_sharing.cu
+++ b/openmp/libomptarget/deviceRTLs/common/src/data_sharing.cu
@@ -26,7 +26,7 @@ INLINE static void data_sharing_init_stack_common() {
   omptarget_nvptx_TeamDescr *teamDescr =
       &omptarget_nvptx_threadPrivateContext->TeamContext();
 
-  for (int WID = 0; WID < WARPSIZE; WID++) {
+  for (int WID = 0; WID < DS_Max_Warp_Number; WID++) {
     __kmpc_data_sharing_slot *RootS = teamDescr->GetPreallocatedSlotAddr(WID);
     DataSharingState.SlotPtr[WID] = RootS;
     DataSharingState.StackPtr[WID] = (void *)&RootS->Data[0];

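The two hunks have to change together: once `worker_rootS` has `DS_Max_Warp_Number` entries, a loop that still ran to `WARPSIZE` would index past the end of the array whenever `WARPSIZE` is the larger value. A minimal sketch of that invariant follows, using hypothetical types (`Slot`, `TeamDescr`, `SharingState`) and an assumed bound of 16 rather than the real deviceRTL definitions:

```cpp
#include <cassert>
#include <cstddef>

// Assumed bound, for illustration only.
constexpr std::size_t kNumWarpSlots = 16;

// Hypothetical stand-in for a preallocated data-sharing slot.
struct Slot {
  char Data[64];
};

// Hypothetical stand-in for the team descriptor.
struct TeamDescr {
  Slot worker_rootS[kNumWarpSlots];

  Slot *slotAddr(std::size_t WID) {
    assert(WID < kNumWarpSlots && "warp id out of range");
    return &worker_rootS[WID];
  }
};

// Hypothetical stand-in for the per-warp data-sharing state.
struct SharingState {
  Slot *SlotPtr[kNumWarpSlots];
  void *StackPtr[kNumWarpSlots];
};

// The loop bound is the same constant that sizes the arrays, so every
// index it produces is valid. Returns the number of slots initialized.
inline std::size_t initStack(TeamDescr &Team, SharingState &State) {
  std::size_t Count = 0;
  for (std::size_t WID = 0; WID < kNumWarpSlots; ++WID) {
    Slot *RootS = Team.slotAddr(WID);
    State.SlotPtr[WID] = RootS;
    State.StackPtr[WID] = static_cast<void *>(&RootS->Data[0]);
    ++Count;
  }
  return Count;
}
```

Deriving both the array extent and the loop bound from one constant is what keeps the two in step.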