[Openmp-commits] [PATCH] D55773: [OpenMP][libomptarget] Use shared memory variable for tracking parallel level
Gheorghe-Teodor Bercea via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Wed Jan 9 06:46:31 PST 2019
gtbercea marked an inline comment as done.
gtbercea added inline comments.
================
Comment at: libomptarget/deviceRTLs/nvptx/src/omptarget-nvptx.cu:98
setExecutionParameters(Spmd, RuntimeUninitialized);
- if (GetThreadIdInBlock() == 0) {
- int slot = smid() % MAX_SM;
- usedSlotIdx = slot;
- omptarget_nvptx_simpleThreadPrivateContext =
- omptarget_nvptx_device_simpleState[slot].Dequeue();
- }
- // FIXME: use __syncthreads instead when the function copy is fixed in LLVM.
- __SYNCTHREADS();
- omptarget_nvptx_simpleThreadPrivateContext->Init();
+ parallelLevel = 0;
return;
----------------
ABataev wrote:
> 1. I think it is better to do this initialization in a single thread, and you still need to synchronize the threads even in this situation to avoid a data race.
> 2. Keep the initialization of `usedSlotIdx`; it is used in other places.
Increments and decrements of this variable are guarded by syncthreads, so I don't think we need any additional synchronization here. As for the slot, I think we can leave it at its default value in this case and rely on the compiler to optimize the remainder operations.
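For illustration only, here is a minimal standalone sketch of the single-thread initialization followed by a barrier that the first review point describes. It is not the code in this patch: the kernel name `initParallelLevel`, the helper `runAndSum`, and the launch dimensions are hypothetical, and the local `__shared__` variable merely stands in for the runtime's `parallelLevel`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel, not part of libomptarget: one thread initializes a
// block-wide shared counter, then the whole block synchronizes.
__global__ void initParallelLevel(unsigned *out) {
  // Stand-in for the runtime's shared parallel-level variable.
  __shared__ unsigned parallelLevel;

  // Only thread 0 of each block performs the store.
  if (threadIdx.x == 0)
    parallelLevel = 0;
  // Barrier: every thread in the block sees the initialized value afterwards.
  __syncthreads();

  // Each thread records the value it observed.
  out[blockIdx.x * blockDim.x + threadIdx.x] = parallelLevel;
}

// Hypothetical helper: launches the kernel and sums the observed values.
// Returns ~0u if any CUDA call fails.
unsigned runAndSum() {
  const int blocks = 2, threads = 64, n = blocks * threads;
  unsigned *out = nullptr;
  if (cudaMallocManaged(&out, n * sizeof(unsigned)) != cudaSuccess)
    return ~0u;
  for (int i = 0; i < n; ++i)
    out[i] = 1u; // sentinel, overwritten by the kernel
  initParallelLevel<<<blocks, threads>>>(out);
  if (cudaDeviceSynchronize() != cudaSuccess) {
    cudaFree(out);
    return ~0u;
  }
  unsigned sum = 0;
  for (int i = 0; i < n; ++i)
    sum += out[i];
  cudaFree(out);
  return sum;
}

int main() {
  unsigned sum = runAndSum();
  printf("sum of observed levels: %u\n", sum);
  return sum == 0 ? 0 : 1;
}
```

The patch as posted instead has every thread store 0 without a barrier, on the argument that later increments and decrements are already guarded by syncthreads.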
Repository:
rOMP OpenMP
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D55773/new/
https://reviews.llvm.org/D55773