[Openmp-dev] OpenMP GPU shared memory
ichbinwu via Openmp-dev
openmp-dev at lists.llvm.org
Sat Apr 18 03:37:54 PDT 2020
Hello everybody,
I have a question about GPU shared memory in the OpenMP implementation
in LLVM.
In the paper by Grinberg, Bertolli, and Haque ("Hands on with OpenMP 4.5
and Unified Memory: Developing Applications for IBM's Hybrid CPU + GPU
Systems (Part II)", IWOMP 2017), Section "3. Clang's Extension for OpenMP
4.5 for device On-chip Memory Allocation" explains that GPU shared
memory can be used from OpenMP directives by means of a trick. To find
the compiler's limit for this static memory allocation, I looked at the
source files under `openmp`. The relevant files seem to be:
1. openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h
   * commit: 197b7b24
   * line: DS_Slot_Size = 256,
2. openmp/libomptarget/deviceRTLs/common/omptarget.h
   * commit: d0b9ed5c
   * line: char Data[DS_Slot_Size];
My questions are:
1. Is the hard-coded limit for GPU shared memory 256 bytes or (256 * 4)
bytes? I ask because of this comment in
`openmp/libomptarget/deviceRTLs/common/omptarget.h`:
// Additional master slot type which is initialized with the default master slot
// size of 4 bytes.
2. Could we enlarge this limit to, e.g., 512 bytes or even 1024 bytes?
Going by the hardware specification of NVIDIA GPUs, if we assume that the
shared memory per multiprocessor is 48 KB and that at most 32 thread
blocks (or contention groups) reside on one multiprocessor, each block
could get 48 KB / 32 = 1536 bytes, so the limit could be as large as
1536 bytes, couldn't it?
3. How can we check that the static memory allocation ends up in GPU
shared memory (not in global memory) when an OpenMP source file is
compiled by Clang/LLVM? My current approach is to inspect the generated
assembly code (`-S`), which is not very convenient. It would be helpful
if the compiler could print a message or give a short report during
compilation.
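To make the arithmetic behind question 2 explicit, here is a minimal sketch
in C. The function name `per_block_shared_bytes` is made up for this
illustration and is not part of libomptarget; the 48 KB and 32-block figures
are the assumptions stated in question 2, not values read from any device.

```c
#include <stddef.h>

/* Hypothetical helper (not part of libomptarget): the shared-memory
 * budget per thread block when the shared memory of one multiprocessor
 * is divided evenly among the blocks resident on it. */
size_t per_block_shared_bytes(size_t shared_bytes_per_sm,
                              size_t max_resident_blocks)
{
    if (max_resident_blocks == 0) {
        return 0; /* avoid division by zero */
    }
    return shared_bytes_per_sm / max_resident_blocks;
}
```

With the assumed values, `per_block_shared_bytes(48 * 1024, 32)` gives 1536,
which is six times the current `DS_Slot_Size` of 256.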
Thank you in advance!
Best wishes!
Xin