[Openmp-dev] OpenMP GPU shared memory

ichbinwu via Openmp-dev openmp-dev at lists.llvm.org
Sat Apr 18 03:37:54 PDT 2020


Hello everybody,

I have a question about GPU shared memory in the OpenMP implementation 
in LLVM.

In the paper by Grinberg, Bertolli, and Haque (Hands on with OpenMP 4.5 
and Unified Memory: Developing Applications for IBM's Hybrid CPU + GPU 
systems (Part II), IWOMP 2017) I found section "3. Clang's Extension for 
OpenMP 4.5 for device On-chip Memory Allocation" and learnt that GPU 
shared memory can be used through a trick with OpenMP directives. To 
find the compiler's limit for this static memory allocation, I looked at 
the source files under `openmp`. The relevant files seem to be:

1. openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h
     * commit: 197b7b24
     * line: DS_Slot_Size = 256,

2. openmp/libomptarget/deviceRTLs/common/omptarget.h
     * commit: d0b9ed5c
     * line: char Data[DS_Slot_Size];
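
For context, the kind of pattern meant above can be sketched roughly as 
follows. This is only a minimal sketch and not taken from the paper: the 
function name `tile_sums` and the constant `kTile` are made up for 
illustration, and whether `tile` really ends up in on-chip shared memory 
is an assumption that depends on the compiler version and on the limit 
asked about below.

```cpp
#include <cstddef>

constexpr int kTile = 32; // hypothetical tile size: 32 doubles = 256 bytes

// Sums each tile of `in` into out[t], one tile per team iteration.
void tile_sums(const double *in, double *out, int num_tiles) {
#pragma omp target teams distribute \
    map(to: in[0:num_tiles * kTile]) map(from: out[0:num_tiles])
  for (int t = 0; t < num_tiles; ++t) {
    // Declared in the teams scope, outside the parallel region, so it is
    // shared by all threads of the team (assumed candidate for shared memory).
    double tile[kTile];

#pragma omp parallel for
    for (int i = 0; i < kTile; ++i)
      tile[i] = in[t * kTile + i];

    // After the implicit barrier, the team's initial thread reduces the tile.
    double sum = 0.0;
    for (int i = 0; i < kTile; ++i)
      sum += tile[i];
    out[t] = sum;
  }
}
```

Without an offloading target the same code runs on the host, so it can 
at least be checked for correctness there.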

My questions are:

1. Is the hard-coded limit for GPU shared memory 256 bytes or (256 * 4) 
bytes? I ask because of this comment in 
`openmp/libomptarget/deviceRTLs/common/omptarget.h`:

// Additional master slot type which is initialized with the default master slot
// size of 4 bytes.

2. Could we enlarge this limit to, e.g., 512 bytes or even 1024 bytes? 
Considering the hardware specification of NVIDIA GPUs, if we assume the 
shared memory per multiprocessor is 48 KB and at most 32 thread blocks 
(or contention groups) reside on one multiprocessor, this limit could be 
as large as 48 KB / 32 = 1536 bytes, couldn't it?

3. How can we verify that the static memory allocation is placed in GPU 
shared memory (not in global memory) when an OpenMP source file is 
compiled by Clang/LLVM? My current approach is to inspect the generated 
assembly code (`-S`), which is not very convenient. It would be good if 
the compiler could print a message or give a short report during 
compilation.
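
The estimate in question 2 can be written out as a compile-time check. 
This is a minimal sketch that assumes the figures stated there (48 KB of 
shared memory per multiprocessor, 32 resident thread blocks) rather than 
values queried from a real device; the names are hypothetical.

```cpp
#include <cstddef>

// Assumed hardware figures from question 2, not read from a device.
constexpr std::size_t kSharedBytesPerSM = 48 * 1024; // 48 KB per multiprocessor
constexpr std::size_t kMaxBlocksPerSM = 32;          // resident thread blocks

// Upper bound per block if every resident block gets an equal share.
constexpr std::size_t per_block_budget(std::size_t shared_bytes,
                                       std::size_t blocks) {
  return shared_bytes / blocks;
}

static_assert(per_block_budget(kSharedBytesPerSM, kMaxBlocksPerSM) == 1536,
              "48 KB / 32 blocks = 1536 bytes");
```

The bound only holds if all 32 blocks are actually resident; with fewer 
blocks per multiprocessor each block could use more.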

Thank you in advance!

Best wishes!

Xin

