<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<p><font face="Hack Nerd Font Mono">Hi Xin,</font></p>
<p><font face="Hack Nerd Font Mono"><br>
</font></p>
<p><font face="Hack Nerd Font Mono">I think what you found is some
runtime code that lives in shared memory. This is not to be
confused with user data put into shared memory.</font></p>
<p><font face="Hack Nerd Font Mono">To do the latter, you can use
the allocate directive, e.g.,</font></p>
<p><font face="Hack Nerd Font Mono"><br>
</font></p>
<p><font face="Hack Nerd Font Mono">int Global[32];</font></p>
<p><font face="Hack Nerd Font Mono"><span class="cp">#pragma omp
allocate(Global) allocator(omp_pteam_mem_alloc)</span></font></p>
<p><font face="Hack Nerd Font Mono"><br>
</font></p>
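<p><font face="Hack Nerd Font Mono">A slightly fuller sketch of the same idea, in case it helps. It is an illustration under assumptions, not a tested recipe for a particular Clang release: the helper sum_first_n and the macro TEAM_BUF_LEN are hypothetical names, the declare target wrapper is assumed to be needed so that Global is visible on the device, and the version guard assumes the compiler reports OpenMP 5.0 (201811) or later when it supports the allocate directive. Whether Global really ends up in GPU shared memory depends on the compiler and the target. If OpenMP is disabled, the pragmas are ignored and the code runs serially.</font></p>

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h> /* declares omp_pteam_mem_alloc when OpenMP is enabled */
#endif

#define TEAM_BUF_LEN 32 /* hypothetical name; same size as Global[32] above */

/* Assumption: the variable has to be visible on the device. */
#pragma omp declare target
int Global[TEAM_BUF_LEN];
/* The allocate directive is an OpenMP 5.0 feature; skip it otherwise. */
#if defined(_OPENMP) && _OPENMP >= 201811
#pragma omp allocate(Global) allocator(omp_pteam_mem_alloc)
#endif
#pragma omp end declare target

/* Hypothetical helper: fills Global[0..n) with 1..n inside a single team
   and returns the sum of those entries. */
int sum_first_n(int n) {
  assert(n >= 0 && n <= TEAM_BUF_LEN);
  int result = 0;
  #pragma omp target teams num_teams(1) map(tofrom: result)
  {
    #pragma omp parallel
    {
      /* Each thread writes its share of the team-local buffer. */
      #pragma omp for
      for (int i = 0; i < n; ++i)
        Global[i] = i + 1;
      /* The implicit barrier above makes all writes visible here. */
      #pragma omp for reduction(+ : result)
      for (int i = 0; i < n; ++i)
        result += Global[i];
    }
  }
  return result;
}
```

<p><font face="Hack Nerd Font Mono">Calling sum_first_n(32) should return 528 regardless of whether the region runs on the device or falls back to the host; only the placement of Global differs.</font></p>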
<p><font face="Hack Nerd Font Mono">With regard to the feedback, I don't
think there is anything in place. You could perhaps use nvprof when
you run the program. However, I agree we should have a flag that
provides better information.</font></p>
<p><font face="Hack Nerd Font Mono"><br>
</font></p>
<p><font face="Hack Nerd Font Mono">I hope this helps.<br>
</font></p>
<p><font face="Hack Nerd Font Mono"><br>
</font></p>
<p><font face="Hack Nerd Font Mono">Cheers,</font></p>
<p><font face="Hack Nerd Font Mono"> Johannes</font></p>
<p><font face="Hack Nerd Font Mono"><br>
</font></p>
<p><font face="Hack Nerd Font Mono"><br>
</font></p>
<p><font face="Hack Nerd Font Mono"><br>
</font></p>
<div class="moz-cite-prefix">On 4/18/20 5:37 AM, ichbinwu via
Openmp-dev wrote:<br>
</div>
<blockquote type="cite"
cite="mid:190015da-1d84-e303-f29e-57125e3c36cb@gmail.com">Hello
everybody,
<br>
<br>
I have a question about GPU shared memory in the OpenMP
implementation in LLVM.
<br>
<br>
In the paper by Grinberg, Bertolli, and Haque (Hands on with
OpenMP 4.5 and Unified Memory: Developing Applications for IBM's
Hybrid CPU + GPU systems (Part II), IWOMP 2017) I found "3.
Clang's Extension for OpenMP 4.5 for device On-chip Memory
Allocation" and learnt that the GPU shared memory can be used in a
tricky manner with OpenMP directives. In order to find the
compiler limit for this static memory allocation I looked at the
source code files under `openmp`. It seems the relevant files are:
<br>
<br>
1. openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.h
<br>
* commit: 197b7b24
<br>
* line: DS_Slot_Size = 256,
<br>
<br>
2. openmp/libomptarget/deviceRTLs/common/omptarget.h
<br>
* commit: d0b9ed5c
<br>
* line: char Data[DS_Slot_Size];
<br>
<br>
My questions are:
<br>
<br>
1. Is the hard-coded limit for GPU shared memory 256 bytes or (256
* 4) bytes? I ask because of this comment in
`openmp/libomptarget/deviceRTLs/common/omptarget.h`:
<br>
<br>
// Additional master slot type which is initialized with the
default master slot
<br>
// size of 4 bytes.
<br>
<br>
2. Could we enlarge this limit to, e.g., 512 bytes or even 1024
bytes? Looking at the hardware specification of green GPUs, if we
assume the shared memory per multiprocessor is 48 KB and at most
32 thread blocks (or contention groups) reside on one
multiprocessor, this limit could be as large as 48 KB / 32 = 1536
bytes, couldn't it?
<br>
<br>
3. How can we verify that a static memory allocation is placed in
GPU shared memory (not in global memory) when an OpenMP source
file is compiled by Clang/LLVM? My current approach is to look at
the generated assembly code (`-S`), which is not really
convenient. It would be good if the compiler could print a
message or give a short report during compilation.
<br>
<br>
Thank you in advance!
<br>
<br>
Best wishes!
<br>
<br>
Xin
<br>
</blockquote>
</body>
</html>