[PATCH] D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory

Artem Belevich via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 5 10:09:03 PST 2018


tra added a comment.

In https://reviews.llvm.org/D38978#968222, @gtbercea wrote:

> > I'm still curious to hear what do you plan to do when your depot use grows beyond certain limit. At the very least there's the physical limit on shared memory size. Shared memory use also affects how many threads can be launched which has large impact on performance. IMO having some sort of user-controllable threshold would be very desirable.
>
> When shared memory isn't enough to hold the shared depot, global memory will be used instead. That is a scheme which will be covered by a future patch.


Good luck with that. IMO if your kernel requires all shared memory available per multiprocessor, you are almost guaranteed suboptimal performance because you will not have enough threads running -- neither for peak compute, nor to hide global memory access latency. My bet that you will eventually end up limiting shared memory use to a fairly small fraction of it.

Given that impact is limited to explicitly annotated functions only,  this lack of tune-ability is OK with me for now. I'd add a TODO item somewhere to describe that tuning specific limits is WIP.



================
Comment at: test/CodeGen/NVPTX/insert-shared-depot.ll:29
+define void @kernel() #0 {
+; LABEL: @linsert_shared_depot
+  %A = alloca i32, align 4
----------------
gtbercea wrote:
> tra wrote:
> > 'LABEL' is not a check-prefix and `@linsert_shared_depot` is not this function's name, so I'm puzzled what this line is supposed to do.  Did you intend `<prefix>-LABEL: @kernel` ?
> > 
> > This appears in all the test cases in the patch.
> This is modeled after the lower-alloca.ll test which has a similar label. The label is always equal to the name of the test file. In this particular case there is a typo, it should be "insert_shared_depot" not "linsert_shared_depot"
> This is modeled after the lower-alloca.ll test which has a similar label. 
lower-alloca.ll indeed has the same problem.

> The label is always equal to the name of the test file. 

I don't think FileCheck has such a feature.  Nor do I see anything matching this description in the FileCheck documentation. Nor does it work. See below.

> In this particular case there is a typo, it should be "insert_shared_depot" not "linsert_shared_depot" 

The line does not check *anything* right now.  In this test FileCheck only pays attention to lines that have CHECK or PTX64/PTX32.  This line contains neither and is ignored. You can do an experiment  -- replace the line with `; LABEL: this should never match` and run the test. 

I've tried that on lower-alloca.ll and the test, as expected, passes regardless of the nonsense I put after the `LABEL:`.



https://reviews.llvm.org/D38978





More information about the llvm-commits mailing list