[Openmp-commits] [PATCH] D103600: [AMDGPU][Libomptarget] Rework logic for locating kernarg pools

Jon Chesterfield via Phabricator via Openmp-commits openmp-commits at lists.llvm.org
Fri Jun 4 02:12:21 PDT 2021


JonChesterfield added a comment.

Took a while to make sense of this. I think you've preserved the existing strategy of building a collection of kernarg memory pools, while moving the global into a class member and adding some more tests on whether a pool should be added to the collection.

Some tests are done when adding a memory pool to the collection, and some are done when considering whether to take a pool out of the collection. Also, we only ever use a single memory pool. So I think things will be simpler if we go a step further and change the logic to 'try to find exactly one memory_pool that we will use for kernarg on this gpu', and put all the sanity checking (allocatable, size>0, fine grain, kernarg ok etc) behind that one call, which returns a single memory pool (or pair {pool, rc}) instead of a collection.

The proc->addMemory(new_mem); interface survives this change, so I'd guess some of the noise is from pulling kernarg allocation out of the ATMI infra while leaving GPU allocation within it.



================
Comment at: openmp/libomptarget/plugins/amdgpu/src/rtl.cpp:75
 int print_kernel_trace;
+extern ATLMachine g_atl_machine;
 
----------------
not keen on extern declarations in source (I know there's a print kernel trace one there already...), can we move this global definition into rtl.cpp, or failing that put the extern declaration in some header shared with the file that does define it?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103600/new/

https://reviews.llvm.org/D103600



More information about the Openmp-commits mailing list