[Openmp-commits] [PATCH] D103600: [AMDGPU][Libomptarget] Rework logic for locating kernarg pools
Jon Chesterfield via Phabricator via Openmp-commits
openmp-commits at lists.llvm.org
Fri Jun 4 02:12:21 PDT 2021
JonChesterfield added a comment.
Took a while to make sense of this. I think you've preserved the existing strategy of building a collection of kernarg memory pools, while moving the global into a class member and adding some more tests on whether a pool should be added to the collection.
Some tests are done when adding a memory pool to the collection, and some are done when considering whether to take a pool out of the collection. Also, we only ever use a single memory pool. So I think things will be simpler if we go a step further and change the logic to 'try to find exactly one memory_pool that we will use for kernarg on this gpu', and put all the sanity checking (allocatable, size>0, fine grain, kernarg ok etc) behind that one call, which returns a single memory pool (or pair {pool, rc}) instead of a collection.
The proc->addMemory(new_mem); interface survives this change, so I'd guess some of the noise is from pulling kernarg allocation out of the ATMI infra while leaving GPU allocation within it.
================
Comment at: openmp/libomptarget/plugins/amdgpu/src/rtl.cpp:75
int print_kernel_trace;
+extern ATLMachine g_atl_machine;
----------------
not keen on extern declarations in source (I know there's a print kernel trace one there already...), can we move this global definition into rtl.cpp, or failing that put the extern declaration in some header shared with the file that does define it?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D103600/new/
https://reviews.llvm.org/D103600
More information about the Openmp-commits
mailing list