[Openmp-commits] [PATCH] D132660: [openmp][amdgpu] Implement target_alloc_host as fine grain HSA memory

Thu Aug 25 07:23:54 PDT 2022

jhuber6 added a comment.

In D132660#3748940 <https://reviews.llvm.org/D132660#3748940>, @JonChesterfield wrote:

> The semantics of 'fine grain host memory' are read/write from host or from a GPU. This uses a memory pool that will work from any GPU, we could instead use the (currently dead? weird) DeviceFineGrainedMemoryPools field if the localised-to-gpu behaviour is preferred, that's probably more efficient.
>
> There are some other memory interfaces, e.g. llvm_omp_target_alloc_shared, but I'm having a rough time working out what the behaviour of each is supposed to be. This patch is therefore based on nvptx mapping TARGET_ALLOC_HOST onto cuMemAllocHost. There's also ALLOC_SHARED, which maps to cuMemAllocManaged, which doesn't obviously have anything to do with cuda shared memory. I can't tell what the difference between cuMemAllocHost and cuMemAllocManaged is so maybe 'fine grain hsa' will work for 'TARGET_ALLOC_SHARED' as well.

Currently,

- `llvm_target_alloc`: default allocation strategy, device memory.
- `llvm_target_alloc_host`: allocates pinned memory on the host.
- `llvm_target_alloc_device`: allocates device memory.
- `llvm_target_alloc_shared`: allocates memory that can be shared between the host and device, e.g. CUDA managed memory.
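
To make the distinction concrete, here is a rough sketch of how the four kinds line up with the CUDA driver calls mentioned in this thread. The `AllocKind` enum and both helper functions are hypothetical names made up for illustration, not the plugin's actual declarations. The host and shared mappings come from the quoted comment; the device/default mapping to `cuMemAlloc` is an assumption on my part.

```cpp
#include <string_view>

// Hypothetical mirror of the allocation kinds listed above; these names are
// illustrative only and are not the declarations used by libomptarget.
enum class AllocKind { Default, Device, Host, Shared };

// Name of the CUDA driver call each kind is described as mapping to.
// Host and Shared follow the thread; Device/Default -> cuMemAlloc is assumed.
constexpr std::string_view backendCallFor(AllocKind Kind) {
  switch (Kind) {
  case AllocKind::Default: // default strategy is device memory
  case AllocKind::Device:
    return "cuMemAlloc";
  case AllocKind::Host: // pinned memory on the host
    return "cuMemAllocHost";
  case AllocKind::Shared: // memory shared between host and device
    return "cuMemAllocManaged";
  }
  return "unknown";
}

// Whether the host may dereference the returned pointer directly,
// going by the descriptions in the list above.
constexpr bool hostAccessible(AllocKind Kind) {
  return Kind == AllocKind::Host || Kind == AllocKind::Shared;
}
```

Under this reading, an AMDGPU implementation of the host kind only needs to hand back memory that both the host and the GPU can read and write, which is what the fine grain HSA pool in this patch is meant to provide.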

There are some tests showing existing usage in `test/api`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132660/new/

https://reviews.llvm.org/D132660