[all-commits] [llvm/llvm-project] fb3297: [Libomptarget] Fix RPC-based malloc on NVPTX (#7...

Joseph Huber via All-commits all-commits at lists.llvm.org
Tue Jan 2 14:54:07 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: fb32977ac768f27890af28308a6968c30af2aa3e
      https://github.com/llvm/llvm-project/commit/fb32977ac768f27890af28308a6968c30af2aa3e
  Author: Joseph Huber <huberjn at outlook.com>
  Date:   2024-01-02 (Tue, 02 Jan 2024)

  Changed paths:
    M openmp/libomptarget/include/omptarget.h
    M openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp
    M openmp/libomptarget/plugins-nextgen/common/src/RPC.cpp
    M openmp/libomptarget/plugins-nextgen/cuda/dynamic_cuda/cuda.cpp
    M openmp/libomptarget/plugins-nextgen/cuda/dynamic_cuda/cuda.h
    M openmp/libomptarget/plugins-nextgen/cuda/src/rtl.cpp
    M openmp/libomptarget/plugins-nextgen/generic-elf-64bit/src/rtl.cpp
    M openmp/libomptarget/test/libc/malloc.c

  Log Message:
  -----------
   [Libomptarget] Fix RPC-based malloc on NVPTX (#72440)

Summary:
The device allocator on NVPTX architectures is enqueued to a stream that
the kernel is potentially executing on. This can lead to deadlocks as
the kernel will not proceed until the allocation is complete and the
allocation will not proceed until the kernel is complete. CUDA 11.2
introduced async allocations that we can manually place on separate
streams to combat this. This patch adds a new allocation type that's
guaranteed to be non-blocking, so it will actually make progress. Only
Nvidia needs to care about this, as the other targets do not block in
this way by default.

I had originally tried to make the `alloc` and `free` methods take a
`__tgt_async_info`. However, I observed that with the large volume of
streams being created by a parallel test it quickly locked up the system,
presumably because too many streams were being created. This implementation
now just creates a new stream and immediately destroys it. This
obviously isn't very fast, but it at least gets the cases to stop
deadlocking for now.
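
The deadlock described in the summary, and the stream-per-allocation
workaround, can be illustrated with a small host-only toy model. This is a
sketch under stated assumptions, not the code from the patch: `ToyStream` and
`kernelGetsMemory` are hypothetical names, a "stream" is modelled as a single
worker thread that runs tasks in FIFO order, and a timeout stands in for what
would be a permanent hang on a real device.

```cpp
#include <chrono>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for a device stream: one worker thread that runs
// enqueued tasks strictly in order. Not part of libomptarget or CUDA.
class ToyStream {
public:
  ToyStream() : Worker([this] { run(); }) {}
  ~ToyStream() {
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      Done = true;
    }
    CV.notify_one();
    Worker.join(); // Drains any remaining tasks before returning.
  }
  void enqueue(std::function<void()> Task) {
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      Tasks.push(std::move(Task));
    }
    CV.notify_one();
  }

private:
  void run() {
    for (;;) {
      std::function<void()> Task;
      {
        std::unique_lock<std::mutex> Lock(Mutex);
        CV.wait(Lock, [this] { return Done || !Tasks.empty(); });
        if (Tasks.empty())
          return; // Only reached once Done is set and the queue is empty.
        Task = std::move(Tasks.front());
        Tasks.pop();
      }
      Task();
    }
  }
  std::mutex Mutex;
  std::condition_variable CV;
  std::queue<std::function<void()>> Tasks;
  bool Done = false;
  std::thread Worker; // Declared last so the other members exist first.
};

// Returns true if the "kernel" received its allocation before the timeout.
bool kernelGetsMemory(bool UseSeparateStream) {
  std::promise<void> Reply;   // Fulfilled by the "allocation".
  std::future<void> Pending = Reply.get_future();
  std::promise<bool> Result;  // Reports whether the kernel made progress.
  std::future<bool> Outcome = Result.get_future();

  ToyStream KernelStream;
  // "Kernel": cannot proceed until the allocation completes. The timeout
  // stands in for a hang that would otherwise never resolve.
  KernelStream.enqueue([&] {
    bool Ok = Pending.wait_for(std::chrono::milliseconds(200)) ==
              std::future_status::ready;
    Result.set_value(Ok);
  });

  auto Alloc = [&Reply] { Reply.set_value(); };
  if (UseSeparateStream) {
    // A fresh stream is created for the allocation and destroyed right after.
    ToyStream AllocStream;
    AllocStream.enqueue(Alloc);
  } else {
    // Queued behind the kernel, so it cannot run until the kernel finishes.
    KernelStream.enqueue(Alloc);
  }
  return Outcome.get();
}
```

In this model `kernelGetsMemory(false)` should time out, because the
allocation sits behind the kernel on the same in-order queue, while
`kernelGetsMemory(true)` should succeed. Creating and tearing down a stream
for every allocation is slow here as well, which matches the trade-off the
message describes.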



