[all-commits] [llvm/llvm-project] 29d3da: [libc] Fix the `send_n` and `recv_n` utilities und...

Joseph Huber via All-commits all-commits at lists.llvm.org
Tue May 23 09:00:03 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 29d3da3b86cc4e8d5025602e0d7a290609f44f45
      https://github.com/llvm/llvm-project/commit/29d3da3b86cc4e8d5025602e0d7a290609f44f45
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2023-05-23 (Tue, 23 May 2023)

  Changed paths:
    M libc/src/__support/GPU/amdgpu/utils.h
    M libc/src/__support/GPU/nvptx/utils.h
    M libc/src/__support/RPC/rpc.h
    M libc/test/integration/startup/gpu/CMakeLists.txt
    M libc/test/integration/startup/gpu/rpc_stream_test.cpp

  Log Message:
  -----------
  [libc] Fix the `send_n` and `recv_n` utilities under divergent lanes

We provide the `send_n` and `recv_n` utilities as a generic way to
stream data between both sides of the process. This was previously
tested and performed as expected when using a string of constant size.
However, when the size was allowed to diverge between the threads in the
warp or wavefront, this could deadlock. This did not occur on NVPTX
because of the explicit warp sync. On AMD, however, one of the work items
in the wavefront could continue executing and hit the next `recv` call
before the other threads, violating the RPC invariants and causing a
deadlock.

This patch replaces the for loop with a thread ballot. This will cause
every thread in the warp or wavefront to continue executing the loop
until all of them can exit. This acts as a more explicit wavefront sync.
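
The ballot-driven loop described above can be sketched on the host as
follows. This is only an illustrative model, not the code from the patch:
the function name `run_lockstep`, the chunk size, and the simulation of
lanes with plain vectors are all assumptions made for the example. It
shows that when the loop condition is a vote across all lanes, every lane
runs the same number of iterations even when the sizes diverge.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side model of one warp/wavefront (at most 64 lanes).
// Lane i wants to transfer sizes[i] bytes in chunks of `chunk` bytes.
// Returns how many loop iterations each lane executed.
std::vector<uint32_t> run_lockstep(const std::vector<uint64_t> &sizes,
                                   uint64_t chunk) {
  assert(chunk > 0 && sizes.size() <= 64);
  std::vector<uint64_t> sent(sizes.size(), 0);
  std::vector<uint32_t> iterations(sizes.size(), 0);

  for (;;) {
    // "Ballot": one bit per lane that still has data left to transfer.
    uint64_t ballot = 0;
    for (std::size_t lane = 0; lane < sizes.size(); ++lane)
      if (sent[lane] < sizes[lane])
        ballot |= uint64_t(1) << lane;

    // No lane leaves the loop until every lane has finished.
    if (ballot == 0)
      break;

    // Every lane takes part in this iteration; lanes that are already
    // done simply transfer zero bytes.
    for (std::size_t lane = 0; lane < sizes.size(); ++lane) {
      ++iterations[lane];
      uint64_t remaining = sizes[lane] - sent[lane];
      sent[lane] += remaining < chunk ? remaining : chunk;
    }
  }
  return iterations;
}
```

With sizes of 10, 200, 0 and 64 bytes and a 64-byte chunk, all four lanes
in this model execute the same number of iterations, set by the largest
size, so no lane can run ahead to a later call on its own.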

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D150992


  Commit: e826762a0826c11dc62696e46068c61c57a00aa9
      https://github.com/llvm/llvm-project/commit/e826762a0826c11dc62696e46068c61c57a00aa9
  Author: Joseph Huber <jhuber6 at vols.utk.edu>
  Date:   2023-05-23 (Tue, 23 May 2023)

  Changed paths:
    M libc/src/__support/OSUtil/gpu/io.cpp
    M libc/src/__support/RPC/rpc.h
    M libc/src/__support/RPC/rpc_util.h
    M libc/utils/gpu/loader/Server.h

  Log Message:
  -----------
  [libc] More efficiently send bytes via `send_n` and `recv_n`

Currently we have the `send_n` and `recv_n` routines to stream data,
such as a string to print, to the other side. The first operation is to
send the size so the other side knows the number of bytes to receive.
However, this wasted 56 bytes that could have been sent in that packet.
As a result, small values, such as the arguments to a function called on
the host, needed an extra send. This patch sends the first 56 bytes in
the first packet and continues with further packets only if necessary.
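
A minimal sketch of this packing scheme, under stated assumptions: the
64-byte packet with an 8-byte size header is inferred from the 56-byte
figure above, and the names `Packet`, `pack`, and `unpack` are
hypothetical helpers for illustration, not the interfaces in `rpc.h`.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout: 64-byte packets; the first one carries an 8-byte size
// followed by up to 56 bytes of payload.
constexpr std::size_t kPacketSize = 64;
constexpr std::size_t kHeaderSize = sizeof(uint64_t);
constexpr std::size_t kFirstPayload = kPacketSize - kHeaderSize; // 56

using Packet = std::array<unsigned char, kPacketSize>;

// Split `bytes` into packets, placing the size and the first 56 bytes
// together in the first packet.
std::vector<Packet> pack(const std::vector<unsigned char> &bytes) {
  std::vector<Packet> packets;
  Packet first{};
  uint64_t size = bytes.size();
  std::memcpy(first.data(), &size, kHeaderSize);
  std::size_t n = std::min(bytes.size(), kFirstPayload);
  if (n)
    std::memcpy(first.data() + kHeaderSize, bytes.data(), n);
  packets.push_back(first);

  // Remaining bytes, if any, go out in full-width packets.
  for (std::size_t off = n; off < bytes.size(); off += kPacketSize) {
    Packet p{};
    std::size_t len = std::min(bytes.size() - off, kPacketSize);
    std::memcpy(p.data(), bytes.data() + off, len);
    packets.push_back(p);
  }
  return packets;
}

// Reassemble the byte stream from packets produced by pack().
std::vector<unsigned char> unpack(const std::vector<Packet> &packets) {
  assert(!packets.empty());
  uint64_t size = 0;
  std::memcpy(&size, packets[0].data(), kHeaderSize);
  std::vector<unsigned char> bytes(size);
  std::size_t n = std::min<std::size_t>(size, kFirstPayload);
  if (n)
    std::memcpy(bytes.data(), packets[0].data() + kHeaderSize, n);
  std::size_t off = n;
  for (std::size_t i = 1; off < size; ++i, off += kPacketSize) {
    std::size_t len = std::min<std::size_t>(size - off, kPacketSize);
    std::memcpy(bytes.data() + off, packets[i].data(), len);
  }
  return bytes;
}
```

In this model a payload of up to 56 bytes fits in a single packet, whereas
sending the size alone first would always require at least two packets for
any non-empty payload.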

Depends on D150992

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D151041


Compare: https://github.com/llvm/llvm-project/compare/75eb3bd1a43d...e826762a0826

