[libc-commits] [PATCH] D159276: [libc][gpu] Thread divergence fix on volta, WIP

Jon Chesterfield via Phabricator via libc-commits libc-commits at lists.llvm.org
Thu Aug 31 05:49:37 PDT 2023


JonChesterfield created this revision.
JonChesterfield added reviewers: jhuber6, jdoerfert.
Herald added subscribers: libc-commits, tpr.
Herald added projects: libc-project, All.
JonChesterfield requested review of this revision.

WIP because broadcast_value is incorrect on volta (with a comment?),
need to fix that. Posting this diff to describe the bug.

The inbox/outbox loads are performed by the current warp, not a single thread.

The outbox load indicates whether a port has been successfully opened. If some
lanes in the warp think it has and others think the port open failed, as the
warp happened to be diverged when the load occurred, all the subsequent control
flow will be incorrect.
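
To make the failure mode concrete, here is a minimal host-side sketch in
plain C++. It is not the libc RPC code and does not run on a GPU. It only
models a warp whose lanes perform the same relaxed load at two different
times. The names diverged_load, warp_agrees, mailbox and early_mask are
hypothetical, and the "concurrent update" is simulated by flipping one bit
between the two groups of loads.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

constexpr int kWarpSize = 32;
using LaneValues = std::array<uint32_t, kWarpSize>;

// Hypothetical model of a diverged warp: lanes in `early_mask` perform the
// load first, the mailbox word then changes, and the remaining lanes
// perform the "same" load afterwards.
LaneValues diverged_load(std::atomic<uint32_t> &mailbox, uint32_t early_mask) {
  LaneValues seen{};
  for (int lane = 0; lane < kWarpSize; ++lane)
    if ((early_mask >> lane) & 1u)
      seen[lane] = mailbox.load(std::memory_order_relaxed);

  // Stand-in for an update that lands between the two groups of loads.
  mailbox.fetch_xor(1u, std::memory_order_relaxed);

  for (int lane = 0; lane < kWarpSize; ++lane)
    if (((early_mask >> lane) & 1u) == 0)
      seen[lane] = mailbox.load(std::memory_order_relaxed);
  return seen;
}

// True when every lane observed the same value.
bool warp_agrees(const LaneValues &seen) {
  for (uint32_t value : seen)
    if (value != seen[0])
      return false;
  return true;
}
```

With early_mask = 0x0000FFFF the lower half of the warp sees the old value
and the upper half sees the new one, so any branch on the result splits the
warp. With early_mask = 0xFFFFFFFF (a converged warp) every lane agrees.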

The inbox load indicates whether the machine on the other side of the RPC channel
has progressed. If lanes in the warp have different ideas about that, some will
try to progress their state transition while others won't. As far as the RPC layer
is concerned this is a performance problem and not a correctness one - none of the lanes
can start the transition early, only miss it and start late - but in practice the calls
layered on top of RPC do not have the interface required to detect this event and retry
the load on the stalled lanes, so the calls layered on top will be broken.

None of this is broken on amdgpu, but it's likely that the readfirstlane will have
beneficial performance properties there. Possibly significant enough that it's
worth landing this ahead of fixing gpu::broadcast_value on volta.
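
For readers unfamiliar with the intended semantics, the following is a
host-side model of "broadcast the first active lane's value", which is what
a readfirstlane-style operation provides. It is a sketch under assumptions,
not the implementation of gpu::broadcast_value on either target. The names
first_active_lane and broadcast_first are hypothetical, and the active mask
is passed in explicitly rather than queried from hardware.

```cpp
#include <array>
#include <cstdint>

constexpr int kWarpSize = 32;

// Index of the lowest set bit in `active_mask`, or -1 if no lane is active.
int first_active_lane(uint32_t active_mask) {
  for (int lane = 0; lane < kWarpSize; ++lane)
    if ((active_mask >> lane) & 1u)
      return lane;
  return -1;
}

// Every active lane receives the value held by the first active lane, so
// all of them branch the same way on the result. Returns `fallback` when
// the mask is empty (an assumption of this model; real hardware never
// executes the instruction with zero active lanes).
uint32_t broadcast_first(uint32_t active_mask,
                         const std::array<uint32_t, kWarpSize> &per_lane,
                         uint32_t fallback = 0) {
  int lane = first_active_lane(active_mask);
  return lane < 0 ? fallback : per_lane[lane];
}
```

The point of wrapping the loads this way is that the result only depends on
one lane's view of memory, so lanes that are active together cannot disagree
about it.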

Essentially volta wasn't adequately considered when writing this part of the protocol.
It's a bug present in the initial prototype and propagated thus far, because none of
the test cases push volta into a warp diverged state in the middle of the RPC sequence.

We should have some test cases for volta where port_open and equivalent are called
from diverged warps.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D159276

Files:
  libc/src/__support/RPC/rpc.h


Index: libc/src/__support/RPC/rpc.h
===================================================================
--- libc/src/__support/RPC/rpc.h
+++ libc/src/__support/RPC/rpc.h
@@ -117,12 +117,12 @@
 
   /// Retrieve the inbox state from memory shared between processes.
   LIBC_INLINE uint32_t load_inbox(uint32_t index) {
-    return inbox[index].load(cpp::MemoryOrder::RELAXED);
+    return gpu::broadcast_value(inbox[index].load(cpp::MemoryOrder::RELAXED));
   }
 
   /// Retrieve the outbox state from memory shared between processes.
   LIBC_INLINE uint32_t load_outbox(uint32_t index) {
-    return outbox[index].load(cpp::MemoryOrder::RELAXED);
+    return gpu::broadcast_value(outbox[index].load(cpp::MemoryOrder::RELAXED));
   }
 
   /// Signal to the other process that this one is finished with the buffer.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D159276.555002.patch
Type: text/x-patch
Size: 812 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libc-commits/attachments/20230831/29e3edd0/attachment.bin>

