[libc-commits] [PATCH] D149598: [libc] Support concurrent RPC port access on the GPU

Jon Chesterfield via Phabricator via libc-commits libc-commits at lists.llvm.org
Wed May 3 10:25:33 PDT 2023


JonChesterfield added inline comments.


================
Comment at: libc/src/__support/RPC/rpc.h:353
+  // Perform a naive linear scan for a port that can be opened to send data.
+  for (uint64_t index = 0; index < port_size; ++index) {
+    // Attempt to acquire the lock on this index.
----------------
Your data representation induces overhead here.

inbox.load hits the bus to retrieve one bit of information. If it fails, the next iteration of the loop starts over from scratch.

If you change to packed bits, the first atomic load gets you the state of ~32 mailboxes in one tick. If your current index doesn't work out, you now have enough information to pick the next credible candidate without hitting the bus at all.
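
A minimal sketch of the packed-bits idea, not the code in this patch. It assumes 32 ports whose lock and inbox states are each stored as one bit in a single 32-bit word. The names `try_open_packed`, `lowest_set_bit` and `kNoPort` are hypothetical, and "lock bit clear and inbox bit clear" is only a stand-in for whatever condition the real protocol uses to decide that a port can send.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical sentinel: no port could be opened.
constexpr uint32_t kNoPort = 32;

// Index of the lowest set bit. The caller guarantees bits != 0.
inline uint32_t lowest_set_bit(uint32_t bits) {
  uint32_t index = 0;
  while ((bits & 1u) == 0) {
    bits >>= 1;
    ++index;
  }
  return index;
}

// Assumed layout: bit i of `lock` is set while port i is held, and bit i of
// `inbox` is set while port i cannot be used for sending.
inline uint32_t try_open_packed(std::atomic<uint32_t> &lock,
                                const std::atomic<uint32_t> &inbox) {
  // One load per word describes the state of all 32 ports.
  uint32_t locked = lock.load(std::memory_order_relaxed);
  uint32_t in = inbox.load(std::memory_order_relaxed);
  uint32_t candidates = ~locked & ~in;

  while (candidates != 0) {
    uint32_t index = lowest_set_bit(candidates);
    uint32_t bit = 1u << index;
    // fetch_or returns the previous word; we own the port if the bit was clear.
    uint32_t before = lock.fetch_or(bit, std::memory_order_acquire);
    if ((before & bit) == 0)
      return index;
    // Lost the race. `before` is a fresh view of every lock bit, so drop all
    // ports that are now held and pick the next candidate without reloading.
    candidates &= ~before;
  }
  return kNoPort;
}
```

A failed attempt costs one atomic read-modify-write and leaves the caller with an updated candidate mask, rather than a fresh load per index as in the linear scan.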

Also, you're loading from the outbox even though its value is already known. Maybe the latency of that redundant load is hidden by the latency of the inbox load.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D149598/new/

https://reviews.llvm.org/D149598


