[libc-commits] [PATCH] D159276: [libc][gpu] Thread divergence fix on volta, WIP

Jon Chesterfield via Phabricator via libc-commits libc-commits at lists.llvm.org
Thu Aug 31 06:13:31 PDT 2023


JonChesterfield added a comment.

In D159276#4631066 <https://reviews.llvm.org/D159276#4631066>, @jhuber6 wrote:

> I think the best we can do is just maintain the divergence that we know of when we open the RPC interface. That is, when we broadcast the value we should write it to the mask we know of, since that's always a subset of the "true mask" right?

The path on Volta is heavily constrained by the architecture. We cannot know the 'true' lane mask because the architecture doesn't expose it. The general workarounds for handling warp-level divergence on Volta are:

1/ conjure the currently active lanes once and hope that's close enough
2/ pass ~0 into all the intrinsics and hope that's OK
3/ require the application to pass the lanemask around

Fortunately, the RPC layer is structured so that a single RPC call, as far as the GPU logic is concerned, can be handled by multiple calls on the server. We should probably document that design constraint somewhere if it isn't in any of the papers. That means a wholly converged warp makes a single trip to the server, and a worst-case diverged warp might make 32 trips to the server, but the application logic is unchanged and the only effect is a performance slowdown, which we sometimes observe in testing.
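To make the trip count concrete, here is a small host-side model of it. This is only a sketch: `lane_masks_seen` and `server_trips` are hypothetical names, not part of the libc RPC code, and the `branch_id` vector is an assumed stand-in for "which lanes happen to be converged when they reach the RPC call".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical model: branch_id[i] names the control-flow path lane i is on.
// Lanes sharing an id are assumed to arrive at the RPC call together and so
// observe the same lane mask.
std::set<uint32_t> lane_masks_seen(const std::vector<int>& branch_id) {
  assert(branch_id.size() <= 32 && "a 32-bit mask covers at most 32 lanes");
  std::set<uint32_t> masks;
  for (std::size_t i = 0; i < branch_id.size(); ++i) {
    uint32_t mask = 0;
    for (std::size_t j = 0; j < branch_id.size(); ++j)
      if (branch_id[j] == branch_id[i])
        mask |= uint32_t{1} << j;
    masks.insert(mask);
  }
  return masks;
}

// In this model each distinct mask costs one trip to the server.
std::size_t server_trips(const std::vector<int>& branch_id) {
  return lane_masks_seen(branch_id).size();
}
```

With all 32 lanes on the same path this gives one trip; with every lane on its own path it gives 32, matching the best and worst cases described above.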

That gives us workaround option 4/, transparently deal with a partial lane mask in the implementation. That mostly makes the port manipulation logic harder to write; try_lock and unlock are where this surfaces in our implementation. It also means it's _really_ important that we consistently use the same lane_mask value we opened a port with, which is hopefully enforced by the type system and our tendency to store it in Process.
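One way to picture the "same lane_mask throughout" constraint is a handle that captures the mask once when the port is opened. The following is a host-side sketch under assumptions: `PortLock` and `OpenPort` are hypothetical types invented for illustration and do not reflect the actual Process/Port definitions in the patch.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the lock bit guarding one port.
struct PortLock {
  std::atomic<uint32_t> held{0};
};

// Hypothetical handle: the lane mask is captured once at open and is const,
// so later operations cannot accidentally use a different mask.
class OpenPort {
public:
  // Analogue of try_lock: fails if another group of lanes holds the port.
  static std::optional<OpenPort> try_open(PortLock& lock, uint32_t lane_mask) {
    uint32_t expected = 0;
    if (!lock.held.compare_exchange_strong(expected, 1,
                                           std::memory_order_acquire))
      return std::nullopt;
    return OpenPort(lock, lane_mask);
  }

  uint32_t lane_mask() const { return mask_; }

  // Analogue of unlock: releases the port on behalf of the lanes that opened it.
  void close() {
    assert(lock_ != nullptr && "port already closed");
    lock_->held.store(0, std::memory_order_release);
    lock_ = nullptr;
  }

private:
  OpenPort(PortLock& lock, uint32_t lane_mask)
      : lock_(&lock), mask_(lane_mask) {}

  PortLock* lock_;
  const uint32_t mask_;
};
```

In this model, two halves of a diverged warp contending for the same port behave like two callers: the second `try_open` fails until the first group calls `close`, so it makes its own later trip with its own mask.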

This particular bug is that I missed the race condition on loads over PCIe from a diverged warp.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D159276/new/

https://reviews.llvm.org/D159276


