[libc-commits] [PATCH] D151282: [libc] Add initial support for 'puts' and 'fputs' to the GPU

Fri May 26 12:22:17 PDT 2023

jhuber6 added a comment.

In D151282#4376876 <https://reviews.llvm.org/D151282#4376876>, @tra wrote:

>> In general locks on the GPU cannot be safely implemented due to lack of forward progress guarantees.
>
> I think sm_70+ does provide forward progress guarantees. This was one of the major architectural changes introduced on Volta.
> https://docs.nvidia.com/cuda/volta-tuning-guide/index.html#independent-thread-scheduling

So, I believe what Volta added was forward progress guarantees within a warp: a warp cannot deadlock on itself because its threads can make progress independently. The OpenCL model is extremely pessimistic and offers no forward progress guarantees at all. As far as I understand, the underlying hardware is a different story, but I haven't heard of any vendor **guaranteeing** forward progress, or a fair scheduler, across the GPU as a whole. That said, I think Nvidia is generally on firmer ground here than AMD, so it's less likely to be an issue there. The typical way to work around this is to provide enough locks in parallel that an active warp / wavefront never gets blocked on any other active one. It's very wasteful, but it's "safe", at least for the AMD case.
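
To make the "enough locks in parallel" idea concrete, here is a rough host-side C++ sketch. It is not the code in this patch: `LockPool`, `kMaxResidentWarps`, and the starting `hint` are all hypothetical names, and the bound of 64 is an assumed value for how many warps / wavefronts can be resident at once. The point is only that each caller scans for a free slot with a try-lock and never spins on a lock held by another active warp.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Assumption: an upper bound on the number of warps / wavefronts that can be
// resident at the same time. The real value depends on the hardware.
constexpr std::size_t kMaxResidentWarps = 64;

// Hypothetical pool with one lock slot per potentially resident warp.
struct LockPool {
  std::array<std::atomic<bool>, kMaxResidentWarps> slots{};

  // Try each slot once, starting at `hint`, and return the index of the first
  // slot that was free. This never waits on a slot that someone else holds.
  std::optional<std::size_t> try_acquire(std::size_t hint) {
    for (std::size_t i = 0; i < kMaxResidentWarps; ++i) {
      std::size_t idx = (hint + i) % kMaxResidentWarps;
      bool expected = false;
      if (slots[idx].compare_exchange_strong(expected, true,
                                             std::memory_order_acquire,
                                             std::memory_order_relaxed))
        return idx;
    }
    // Every slot is taken, so the caller has to retry later or give up.
    return std::nullopt;
  }

  // Release a slot previously returned by try_acquire.
  void release(std::size_t idx) {
    slots[idx].store(false, std::memory_order_release);
  }
};
```

If the pool really has at least as many slots as there are active warps, `try_acquire` should always find a free one, which is why this is wasteful in memory but does not depend on the scheduler being fair.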

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D151282