[clang] [llvm] [NVPTX][Draft] Make `__nvvm_nanosleep` a no-op if unsupported (PR #81033)

Joseph Huber via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 7 12:50:35 PST 2024


jhuber6 wrote:

> > This patch, which simply makes it legal on all architectures but does nothing if it's older than sm_70.
> 
> I do not think this is the right thing to do. "do nothing" is not what one would expect from a `nanosleep`.

Thanks, I made this a draft because I figured it wasn't the correct thing to do but wanted to pose the question.

> Let's unpack your problem a bit.
> 
> `__nvvm_reflect()` is probably closest to what you would need. However, IIUC, if you use it to provide a nanosleep-based variant and an alternative for the older GPUs, the `nanosleep` variant code will still hang off the dead branch of `if(__nvvm_reflect())`, and if it's not eliminated by DCE (which it would not be if optimizations are off), the resulting PTX will be invalid for the older GPUs.
> 
> In other words, pushing nanosleep implementation into an intrinsic makes things compile everywhere at the expense of doing a wrong thing on the older GPUs. I do not think it's a good trade-off.
> 
> Perhaps a better approach would be to incorporate dead branch elimination into the NVVMReflect pass itself. We do know that it is the explicit intent of `__nvvm_reflect()`. If NVVMReflect explicitly guarantees that the dead branch will be gone, it should allow you to use approach `#1` w/o concerns for whether optimizations are enabled, and you should be able to provide whatever alternative implementation you need (even if it's a null one), without affecting correctness of LLVM itself.

I think that would be a good solution if possible. Would this simply mean scheduling a global DCE pass right after the NVVM reflect pass? Since the reflect pass already seems to run at `O0`, that looks like the easiest solution, though it somewhat breaks `O0` semantics.

Or, maybe we just have a really shallow implementation in the NVVM reflect pass that collapses the branch?
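
To make the "shallow" idea concrete, here is a toy C++ model of it. This is only a sketch under stated assumptions: `Block` and `foldAndPrune` are hypothetical names for a made-up miniature IR, not LLVM APIs, and the real NVVMReflect pass is not structured this way. It assumes the reflect call has already been replaced by a constant, then rewrites the conditional branch into an unconditional one and drops whatever is no longer reachable from the entry block.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy IR (not LLVM's): a block ends in a branch to up to two successors.
struct Block {
    std::string name;
    bool hasConstCond = false;   // conditional branch whose condition is a known constant
    bool condValue = false;      // that constant, i.e. what the reflect call folded to
    std::string trueTarget;      // successor when condValue is true, or the only successor
    std::string falseTarget;     // successor when condValue is false (empty if none)
    bool usesNanosleep = false;  // stands in for "contains code only valid on sm_70+"
};

// Collapse every constant conditional branch, then keep only blocks reachable from entry.
std::vector<Block> foldAndPrune(std::vector<Block> blocks, const std::string &entry) {
    std::map<std::string, std::size_t> index;
    for (std::size_t i = 0; i < blocks.size(); ++i)
        index[blocks[i].name] = i;

    // Step 1: turn "br <const>, T, F" into an unconditional branch to the taken side.
    for (Block &b : blocks) {
        if (!b.hasConstCond)
            continue;
        if (!b.condValue)
            b.trueTarget = b.falseTarget;
        b.falseTarget.clear();
        b.hasConstCond = false;
    }

    // Step 2: mark everything reachable from the entry block.
    std::vector<bool> reachable(blocks.size(), false);
    std::vector<std::string> worklist{entry};
    while (!worklist.empty()) {
        std::string name = worklist.back();
        worklist.pop_back();
        auto it = index.find(name);
        if (it == index.end() || reachable[it->second])
            continue;
        reachable[it->second] = true;
        const Block &b = blocks[it->second];
        if (!b.trueTarget.empty())
            worklist.push_back(b.trueTarget);
        if (!b.falseTarget.empty())
            worklist.push_back(b.falseTarget);
    }

    // Step 3: drop the unreachable (dead) blocks, preserving the original order.
    std::vector<Block> kept;
    for (std::size_t i = 0; i < blocks.size(); ++i)
        if (reachable[i])
            kept.push_back(blocks[i]);
    return kept;
}
```

In this model, a function whose entry block branches on a reflect result that folded to false loses its nanosleep block entirely, regardless of optimization level. Unlike a global DCE pass, it touches only branches on constant conditions and leaves everything else alone.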

https://github.com/llvm/llvm-project/pull/81033

