[clang] [llvm] [NVPTX][Draft] Make `__nvvm_nanosleep` a no-op if unsupported (PR #81033)
Artem Belevich via cfe-commits
cfe-commits at lists.llvm.org
Wed Feb 7 12:38:45 PST 2024
Artem-B wrote:
> This patch, which simply makes it legal on all architectures but does nothing if it's older than sm_70.
I do not think this is the right thing to do. "do nothing" is not what one would expect from a `nanosleep`.
Let's unpack your problem a bit.
`__nvvm_reflect()` is probably the closest to what you need. However, IIUC, if you use it to provide a nanosleep-based variant plus an alternative for the older GPUs, the `nanosleep` variant will still hang off the dead branch of `if (__nvvm_reflect())`, and if DCE does not eliminate it (which it will not if optimizations are off), the resulting PTX will be invalid for the older GPUs.
In other words, pushing the nanosleep implementation into an intrinsic makes things compile everywhere at the expense of doing the wrong thing on the older GPUs. I do not think it's a good trade-off.
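To make the shape of the problem concrete, here is a minimal host-side model of the reflect-guarded dispatch in plain C++. It is a sketch, not device code: `reflect_cuda_arch()`, `g_arch`, `Backoff`, and `back_off()` are hypothetical names, and the assumption that `__nvvm_reflect("__CUDA_ARCH")` yields a value like 700 for sm_70 should be checked against the NVVMReflect documentation.

```cpp
#include <cassert>

// Hypothetical stand-in for __nvvm_reflect("__CUDA_ARCH") on the device.
// Assumption: the real query yields e.g. 700 when targeting sm_70.
static int g_arch = 700;
int reflect_cuda_arch() { return g_arch; }

// Which wait strategy was taken; exists only to make the model observable.
enum class Backoff { Nanosleep, Spin };

Backoff back_off(unsigned ns) {
  if (reflect_cuda_arch() >= 700) {
    // On the device this arm would call __nvvm_nanosleep(ns).
    // Unless this arm is removed before codegen, the instruction is
    // still emitted when targeting older GPUs, even though the
    // condition is known to be false there.
    (void)ns;
    return Backoff::Nanosleep;
  }
  // Alternative for older targets: a bounded busy-wait.
  volatile unsigned sink = 0;
  for (unsigned i = 0; i < ns; ++i)
    sink = i;
  return Backoff::Spin;
}
```

Both arms are present in the source regardless of the target, so the result is only valid for older GPUs if something removes the first arm once the condition has been folded to a constant.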
Perhaps a better approach would be to incorporate dead branch elimination into the NVVMReflect pass itself. We do know that this is the explicit intent of `__nvvm_reflect()`. If NVVMReflect explicitly guarantees that the dead branch will be gone, you could use approach `#1` without worrying about whether optimizations are enabled, and you could provide whatever alternative implementation you need (even a null one) without affecting the correctness of LLVM itself.
https://github.com/llvm/llvm-project/pull/81033