[libc-commits] [libc] [libc] Remove remaining GPU architecture dependent instructions (PR #81612)

Tue Feb 13 07:48:17 PST 2024

================
@@ -22,14 +22,15 @@ LLVM_LIBC_FUNCTION(int, nanosleep,
   uint64_t nsecs = req->tv_nsec + req->tv_sec * TICKS_PER_NS;
 
   uint64_t start = gpu::fixed_frequency_clock();
-#if defined(LIBC_TARGET_ARCH_IS_NVPTX) && __CUDA_ARCH__ >= 700
+#if defined(LIBC_TARGET_ARCH_IS_NVPTX)
   uint64_t end = start + nsecs / (TICKS_PER_NS / GPU_CLOCKS_PER_SEC);
   uint64_t cur = gpu::fixed_frequency_clock();
   // The NVPTX architecture supports sleeping and guarantees the actual time
   // slept will be somewhere between zero and twice the requested amount. Here
   // we will sleep again if we undershot the time.
   while (cur < end) {
-    __nvvm_nanosleep(static_cast<uint32_t>(nsecs));
+    if (__nvvm_reflect("__CUDA_ARCH") >= 700)
+      LIBC_INLINE_ASM("nanosleep.u32 %0;" ::"r"(nsecs));
----------------
jhuber6 wrote:

No, I made some changes to make this well-formed. The `__nvvm_reflect` pass returns the backend's value of the `sm` it's compiling for. That used to be an issue if the ASM was invalid for that target and the code was built at `O0`, since the guarded branch would not be removed. I made a patch that runs `__nvvm_reflect` in the backend to trim these branches even at `O0` in https://github.com/llvm/llvm-project/pull/81253. So this will *only* make it to PTX if the backend is targeting sm_70 or greater.
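
For readers following along, here is a rough host-side model of the guarded retry loop from the hunk above. It is not the libc code and does not run on a GPU: `FakeDevice`, `reflect_cuda_arch`, and `sleep_until` are hypothetical names made up for this sketch, and the "sleep advances the clock by half the request" behaviour is an invented stand-in for an undershooting sleep. The only detail taken from the real toolchain is that `__nvvm_reflect("__CUDA_ARCH")` yields the `sm` version times ten, which is why the patch compares against 700.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for __nvvm_reflect("__CUDA_ARCH"). In a real NVPTX
// build the backend replaces the call with a constant (sm version * 10).
constexpr unsigned reflect_cuda_arch(unsigned sm) { return sm * 10; }

// Hypothetical simulated device: a tick counter plus a sleep that undershoots.
struct FakeDevice {
  unsigned sm;         // e.g. 70 for sm_70
  uint64_t clock = 0;  // simulated fixed-frequency clock, in ticks
  unsigned sleeps = 0; // how many times the sleep "instruction" ran

  // Reading the clock costs one tick, so busy-waiting still makes progress.
  uint64_t now() { return ++clock; }

  // Assumption for illustration only: each sleep covers half the request.
  void nanosleep(uint32_t ticks) {
    clock += ticks / 2;
    ++sleeps;
  }
};

// Mirrors the shape of the loop in the diff: sleep again if we undershot,
// and only issue the sleep when the reflected architecture is >= 700.
uint64_t sleep_until(FakeDevice &dev, uint64_t ticks) {
  assert(ticks <= UINT32_MAX && "request must fit the 32-bit sleep operand");
  uint64_t start = dev.now();
  uint64_t end = start + ticks;
  uint64_t cur = dev.now();
  while (cur < end) {
    if (reflect_cuda_arch(dev.sm) >= 700)
      dev.nanosleep(static_cast<uint32_t>(ticks));
    cur = dev.now();
  }
  return cur - start; // elapsed ticks, always >= the request
}
```

On the simulated sm_70 device the loop issues sleeps until the deadline passes; on the simulated sm_60 device the guard is false, so the loop degrades to polling the clock. In the real build the difference is that the guarded branch is removed by the backend rather than skipped at run time.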

https://github.com/llvm/llvm-project/pull/81612