[libc-commits] [libc] [libc] Remove remaining GPU architecture dependent instructions (PR #81612)
Joseph Huber via libc-commits
libc-commits at lists.llvm.org
Tue Feb 13 07:48:17 PST 2024
================
@@ -22,14 +22,15 @@ LLVM_LIBC_FUNCTION(int, nanosleep,
uint64_t nsecs = req->tv_nsec + req->tv_sec * TICKS_PER_NS;
uint64_t start = gpu::fixed_frequency_clock();
-#if defined(LIBC_TARGET_ARCH_IS_NVPTX) && __CUDA_ARCH__ >= 700
+#if defined(LIBC_TARGET_ARCH_IS_NVPTX)
uint64_t end = start + nsecs / (TICKS_PER_NS / GPU_CLOCKS_PER_SEC);
uint64_t cur = gpu::fixed_frequency_clock();
// The NVPTX architecture supports sleeping and guarantees the actual time
// slept will be somewhere between zero and twice the requested amount. Here
// we will sleep again if we undershot the time.
while (cur < end) {
- __nvvm_nanosleep(static_cast<uint32_t>(nsecs));
+ if (__nvvm_reflect("__CUDA_ARCH") >= 700)
+ LIBC_INLINE_ASM("nanosleep.u32 %0;" ::"r"(nsecs));
----------------
jhuber6 wrote:
No, I made some changes to make this well-formed. The `__nvvm_reflect` call resolves to the `sm` version the backend is compiling for. That used to be a problem when the inline ASM was invalid for the target and the code was built at `O0`, but I landed a patch that resolves `__nvvm_reflect` in the backend so these branches are trimmed even at `O0`: https://github.com/llvm/llvm-project/pull/81253. So, basically, this will *only* make it to PTX if the backend is targeting sm_70 or greater.
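For readers unfamiliar with the pattern, here is a minimal sketch of the guard described above. It is not the code from the patch: `cuda_arch` and `try_nanosleep` are hypothetical helper names, and the non-NVPTX fallback that returns 0 is an assumption added only so the sketch compiles and runs on a host.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the patch): reports the SM version the
// backend is targeting. On NVPTX, clang's __nvvm_reflect builtin is replaced
// with a constant by the NVVMReflect pass (e.g. 700 for sm_70). On any other
// target we assume 0 so the sketch remains buildable on a host.
static inline unsigned cuda_arch() {
#if defined(__NVPTX__)
  return __nvvm_reflect("__CUDA_ARCH");
#else
  return 0;
#endif
}

// Hypothetical wrapper: emits the PTX `nanosleep` instruction only when the
// target is sm_70 or newer. Returns true if the sleep path was taken. Once
// the reflect call is folded to a constant, the untaken branch is dead code
// and can be removed before instruction selection ever sees the inline ASM.
static inline bool try_nanosleep(uint32_t nsecs) {
  if (cuda_arch() >= 700) {
#if defined(__NVPTX__)
    asm volatile("nanosleep.u32 %0;" ::"r"(nsecs));
#endif
    return true;
  }
  (void)nsecs; // unused when the sleep path is not taken
  return false;
}
```

On a host build `try_nanosleep` always reports `false`; on an NVPTX build the result should depend on the `sm` level the backend targets, which is the point of using a reflect-based check instead of a preprocessor test on `__CUDA_ARCH__`.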
https://github.com/llvm/llvm-project/pull/81612