[clang] [NVPTX][clang] Ensure CLZ(0) is defined on NVPTX (PR #185630)

via cfe-commits cfe-commits at lists.llvm.org
Tue Mar 10 05:10:53 PDT 2026


llvmbot wrote:



@llvm/pr-subscribers-clang

Author: Lewis Crawford (LewisCrawford)

<details>
<summary>Changes</summary>

CUDA semantics specify that clz(0) returns the operand's bit width, so clang should emit the `llvm.ctlz` / `llvm.cttz` intrinsics for NVPTX with zero-is-poison = false.
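
For readers unfamiliar with the zero-input case, the behaviour described above can be modelled on the host. This is only an illustrative sketch, not part of the patch: the helper names `clz32_defined` and `ctz32_defined` are hypothetical, and the functions simply mirror what the intrinsics are documented to return when the zero-is-poison flag is false (the bit width, 32 here, for an input of zero).

```cpp
#include <cstdint>

// Hypothetical reference model (not clang code): count leading zeros with a
// defined result for zero. The loop runs off the end of the word when x == 0,
// so the result is the bit width, 32.
inline int clz32_defined(std::uint32_t x) {
  int n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// Hypothetical reference model: count trailing zeros, also returning the
// bit width (32) when x == 0.
inline int ctz32_defined(std::uint32_t x) {
  int n = 0;
  for (std::uint32_t mask = 1u; mask != 0 && (x & mask) == 0; mask <<= 1)
    ++n;
  return n;
}
```

Under this model, `clz32_defined(0)` and `ctz32_defined(0)` both evaluate to 32, which is the result the patch guarantees for `__builtin_clz(0)` and `__builtin_ctz(0)` on NVPTX instead of leaving it undefined.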

---
Full diff: https://github.com/llvm/llvm-project/pull/185630.diff


2 Files Affected:

- (modified) clang/lib/Basic/Targets/NVPTX.h (+2) 
- (added) clang/test/CodeGenCUDA/builtin-count-zeros-nvptx.cu (+12) 


``````````diff
diff --git a/clang/lib/Basic/Targets/NVPTX.h b/clang/lib/Basic/Targets/NVPTX.h
index 6f8df323f379c..7921a042e9e9b 100644
--- a/clang/lib/Basic/Targets/NVPTX.h
+++ b/clang/lib/Basic/Targets/NVPTX.h
@@ -83,6 +83,8 @@ class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public TargetInfo {
 
   bool useFP16ConversionIntrinsics() const override { return false; }
 
+  bool isCLZForZeroUndef() const override { return false; }
+
   bool
   initFeatureMap(llvm::StringMap<bool> &Features, DiagnosticsEngine &Diags,
                  StringRef CPU,
diff --git a/clang/test/CodeGenCUDA/builtin-count-zeros-nvptx.cu b/clang/test/CodeGenCUDA/builtin-count-zeros-nvptx.cu
new file mode 100644
index 0000000000000..f003b32ca73b2
--- /dev/null
+++ b/clang/test/CodeGenCUDA/builtin-count-zeros-nvptx.cu
@@ -0,0 +1,12 @@
+// REQUIRES: nvptx-registered-target
+// RUN: %clang_cc1 -x cuda -triple nvptx64-unknown-unknown -fcuda-is-device -emit-llvm %s -o - | FileCheck %s
+//
+// Ensure NVPTX uses isCLZForZeroUndef() = false (CUDA semantics: CLZ(i32 0) == 32).
+
+#include "Inputs/cuda.h"
+
+__device__ int f(int x) {
+  return __builtin_ctz(x) + __builtin_clz(x);
+}
+// CHECK: call i32 @llvm.cttz.i32({{.*}}, i1 false)
+// CHECK: call i32 @llvm.ctlz.i32({{.*}}, i1 false)

``````````
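
As a rough illustration of how the hook in the diff relates to the `i1 false` operand that the CHECK lines look for, the sketch below uses mock types. Everything except the hook name `isCLZForZeroUndef()` is hypothetical: `MockTargetInfo`, `MockNVPTXTargetInfo` and `emitCtlzCall` are stand-ins invented for this example, and the base-class default of `true` is an assumption about the generic target behaviour rather than something shown in this patch.

```cpp
#include <string>

// Hypothetical stand-in for a generic target description. Only the hook name
// isCLZForZeroUndef() is taken from the patch.
struct MockTargetInfo {
  virtual ~MockTargetInfo() = default;
  // Assumed default: a zero input is undefined unless the target opts out.
  virtual bool isCLZForZeroUndef() const { return true; }
};

// Hypothetical stand-in for the NVPTX target after the patch: zero is defined.
struct MockNVPTXTargetInfo : MockTargetInfo {
  bool isCLZForZeroUndef() const override { return false; }
};

// Renders an i32 ctlz call as text, with the hook's value as the second
// (zero-is-poison) operand.
std::string emitCtlzCall(const MockTargetInfo &target, const std::string &arg) {
  return "call i32 @llvm.ctlz.i32(i32 " + arg + ", i1 " +
         (target.isCLZForZeroUndef() ? "true" : "false") + ")";
}
```

With the mock NVPTX target, `emitCtlzCall(MockNVPTXTargetInfo{}, "%0")` produces a string ending in `i1 false)`, the shape the new test matches. With the mock base target the flag would read `i1 true`.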

</details>


https://github.com/llvm/llvm-project/pull/185630

