[llvm] Add constant-folding for unary NVVM intrinsics (PR #141233)

Lewis Crawford via llvm-commits llvm-commits at lists.llvm.org
Tue Jul 15 02:20:45 PDT 2025


================
@@ -2548,6 +2653,170 @@ static Constant *ConstantFoldScalarCall1(StringRef Name,
         return ConstantFoldFP(atan, APF, Ty);
       case Intrinsic::sqrt:
         return ConstantFoldFP(sqrt, APF, Ty);
+
+      // NVVM Intrinsics:
+      case Intrinsic::nvvm_ceil_ftz_f:
+      case Intrinsic::nvvm_ceil_f:
+      case Intrinsic::nvvm_ceil_d:
+        return ConstantFoldFP(
+            ceil, APF, Ty,
+            nvvm::GetNVVMDenromMode(
+                nvvm::UnaryMathIntrinsicShouldFTZ(IntrinsicID)));
+
+      case Intrinsic::nvvm_cos_approx_ftz_f:
+      case Intrinsic::nvvm_cos_approx_f:
+        return ConstantFoldFP(
+            cos, APF, Ty,
+            nvvm::GetNVVMDenromMode(
+                nvvm::UnaryMathIntrinsicShouldFTZ(IntrinsicID)));
+
+      case Intrinsic::nvvm_ex2_approx_ftz_f:
+      case Intrinsic::nvvm_ex2_approx_d:
+      case Intrinsic::nvvm_ex2_approx_f:
+        return ConstantFoldFP(
+            exp2, APF, Ty,
+            nvvm::GetNVVMDenromMode(
+                (nvvm::UnaryMathIntrinsicShouldFTZ(IntrinsicID))));
----------------
LewisCrawford wrote:

It looks like the code for folding `amdgcn_sin` and `amdgcn_cos` ends up calling the host sin/cos implementation. 
If they're based on the OpenCL spec, their error should be <= 4 ulp and:
>For x in the domain [-π, π], the maximum absolute error is <= 2^-11 and larger otherwise.

However, the actual AMDGCN implementation might not make use of this leeway, and might be just as precise as the host anyway, so I'm not sure whether it is an example of an approximate intrinsic or not.

The PTX `sin.approx` max error of 2^-14.7 in the range [ -100pi .. +100pi ] is smaller than the max OpenCL limits for `sin` here, and the error of 2^-20.5 in range [ -2pi .. 2pi ] is significantly smaller.
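
As a rough illustration of how far apart these limits are, the sketch below evaluates the three absolute-error bounds quoted above as plain numbers. The function names are hypothetical, and the values are taken only from the figures in this comment, not from any header.

```cpp
#include <cmath>

// Absolute-error bounds quoted in this thread (hypothetical helper names).
inline double openclSinBound() { return std::exp2(-11.0); }          // x in [-pi, pi]
inline double ptxSinApproxWideBound() { return std::exp2(-14.7); }   // x in [-100pi, +100pi]
inline double ptxSinApproxNarrowBound() { return std::exp2(-20.5); } // x in [-2pi, 2pi]

// How many times tighter the PTX bound is than the OpenCL bound.
inline double wideTightening() {
  return openclSinBound() / ptxSinApproxWideBound();
}
inline double narrowTightening() {
  return openclSinBound() / ptxSinApproxNarrowBound();
}
```

With these figures, the wide-range PTX bound is about 13 times tighter than the OpenCL bound (2^3.7), and the narrow-range bound is about 724 times tighter (2^9.5).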

https://github.com/llvm/llvm-project/pull/141233

