[llvm] [NVPTX] Improve support for {ex2,lg2}.approx (PR #120519)
Alex MacLean via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 2 20:18:30 PST 2025
================
@@ -1255,18 +1255,33 @@ def INT_NVVM_EX2_APPROX_F : F_MATH_1<"ex2.approx.f32 \t$dst, $src0;",
Float32Regs, Float32Regs, int_nvvm_ex2_approx_f>;
def INT_NVVM_EX2_APPROX_D : F_MATH_1<"ex2.approx.f64 \t$dst, $src0;",
Float64Regs, Float64Regs, int_nvvm_ex2_approx_d>;
+
def INT_NVVM_EX2_APPROX_F16 : F_MATH_1<"ex2.approx.f16 \t$dst, $src0;",
Int16Regs, Int16Regs, int_nvvm_ex2_approx_f16, [hasPTX<70>, hasSM<75>]>;
def INT_NVVM_EX2_APPROX_F16X2 : F_MATH_1<"ex2.approx.f16x2 \t$dst, $src0;",
Int32Regs, Int32Regs, int_nvvm_ex2_approx_f16x2, [hasPTX<70>, hasSM<75>]>;
+def : Pat<(fexp2 f32:$a),
+ (INT_NVVM_EX2_APPROX_FTZ_F Float32Regs:$a)>, Requires<[doF32FTZ]>;
+def : Pat<(fexp2 f32:$a),
+ (INT_NVVM_EX2_APPROX_F Float32Regs:$a)>, Requires<[doNoF32FTZ]>;
----------------
AlexMaclean wrote:
Please remove the redundant register class from the output of these patterns (i.e., `(INT_NVVM_EX2_APPROX_F $a)`).
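
For illustration, the two patterns from the hunk above might read as follows with this suggestion applied. This is only a sketch: it assumes nothing else in the definitions changes, and it relies on the operand type already being given by `f32:$a` on the input side.

```
// Sketch only: the output operand is referenced by name, without repeating
// the Float32Regs register class.
def : Pat<(fexp2 f32:$a),
          (INT_NVVM_EX2_APPROX_FTZ_F $a)>, Requires<[doF32FTZ]>;
def : Pat<(fexp2 f32:$a),
          (INT_NVVM_EX2_APPROX_F $a)>, Requires<[doNoF32FTZ]>;
```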
https://github.com/llvm/llvm-project/pull/120519