[llvm] AMDGPU: Fix broken exp10 lowering for f16 (PR #170582)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 4 13:07:33 PST 2025
================
@@ -5877,22 +5877,37 @@ define float @v_exp10_f32_from_fpext_math_f16_daz(i16 %src0.i, i16 %src1.i) #0 {
; FIXME: Fold out fp16_to_fp (FP_TO_FP16) on no-f16 targets
define half @v_exp10_f16(half %in) {
-; GCN-LABEL: v_exp10_f16:
-; GCN: ; %bb.0:
-; GCN-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GCN-NEXT: v_cvt_f32_f16_e32 v0, v0
-; GCN-NEXT: v_mul_f32_e32 v0, 0x3fb8aa3b, v0
-; GCN-NEXT: v_exp_f32_e32 v0, v0
-; GCN-NEXT: v_cvt_f16_f32_e32 v0, v0
-; GCN-NEXT: s_setpc_b64 s[30:31]
+; GCN-SDAG-LABEL: v_exp10_f16:
+; GCN-SDAG: ; %bb.0:
+; GCN-SDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GCN-SDAG-NEXT: v_cvt_f32_f16_e32 v0, v0
+; GCN-SDAG-NEXT: v_mul_f32_e32 v1, 0x3a2784bc, v0
+; GCN-SDAG-NEXT: v_mul_f32_e32 v0, 0x40549000, v0
+; GCN-SDAG-NEXT: v_exp_f32_e32 v1, v1
+; GCN-SDAG-NEXT: v_exp_f32_e32 v0, v0
+; GCN-SDAG-NEXT: v_mul_f32_e32 v0, v0, v1
----------------
b-sumner wrote:
> This was ported from the library code:
>
> https://github.com/ROCm/llvm-project/blob/aa47a98736a519de6fc08485a30b1b362097a6a6/amd/device-libs/ocml/src/expF_base.h#L67
Thanks @arsenm. I'm not clear on the accuracy requirement here. If it is lower than what we were aiming at in the library code, then we can drop more instructions.
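
For readers following the constants in the hunk above, here is a minimal sketch (not the library code or the backend implementation) that decodes the three hex immediates and emulates the visible instruction sequence in Python. It assumes that v_exp_f32 computes 2^x and that rounding each intermediate to single precision is a fair model of the hardware. The helper names (f32_from_bits, round_f32, exp10_split, exp_old) are made up for illustration. Under those assumptions, the removed scale 0x3fb8aa3b decodes to roughly log2(e), so the old sequence evaluates e^x, while the two new scales sum to roughly log2(10) and give exp10(x) = exp2(x * hi) * exp2(x * lo).

```python
import math
import struct


def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def round_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Immediates copied from the check lines in the diff.
OLD_SCALE = f32_from_bits(0x3FB8AA3B)  # removed line: approximately log2(e)
C_HI = f32_from_bits(0x40549000)       # added line: high part of log2(10)
C_LO = f32_from_bits(0x3A2784BC)       # added line: low part of log2(10)


def exp_old(x: float) -> float:
    """Model of the removed sequence: one multiply, then exp2."""
    return round_f32(2.0 ** round_f32(x * OLD_SCALE))


def exp10_split(x: float) -> float:
    """Model of the added sequence: two multiplies, two exp2, one multiply.

    Only the instructions visible in the hunk are modelled; the conversion
    back to f16 and any later instructions are left out. Intended for
    moderate inputs only, since 2.0 ** y overflows in Python for large y.
    """
    lo = round_f32(2.0 ** round_f32(x * C_LO))
    hi = round_f32(2.0 ** round_f32(x * C_HI))
    return round_f32(hi * lo)


if __name__ == "__main__":
    print("old scale          :", OLD_SCALE, "vs log2(e)  =", math.log2(math.e))
    print("new scale (hi + lo):", C_HI + C_LO, "vs log2(10) =", math.log2(10))
    for x in (0.5, 1.0, 2.0, 4.0):
        print(x, "old:", exp_old(x), "new:", exp10_split(x), "ref:", 10.0 ** x)
```

Splitting the scale keeps the product x * hi exact or nearly exact for f16-range inputs, because the high part has only a few significant bits; the low part then restores the precision lost by truncating log2(10). Whether that extra accuracy is needed is the open question in the comment above.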
https://github.com/llvm/llvm-project/pull/170582