[llvm] AMDGPU: Fix broken exp10 lowering for f16 (PR #170582)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 4 13:01:56 PST 2025
================
@@ -5877,22 +5877,37 @@ define float @v_exp10_f32_from_fpext_math_f16_daz(i16 %src0.i, i16 %src1.i) #0 {
; FIXME: Fold out fp16_to_fp (FP_TO_FP16) on no-f16 targets
define half @v_exp10_f16(half %in) {
-; GCN-LABEL: v_exp10_f16:
-; GCN: ; %bb.0:
-; GCN-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GCN-NEXT: v_cvt_f32_f16_e32 v0, v0
-; GCN-NEXT: v_mul_f32_e32 v0, 0x3fb8aa3b, v0
-; GCN-NEXT: v_exp_f32_e32 v0, v0
-; GCN-NEXT: v_cvt_f16_f32_e32 v0, v0
-; GCN-NEXT: s_setpc_b64 s[30:31]
+; GCN-SDAG-LABEL: v_exp10_f16:
+; GCN-SDAG: ; %bb.0:
+; GCN-SDAG-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GCN-SDAG-NEXT: v_cvt_f32_f16_e32 v0, v0
+; GCN-SDAG-NEXT: v_mul_f32_e32 v1, 0x3a2784bc, v0
+; GCN-SDAG-NEXT: v_mul_f32_e32 v0, 0x40549000, v0
+; GCN-SDAG-NEXT: v_exp_f32_e32 v1, v1
+; GCN-SDAG-NEXT: v_exp_f32_e32 v0, v0
+; GCN-SDAG-NEXT: v_mul_f32_e32 v0, v0, v1
----------------
arsenm wrote:
The f16 case should use the implementation at https://github.com/ROCm/llvm-project/blob/aa47a98736a519de6fc08485a30b1b362097a6a6/amd/device-libs/ocml/src/expH.cl#L15
https://github.com/llvm/llvm-project/pull/170582
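
For readers decoding the constants in the hunk above, here is a rough sketch in Python (not part of the patch) that reinterprets the three f32 immediates and models the new SDAG sequence. It assumes v_exp_f32 is a base-2 exponential and uses double-precision math.pow in place of the hardware instruction, so it illustrates the arithmetic only, not the rounding behavior. The names f32_from_bits, OLD_SCALE, HI, LO and exp10_split are made up for this illustration.

```python
import math
import struct


def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Immediate from the removed check lines: this is log2(e), so the old
# sequence computed 2^(x * log2(e)) = e^x rather than 10^x.
OLD_SCALE = f32_from_bits(0x3FB8AA3B)

# Immediates from the added check lines: log2(10) split into a short
# high part and a small low-order correction.
HI = f32_from_bits(0x40549000)
LO = f32_from_bits(0x3A2784BC)


def exp10_split(x: float) -> float:
    """Model of the new sequence: 2^(x*HI) * 2^(x*LO).

    math.pow(2.0, ...) stands in for v_exp_f32 (assumption: base-2 exp).
    """
    return math.pow(2.0, x * HI) * math.pow(2.0, x * LO)


if __name__ == "__main__":
    print("old scale     :", OLD_SCALE, "vs log2(e) :", math.log2(math.e))
    print("hi + lo       :", HI + LO, "vs log2(10):", math.log2(10.0))
    print("exp10_split(2):", exp10_split(2.0))
```

Under those assumptions, the old scale matches log2(e) and the two new scales sum to log2(10) to within f32 precision, which is consistent with the "broken exp10 lowering" in the subject line.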