[llvm] [AMDGPU] Enable CodeGen for v_pk_fma_bf16 (PR #152578)
Shilei Tian via llvm-commits
llvm-commits at lists.llvm.org
Thu Aug 7 16:16:58 PDT 2025
================
@@ -410,11 +412,12 @@ define amdgpu_ps void @v_test_mul_add_v2bf16_vvv(ptr addrspace(1) %out, <2 x bfl
define amdgpu_ps void @v_test_mul_add_v2bf16_vss(ptr addrspace(1) %out, <2 x bfloat> %a, <2 x bfloat> inreg %b, <2 x bfloat> inreg %c) {
; GCN-LABEL: v_test_mul_add_v2bf16_vss:
; GCN: ; %bb.0:
-; GCN-NEXT: v_pk_mul_bf16 v2, v2, s0
-; GCN-NEXT: s_delay_alu instid0(VALU_DEP_1)
-; GCN-NEXT: v_pk_add_bf16 v2, v2, s1
+; GCN-NEXT: v_pk_fma_bf16 v2, v2, s0, s1
; GCN-NEXT: global_store_b32 v[0:1], v2, off
; GCN-NEXT: s_endpgm
+
----------------
shiltian wrote:
What happens to these gaps?
https://github.com/llvm/llvm-project/pull/152578