[llvm] [AMDGPU] Allow dpp in v_pk_fmac_f16 for GFX9 and GFX10 (PR #144782)
Jun Wang via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 18 17:33:03 PDT 2025
================
@@ -2172,6 +2172,7 @@ defm V_LDEXP_F16 : VOP2_Real_gfx10<0x03b>;
let IsSingle = 1 in {
defm V_PK_FMAC_F16 : VOP2_Real_e32_gfx10<0x03c>;
}
+defm V_PK_FMAC_F16 : VOP2_Real_dpp_gfx10<0x03c>, VOP2_Real_dpp8_gfx10<0x03c>;
----------------
jwanggit86 wrote:
> No need to define separate dpp real, VOP2_Real_gfx10 should do it for you.
This would create some additional instructions, though: for GFX9, `V_PK_FMAC_F16_e32_gfx9`, `V_PK_FMAC_F16_e64_gfx9`, and `V_PK_FMAC_F16_sdwa_gfx9`; and for GFX10, `V_PK_FMAC_F16_e64_gfx10` and `V_PK_FMAC_F16_sdwa_gfx10`. Would this be a problem?
Also, `VOP2_Real_e32_gfx10` here (Line 2173) is instantiated under the predicate on Line 2172 (`let IsSingle = 1`), but `VOP2_Real_gfx10` does not have this predicate.
https://github.com/llvm/llvm-project/pull/144782