[llvm] [AMDGPU] Allow folding FMA with SGPR and immediate operand on GFX10+ (PR #72258)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 14 05:57:02 PST 2023
================
@@ -715,12 +715,33 @@ define amdgpu_ps i32 @s_mul_fma_32_f32(float inreg %x, float inreg %y) {
; GFX9-NEXT: v_readfirstlane_b32 s0, v0
; GFX9-NEXT: ; return to shader part epilog
;
-; GFX1011-LABEL: s_mul_fma_32_f32:
-; GFX1011: ; %bb.0:
-; GFX1011-NEXT: v_mov_b32_e32 v0, s1
-; GFX1011-NEXT: v_fmac_f32_e64 v0, 0x42000000, s0
-; GFX1011-NEXT: v_readfirstlane_b32 s0, v0
-; GFX1011-NEXT: ; return to shader part epilog
+; GFX10-SDAG-LABEL: s_mul_fma_32_f32:
+; GFX10-SDAG: ; %bb.0:
+; GFX10-SDAG-NEXT: v_mov_b32_e32 v0, s1
+; GFX10-SDAG-NEXT: v_fmamk_f32 v0, s0, 0x42000000, v0
+; GFX10-SDAG-NEXT: v_readfirstlane_b32 s0, v0
+; GFX10-SDAG-NEXT: ; return to shader part epilog
+;
+; GFX10-GISEL-LABEL: s_mul_fma_32_f32:
+; GFX10-GISEL: ; %bb.0:
+; GFX10-GISEL-NEXT: v_mov_b32_e32 v0, s1
+; GFX10-GISEL-NEXT: v_fmac_f32_e64 v0, 0x42000000, s0
----------------
jayfoad wrote:
The fold to v_fmamk_f32 currently succeeds only with SelectionDAG (the GFX10-SDAG checks). With GlobalISel we still get v_fmac_f32_e64; to make it work there we need something like #72128.
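
For readers comparing the two check lines, here is a rough Python sketch of why both sequences compute the same value. It is only an illustration, not compiler code: the helper names are hypothetical, the operand semantics (v_fmac_f32: dst = src0 * src1 + dst; v_fmamk_f32: dst = src0 * K + src1) are assumed from the usual ISA descriptions, and Python doubles stand in for f32 without modelling fused rounding.

```python
import struct

def decode_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def v_fmac_f32(dst: float, src0: float, src1: float) -> float:
    # Assumed model: dst = src0 * src1 + dst (the accumulator is tied to dst).
    return src0 * src1 + dst

def v_fmamk_f32(src0: float, k: float, src1: float) -> float:
    # Assumed model: dst = src0 * K + src1, where K is a 32-bit literal.
    return src0 * k + src1

K = decode_f32(0x42000000)  # the literal in both check lines

# Hypothetical inputs, chosen to be exactly representable.
x, y = 1.5, 0.25            # x is in s0, y is in s1 (copied to v0)

gisel = v_fmac_f32(y, K, x)   # v_fmac_f32_e64 v0, 0x42000000, s0
sdag = v_fmamk_f32(x, K, y)   # v_fmamk_f32 v0, s0, 0x42000000, v0
```

Both forms evaluate x * 32.0 + y; the difference in the test output is only which encoding carries the literal and the SGPR operand.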
https://github.com/llvm/llvm-project/pull/72258