[llvm] [AMDGPU][True16][CodeGen] true16 codegen pattern for fma (PR #122950)
Brox Chen via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 24 08:17:38 PST 2025
================
@@ -107,11 +108,18 @@ define half @v_fma_f16(half %x, half %y, half %z) {
; GFX10-NEXT: v_fma_f16 v0, v0, v1, v2
; GFX10-NEXT: s_setpc_b64 s[30:31]
;
-; GFX11-LABEL: v_fma_f16:
-; GFX11: ; %bb.0:
-; GFX11-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11-NEXT: v_fma_f16 v0, v0, v1, v2
-; GFX11-NEXT: s_setpc_b64 s[30:31]
+; GFX11-TRUE16-LABEL: v_fma_f16:
+; GFX11-TRUE16: ; %bb.0:
+; GFX11-TRUE16-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-TRUE16-NEXT: v_fmac_f16_e32 v2.l, v0.l, v1.l
----------------
broxigarchen wrote:
Hi Joe. I ran a quick check on this, and it seems there is a problem in the two-address conversion pass: it fails to map the dst register and therefore fails to convert the two-address instruction to its three-address form.
It seems to be related to the register class setting for vgpr_16. Since the gisel change is not upstreamed, it's better to fix this in the downstream branch. I'll file a case to track this, and we can just merge it as it is now upstream.
https://github.com/llvm/llvm-project/pull/122950