[llvm] [AMDGPU][True16][CodeGen] true16 codegen pattern for fma (PR #122950)

Brox Chen via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 24 08:17:38 PST 2025


================
@@ -107,11 +108,18 @@ define half @v_fma_f16(half %x, half %y, half %z) {
 ; GFX10-NEXT:    v_fma_f16 v0, v0, v1, v2
 ; GFX10-NEXT:    s_setpc_b64 s[30:31]
 ;
-; GFX11-LABEL: v_fma_f16:
-; GFX11:       ; %bb.0:
-; GFX11-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; GFX11-NEXT:    v_fma_f16 v0, v0, v1, v2
-; GFX11-NEXT:    s_setpc_b64 s[30:31]
+; GFX11-TRUE16-LABEL: v_fma_f16:
+; GFX11-TRUE16:       ; %bb.0:
+; GFX11-TRUE16-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
+; GFX11-TRUE16-NEXT:    v_fmac_f16_e32 v2.l, v0.l, v1.l
----------------
broxigarchen wrote:

Hi Joe. I ran a quick check on this, and it seems there is a problem in the two-address conversion pass: it fails to map the dst register and therefore fails to convert the two-address instruction to its three-address form.
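
For illustration, here is a sketch of the difference between the two forms. Only the `v_fmac_f16_e32` line comes from the test output in the hunk above. The `v_mov_b16_e32` copy and the `v_fma_f16` line are assumptions about what the surrounding code and a successful conversion would look like, not output taken from this PR.

```
; Two-address form (as emitted in the GFX11-TRUE16 check line):
; the destination v2.l is tied to the addend, so the result lands in v2.l.
v_fmac_f16_e32 v2.l, v0.l, v1.l
; Assumed follow-up copy into the return register v0.l (not shown in the hunk).
v_mov_b16_e32 v0.l, v2.l

; Three-address form (assumed result of a successful conversion):
; the destination is independent of the sources, so no copy is needed.
v_fma_f16 v0.l, v0.l, v1.l, v2.l
```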

It seems to be related to the register class setting for vgpr_16. Since the GlobalISel change is not upstreamed, it's better to fix this in the downstream branch. I'll file a case to track this, and we can merge it as it is now upstream.

https://github.com/llvm/llvm-project/pull/122950

