[llvm] [DAG] Reassociate (add (add X, Y), X) --> add(add(X, X), Y) (PR #162242)
Simon Pilgrim via llvm-commits
llvm-commits at lists.llvm.org
Tue Oct 7 01:39:29 PDT 2025
================
@@ -1811,12 +1817,14 @@ define amdgpu_kernel void @udot2_MultipleUses_add1(ptr addrspace(1) %src1,
; GFX10-DL-NEXT: v_lshrrev_b32_e32 v0, 16, v1
; GFX10-DL-NEXT: s_waitcnt vmcnt(0)
; GFX10-DL-NEXT: v_lshrrev_b32_e32 v3, 16, v2
-; GFX10-DL-NEXT: v_mul_u32_u24_sdwa v1, v2, v1 dst_sel:DWORD dst_unused:UNUSED_PAD src0_sel:WORD_0 src1_sel:WORD_0
-; GFX10-DL-NEXT: v_mov_b32_e32 v2, 0
+; GFX10-DL-NEXT: v_and_b32_e32 v1, 0xffff, v1
+; GFX10-DL-NEXT: v_and_b32_e32 v2, 0xffff, v2
; GFX10-DL-NEXT: s_waitcnt lgkmcnt(0)
; GFX10-DL-NEXT: v_mad_u32_u24 v0, v3, v0, s0
-; GFX10-DL-NEXT: v_add3_u32 v0, v0, v1, v0
-; GFX10-DL-NEXT: global_store_dword v2, v0, s[6:7]
+; GFX10-DL-NEXT: v_mov_b32_e32 v3, 0
+; GFX10-DL-NEXT: v_add_nc_u32_e32 v0, v0, v0
+; GFX10-DL-NEXT: v_mad_u32_u24 v0, v2, v1, v0
----------------
RKSimon wrote:
@jayfoad @arsenm It looks like we're missing demanded-bits handling for the MAD24 instructions, but I haven't found much in the DAG that handles the MAD24 opcodes at all. Is this all currently done with isel patterns? Would it cause problems if I tried to add MAD24 DAG lowering?
https://github.com/llvm/llvm-project/pull/162242