[llvm] [AMDGPU] Legalize 64bit elements for BUILD_VECTOR on gfx942 (PR #145052)

Janek van Oirschot via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 20 08:36:11 PDT 2025


================
@@ -7,11 +7,11 @@ define <2 x i32> @uniform_masked_load_ptr1_mask_v2i32(ptr addrspace(1) inreg noc
 ; GFX942-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; GFX942-NEXT:    v_and_b32_e32 v0, 1, v0
 ; GFX942-NEXT:    v_cmp_eq_u32_e32 vcc, 1, v0
-; GFX942-NEXT:    v_mov_b32_e32 v0, 0
-; GFX942-NEXT:    v_mov_b32_e32 v1, v0
+; GFX942-NEXT:    v_mov_b64_e32 v[0:1], 0
 ; GFX942-NEXT:    s_and_saveexec_b64 s[2:3], vcc
 ; GFX942-NEXT:    s_cbranch_execz .LBB0_2
 ; GFX942-NEXT:  ; %bb.1: ; %cond.load
+; GFX942-NEXT:    v_mov_b32_e32 v0, 0
----------------
JanekvO wrote:

I'm not convinced we can easily elide this superfluous v_mov with current LLVM. Previously it would have been CSE'd with the other v_mov_b32 that materializes 0, but now that requires MachineCSE to become aware of subregisters, which I don't think is trivial (unless there's some low-hanging fruit I'm unaware of).
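
To illustrate the point, here is a toy model in Python. It is not LLVM code. It assumes a CSE that treats two instructions as equal only when opcode and operand both match, which is roughly how an expression-keyed CSE behaves. The `Inst` type, the register names and both functions are hypothetical, and the model ignores control flow, dominance and the rewriting of uses.

```python
from typing import NamedTuple

MASK32 = 0xFFFFFFFF

class Inst(NamedTuple):
    opcode: str  # "v_mov_b32" or "v_mov_b64"
    dest: str    # hypothetical register name, for readability only
    imm: int     # immediate being materialized

def naive_cse(insts):
    """Keep an instruction unless an earlier one has the same opcode and immediate."""
    seen = set()
    kept = []
    for inst in insts:
        key = (inst.opcode, inst.imm)
        if key in seen:
            continue  # exact duplicate, treated as redundant
        seen.add(key)
        kept.append(inst)
    return kept

def subreg_aware_cse(insts):
    """As above, but also track the 32-bit halves produced by 64-bit moves."""
    seen = set()
    halves = set()  # 32-bit values already available in a register or sub-register
    kept = []
    for inst in insts:
        key = (inst.opcode, inst.imm)
        if key in seen:
            continue
        if inst.opcode == "v_mov_b32" and (inst.imm & MASK32) in halves:
            continue  # value already lives in half of an earlier 64-bit def
        seen.add(key)
        kept.append(inst)
        if inst.opcode == "v_mov_b64":
            halves.add(inst.imm & MASK32)
            halves.add((inst.imm >> 32) & MASK32)
        else:
            halves.add(inst.imm & MASK32)
    return kept

# Before the patch: both zeros are materialized by the same 32-bit opcode.
before = [Inst("v_mov_b32", "a", 0), Inst("v_mov_b32", "b", 0)]
# After the patch: the first zero is a 64-bit move, the second is still 32-bit.
after = [Inst("v_mov_b64", "ab", 0), Inst("v_mov_b32", "c", 0)]
```

In this model `naive_cse(before)` keeps one instruction, `naive_cse(after)` keeps both, and only `subreg_aware_cse(after)` drops the second move. That is the kind of subregister reasoning the pass would need.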

https://github.com/llvm/llvm-project/pull/145052

