[llvm] [AMDGPU] Add IR LiveReg type-based optimization (PR #66838)

Jeffrey Byrnes via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 12 14:43:08 PDT 2024


================
@@ -63,29 +61,29 @@ define <4 x i16> @vec_8xi16_extract_4xi16(ptr addrspace(1) %p0, ptr addrspace(1)
 ; SI-NEXT:    buffer_load_ushort v0, v[0:1], s[4:7], 0 addr64 offset:14 glc
 ; SI-NEXT:    s_waitcnt vmcnt(0)
 ; SI-NEXT:    v_lshlrev_b32_e32 v0, 16, v5
-; SI-NEXT:    v_lshlrev_b32_e32 v1, 16, v4
-; SI-NEXT:    v_or_b32_e32 v2, v2, v0
-; SI-NEXT:    v_or_b32_e32 v3, v3, v1
+; SI-NEXT:    v_lshlrev_b32_e32 v1, 16, v3
----------------
jrbyrnes wrote:

i16 is not legal for the generic subtarget, so the LiveReg optimization is triggered. During legalization, however, the v8i16 loads are scalarized into 8 i32 loads (each extended from i16), which negates the benefit.
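
As a rough illustration of the scalarization described above (a toy model, not the actual legalizer or the code in this PR), the Python sketch below mimics eight separate zero-extended 16-bit loads that are then recombined with shift and or, and compares the result with a single wide load. The function names and the little-endian memory layout are assumptions made only for this example.

```python
import struct

def load_v8i16_scalarized(mem: bytes, base: int = 0):
    """Hypothetical model: eight separate 16-bit loads, each zero-extended to 32 bits."""
    return [struct.unpack_from("<H", mem, base + 2 * i)[0] for i in range(8)]

def pack_pairs(elems):
    """Recombine adjacent 16-bit elements into 32-bit values via shift-left-16 and or."""
    return [((elems[i + 1] << 16) | elems[i]) & 0xFFFFFFFF
            for i in range(0, len(elems), 2)]

def load_v4i32_wide(mem: bytes, base: int = 0):
    """Hypothetical model: the same 16 bytes read as four 32-bit values at once."""
    return list(struct.unpack_from("<4I", mem, base))

# Sixteen bytes holding the i16 values 1..8 (little-endian layout is an assumption).
mem = struct.pack("<8H", *range(1, 9))
scalar = load_v8i16_scalarized(mem)   # eight narrow loads
packed = pack_pairs(scalar)           # four shift+or recombinations
wide = load_v4i32_wide(mem)           # one wide load, no recombination needed
```

Both paths produce the same four 32-bit values; the scalarized path just needs eight loads plus the extra shift and or work to get there, which is the cost the comment refers to.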

https://github.com/llvm/llvm-project/pull/66838