[llvm] [AMDGPU] Add IR LiveReg type-based optimization (PR #66838)
Jeffrey Byrnes via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 12 14:43:08 PDT 2024
================
@@ -63,29 +61,29 @@ define <4 x i16> @vec_8xi16_extract_4xi16(ptr addrspace(1) %p0, ptr addrspace(1)
; SI-NEXT: buffer_load_ushort v0, v[0:1], s[4:7], 0 addr64 offset:14 glc
; SI-NEXT: s_waitcnt vmcnt(0)
; SI-NEXT: v_lshlrev_b32_e32 v0, 16, v5
-; SI-NEXT: v_lshlrev_b32_e32 v1, 16, v4
-; SI-NEXT: v_or_b32_e32 v2, v2, v0
-; SI-NEXT: v_or_b32_e32 v3, v3, v1
+; SI-NEXT: v_lshlrev_b32_e32 v1, 16, v3
----------------
jrbyrnes wrote:
i16 is not legal for the generic subtarget, so the LiveReg optimization is triggered. During legalization, however, the v8i16 loads are scalarized into eight i32 loads (each extended from i16), which negates the benefit.
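
As a rough illustration of why the scalarized form is costly (a sketch, not code from the patch): once each i16 lane arrives in its own 32-bit register, the lanes have to be repacked in pairs with a shift and an OR, which is the v_lshlrev_b32 / v_or_b32 pattern visible in the SI check lines. The helper name `pack_i16_pairs`, the sample values, and the low-lane-in-low-half ordering are assumptions made for the example.

```python
def pack_i16_pairs(lanes):
    """Pack 16-bit lanes into 32-bit words, two lanes per word.

    Assumption for illustration: the even-indexed lane goes in the low
    half of the word and the odd-indexed lane in the high half.
    """
    assert len(lanes) % 2 == 0, "need an even number of lanes"
    words = []
    for lo, hi in zip(lanes[0::2], lanes[1::2]):
        # Shift the high lane left by 16 (like v_lshlrev_b32 ..., 16, ...)
        # and OR it with the low lane (like v_or_b32).
        words.append(((hi & 0xFFFF) << 16) | (lo & 0xFFFF))
    return words


# Eight scalarized loads, each producing one i16 value extended to 32 bits.
loaded = [0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008]
packed = pack_i16_pairs(loaded)
```

Each packed word costs one shift and one OR on top of the eight separate loads, so any instructions saved earlier by the IR-level coercion are spent again here.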
https://github.com/llvm/llvm-project/pull/66838