[llvm] [AMDGPU] Enable more consecutive load folding during aggressive-instcombine (PR #158036)
Juan Manuel Martinez Caamaño via llvm-commits
llvm-commits at lists.llvm.org
Thu Sep 11 07:32:48 PDT 2025
================
@@ -7,23 +7,25 @@ define void @memcpy_fixed_align(ptr addrspace(5) %dst, ptr addrspace(1) %src) {
; MUBUF-LABEL: memcpy_fixed_align:
; MUBUF: ; %bb.0:
; MUBUF-NEXT: s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
-; MUBUF-NEXT: global_load_dwordx2 v[11:12], v[1:2], off offset:32
; MUBUF-NEXT: global_load_dwordx4 v[3:6], v[1:2], off
; MUBUF-NEXT: global_load_dwordx4 v[7:10], v[1:2], off offset:16
+; MUBUF-NEXT: global_load_dwordx4 v[11:14], v[1:2], off offset:24
----------------
jmmartinez wrote:
I'm a bit confused by this test case. The values stored in some registers will now overlap, right?
`v[7:10]` will contain `%src[16...32)` and `v[11:14]` will contain `%src[24...40)`, so `v[11:12]` should be equal to `v[9:10]`, right?
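
To make the arithmetic behind that question concrete, here is a small sketch that maps each destination register to the source byte range it would receive. It assumes one 32-bit dword per VGPR, filled in ascending address order; `register_bytes` is a hypothetical helper written for this illustration, not anything in LLVM.

```python
DWORD_BYTES = 4  # assumption: each VGPR holds one 32-bit dword


def register_bytes(first_vgpr, offset, dwords):
    """Map each VGPR index to the half-open %src byte range it is loaded from."""
    return {
        first_vgpr + i: (offset + i * DWORD_BYTES, offset + (i + 1) * DWORD_BYTES)
        for i in range(dwords)
    }


# global_load_dwordx4 v[7:10], v[1:2], off offset:16
second = register_bytes(first_vgpr=7, offset=16, dwords=4)
# global_load_dwordx4 v[11:14], v[1:2], off offset:24
third = register_bytes(first_vgpr=11, offset=24, dwords=4)

# Pairs of registers that end up holding the same source bytes.
duplicates = sorted(
    (a, b)
    for a, range_a in second.items()
    for b, range_b in third.items()
    if range_a == range_b
)

for a, b in duplicates:
    print(f"v{a} and v{b} both hold %src[{second[a][0]}...{second[a][1]})")
```

Under those assumptions the two loads share `%src[24...32)`, which is the overlap described above.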
https://github.com/llvm/llvm-project/pull/158036