[llvm] [AMDGPU] Enable unaligned scratch accesses (PR #110219)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Wed Oct 2 00:10:44 PDT 2024
================
@@ -2432,23 +2432,93 @@ define void @store_load_i64_unaligned(ptr addrspace(5) nocapture %arg) {
; GFX12-NEXT: s_wait_samplecnt 0x0
; GFX12-NEXT: s_wait_bvhcnt 0x0
; GFX12-NEXT: s_wait_kmcnt 0x0
-; GFX12-NEXT: v_mov_b32_e32 v1, 15
-; GFX12-NEXT: v_mov_b32_e32 v2, 0
+; GFX12-NEXT: v_dual_mov_b32 v1, 15 :: v_dual_mov_b32 v2, 0
; GFX12-NEXT: s_wait_storecnt 0x0
-; GFX12-NEXT: scratch_store_b64 v0, v[1:2], off scope:SCOPE_SYS
+; GFX12-NEXT: scratch_store_b8 v0, v1, off scope:SCOPE_SYS
; GFX12-NEXT: s_wait_storecnt 0x0
-; GFX12-NEXT: scratch_load_b64 v[0:1], v0, off scope:SCOPE_SYS
+; GFX12-NEXT: scratch_store_b8 v0, v2, off offset:1 scope:SCOPE_SYS
----------------
arsenm wrote:
I don't think there was any change here for gfx12.
https://github.com/llvm/llvm-project/pull/110219