[PATCH] D74524: [Scheduling] Improve memory ops cluster preparation

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 12 23:11:58 PST 2020


rampitec added inline comments.


================
Comment at: llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll:494
 ; GCN-DAG: v_mov_b32_e32 [[K:v[0-9]+]], 0x3e7{{$}}
+; GCN: s_add_u32 s32, s33, 0x400{{$}}
 ; GCN: buffer_store_dword [[K]], off, s[0:3], s33 offset:4
----------------
This is a regression, I guess. A memory operation should always go before an independent ALU instruction, as it has higher latency.
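
To illustrate the latency argument, here is a minimal Python sketch of a toy in-order, single-issue model with two independent instructions. The latency numbers are made up for illustration and are not real GCN timings, and `finish_time` is a hypothetical helper, not anything from LLVM.

```python
def finish_time(order, latency):
    """Cycle at which the last result is available when one independent
    instruction issues per cycle, in the given order."""
    return max(issue_cycle + latency[op] for issue_cycle, op in enumerate(order))


# Assumed, illustrative latencies only (not measured GCN values).
latency = {"buffer_store_dword": 80, "s_add_u32": 1}

mem_first = finish_time(["buffer_store_dword", "s_add_u32"], latency)
alu_first = finish_time(["s_add_u32", "buffer_store_dword"], latency)

print("memory op first:", mem_first, "cycles")
print("ALU op first:   ", alu_first, "cycles")
```

In this toy model, issuing the memory operation first lets the short ALU instruction execute in the shadow of the long one, so the pair finishes one cycle earlier.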


================
Comment at: llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll:524
 ; GCN: buffer_store_dword v0, off, s[0:3], s32 offset:4
+; GCN: buffer_load_dword [[RELOAD_BYVAL:v[0-9]+]], off, s[0:3], s34{{$}}
 ; GCN: buffer_store_dword [[RELOAD_BYVAL]], off, s[0:3], s32{{$}}
----------------
And then loads should preferably go before stores. Loads have higher latency, and their results need to be consumed by some other instruction. So it looks like a regression to me as well.
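
A rough way to see why the load should come first is a toy in-order model in which an instruction stalls until its operands are ready. The Python sketch below is only an assumption-laden illustration: the latencies are invented, the instruction names are shorthand for the three lines in this check sequence, and `schedule_length` is a hypothetical helper, not part of LLVM.

```python
def schedule_length(order, latency, deps):
    """Toy in-order, single-issue model. Each instruction issues one cycle
    after the previous one, or later if it has to wait for an operand."""
    ready = {}        # instruction -> cycle at which its result is available
    next_issue = 0    # earliest cycle at which the next instruction may issue
    finish = 0
    for op in order:
        start = max([next_issue] + [ready[d] for d in deps.get(op, [])])
        ready[op] = start + latency[op]
        finish = max(finish, ready[op])
        next_issue = start + 1
    return finish


# Assumed, illustrative latencies only (not measured GCN values).
latency = {"store_v0": 60, "load_byval": 100, "store_byval": 60}
# The second store consumes the value produced by the load.
deps = {"store_byval": ["load_byval"]}

store_first = schedule_length(["store_v0", "load_byval", "store_byval"], latency, deps)
load_first = schedule_length(["load_byval", "store_v0", "store_byval"], latency, deps)

print("independent store first:", store_first, "cycles")
print("load first:             ", load_first, "cycles")
```

With these made-up numbers the dependent store waits on the load either way, so starting the load one slot earlier shortens the whole sequence by that one slot.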


================
Comment at: llvm/test/CodeGen/AMDGPU/captured-frame-index.ll:55
 ; GCN-DAG: v_mov_b32_e32 [[ZERO:v[0-9]+]], 4{{$}}
+; GCN: buffer_store_dword [[K]], off, s{{\[[0-9]+:[0-9]+\]}}, s{{[0-9]+}} offset:4{{$}}
 ; GCN: buffer_store_dword [[ZERO]], off, s{{\[[0-9]+:[0-9]+\]}}, s{{[0-9]+}} offset:4{{$}}
----------------
And then this is a progression, as two stores are scheduled together. It would be nice to understand whether they were actually clustered or whether it is a coincidence.


================
Comment at: llvm/test/CodeGen/AMDGPU/captured-frame-index.ll:71
+
+; GCN: buffer_store_dword [[K0]], off, s{{\[[0-9]+:[0-9]+\]}}, s{{[0-9]+}} offset:4{{$}}
 ; GCN: buffer_store_dword [[K1]], off, s{{\[[0-9]+:[0-9]+\]}}, s{{[0-9]+}} offset:2052{{$}}
----------------
Same here. It seems to help store clustering.


================
Comment at: llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll:64
+; GFX7-ALIGNED-NEXT:    v_mov_b32_e32 v3, s3
+; GFX7-ALIGNED-NEXT:    flat_store_short v[0:1], v4
+; GFX7-ALIGNED-NEXT:    flat_store_short v[2:3], v5
----------------
Stores are clustered again, here and below, which is a nice improvement.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74524/new/

https://reviews.llvm.org/D74524




