[PATCH] D74524: [Scheduling] Improve memory ops cluster preparation

Qiu Chaofan via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Feb 16 19:12:02 PST 2020


qiucf marked 2 inline comments as done.
qiucf added inline comments.


================
Comment at: llvm/test/CodeGen/AArch64/overeager_mla_fusing.ll:12
+; CHECK-NEXT:    ldr q2, [x1, #96]
 ; CHECK-NEXT:    ldr q3, [x0, #96]
 ; CHECK-NEXT:    ldr x8, [x2, #48]
----------------
foad wrote:
> This looks suspicious. Why has it stopped clustering the two loads from x0, and the two loads from x1?
AArch64's `shouldClusterMemOps` requires the offset should be `N` and `N+1`. Here is `2` and `6`. They are not clustered in the log.

But the scheduler really has some issues about clustering. It doesn't know `SU(13)` and `SU(15)` can't be scheduled together (`SU(14)` must be in). This can be fixed by not creating this edge when `SUa` and `SUb` can't be scheduled together. We can do it in future patches.


================
Comment at: llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll:494
 ; GCN-DAG: v_mov_b32_e32 [[K:v[0-9]+]], 0x3e7{{$}}
+; GCN: s_add_u32 s32, s33, 0x400{{$}}
 ; GCN: buffer_store_dword [[K]], off, s[0:3], s33 offset:4
----------------
rampitec wrote:
> This is a regression I guess. A memory operation should always go before an independent ALU as it has higher latency.
Yes. This can and another regression in this file will be eliminated if we prevent cluster edge from being created when they can't be scheduled together, as above says.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74524/new/

https://reviews.llvm.org/D74524





More information about the llvm-commits mailing list