[PATCH] D74524: [Scheduling] Improve memory ops cluster preparation
Qiu Chaofan via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 16 19:12:02 PST 2020
qiucf marked 2 inline comments as done.
qiucf added inline comments.
================
Comment at: llvm/test/CodeGen/AArch64/overeager_mla_fusing.ll:12
+; CHECK-NEXT: ldr q2, [x1, #96]
; CHECK-NEXT: ldr q3, [x0, #96]
; CHECK-NEXT: ldr x8, [x2, #48]
----------------
foad wrote:
> This looks suspicious. Why has it stopped clustering the two loads from x0, and the two loads from x1?
AArch64's `shouldClusterMemOps` requires the offset should be `N` and `N+1`. Here is `2` and `6`. They are not clustered in the log.
But the scheduler really has some issues about clustering. It doesn't know `SU(13)` and `SU(15)` can't be scheduled together (`SU(14)` must be in). This can be fixed by not creating this edge when `SUa` and `SUb` can't be scheduled together. We can do it in future patches.
================
Comment at: llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll:494
; GCN-DAG: v_mov_b32_e32 [[K:v[0-9]+]], 0x3e7{{$}}
+; GCN: s_add_u32 s32, s33, 0x400{{$}}
; GCN: buffer_store_dword [[K]], off, s[0:3], s33 offset:4
----------------
rampitec wrote:
> This is a regression I guess. A memory operation should always go before an independent ALU as it has higher latency.
Yes. This can and another regression in this file will be eliminated if we prevent cluster edge from being created when they can't be scheduled together, as above says.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D74524/new/
https://reviews.llvm.org/D74524
More information about the llvm-commits
mailing list