[PATCH] D74524: [Scheduling] Improve memory ops cluster preparation
    Qiu Chaofan via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Sun Feb 16 19:12:02 PST 2020
    
    
  
qiucf marked 2 inline comments as done.
qiucf added inline comments.
================
Comment at: llvm/test/CodeGen/AArch64/overeager_mla_fusing.ll:12
+; CHECK-NEXT:    ldr q2, [x1, #96]
 ; CHECK-NEXT:    ldr q3, [x0, #96]
 ; CHECK-NEXT:    ldr x8, [x2, #48]
----------------
foad wrote:
> This looks suspicious. Why has it stopped clustering the two loads from x0, and the two loads from x1?
AArch64's `shouldClusterMemOps` requires the offset should be `N` and `N+1`. Here is `2` and `6`. They are not clustered in the log.
But the scheduler really has some issues about clustering. It doesn't know `SU(13)` and `SU(15)` can't be scheduled together (`SU(14)` must be in). This can be fixed by not creating this edge when `SUa` and `SUb` can't be scheduled together. We can do it in future patches.
================
Comment at: llvm/test/CodeGen/AMDGPU/callee-special-input-vgprs.ll:494
 ; GCN-DAG: v_mov_b32_e32 [[K:v[0-9]+]], 0x3e7{{$}}
+; GCN: s_add_u32 s32, s33, 0x400{{$}}
 ; GCN: buffer_store_dword [[K]], off, s[0:3], s33 offset:4
----------------
rampitec wrote:
> This is a regression I guess. A memory operation should always go before an independent ALU as it has higher latency.
Yes. This can and another regression in this file will be eliminated if we prevent cluster edge from being created when they can't be scheduled together, as above says.
Repository:
  rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74524/new/
https://reviews.llvm.org/D74524
    
    
More information about the llvm-commits
mailing list