[PATCH] D19401: MachineScheduler: Fully compare top/bottom candidates

Fri Jun 24 13:32:29 PDT 2016

MatzeB updated this revision to Diff 61825.
MatzeB added a comment.

Investigating the scheduler some more, I realized that when scheduling bidirectionally we often hit situations in which there is no clearly good choice at either the top or the bottom boundary. Picking something from the top boundary just because a heuristic rule of a "tie-breaking nature" tells us to is often counterproductive: we frequently end up in a situation later where that node would have been useful to reduce register pressure at the bottom boundary.

This update tweaks the heuristic so that we only pick something from the top boundary if it is a good choice in itself (= it is the only choice, register pressure improves, cluster edges are respected, or physreg copies are kept early). In cases where only "tie-breaking" rules apply (= register pressure increases less than with another node, weak edges are not fulfilled, nodes are in program order, ...) we stay with the choice from the bottom boundary.
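To make the rule above concrete, here is a minimal standalone sketch of the decision as described in this comment. It is not the code from the patch: the `Reason` enum, `Candidate`, `Pick`, `isGoodInItself` and `pickBidirectional` are hypothetical names invented for illustration, and the real scheduler tracks more reasons and state than this.

```cpp
#include <cassert>

// Hypothetical, simplified list of reasons a candidate was selected.
// The first group are "good in itself"; the second group are tie-breakers.
enum class Reason {
  None,
  // Good choices in themselves:
  Only1,               // the only available choice
  PhysRegCopy,         // keeps a physreg copy early
  RegPressureImproves, // register pressure improves
  Cluster,             // respects a cluster edge
  // Tie-breaking rules only:
  RegPressureLessBad,  // pressure increases less than with another node
  Weak,                // weak edge consideration
  NodeOrder            // plain program order
};

struct Candidate {
  int nodeId;
  Reason reason;
};

enum class Boundary { Top, Bottom };

struct Pick {
  Boundary from;
  int nodeId;
};

// True if the reason makes the candidate worth picking on its own merits.
bool isGoodInItself(Reason r) {
  switch (r) {
  case Reason::Only1:
  case Reason::PhysRegCopy:
  case Reason::RegPressureImproves:
  case Reason::Cluster:
    return true;
  default:
    return false;
  }
}

// Take the top candidate only when it is a good choice in itself;
// otherwise stay with the bottom candidate.
Pick pickBidirectional(const Candidate &top, const Candidate &bot) {
  assert(bot.reason != Reason::None && "bottom boundary must offer a candidate");
  if (isGoodInItself(top.reason))
    return {Boundary::Top, top.nodeId};
  return {Boundary::Bottom, bot.nodeId};
}
```

With this sketch, a top candidate chosen only because of program order loses to the bottom candidate, while a top candidate that respects a cluster edge is still taken from the top.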

This restores the quality of the reg-usage.ll example from Tom, and seems to be beneficial in general. On AArch64 (the only other target with bidirectional scheduling enabled) I measure a 2% improvement in 252.eon, a 1% improvement in 401.bzip2, and improvements in a handful of smaller benchmarks, without any regressions (outside of noise).

Repository:
  rL LLVM

http://reviews.llvm.org/D19401

Files:
  include/llvm/CodeGen/MachineScheduler.h
  lib/CodeGen/MachineScheduler.cpp
  test/CodeGen/AArch64/arm64-convert-v4f64.ll
  test/CodeGen/AArch64/bitreverse.ll
  test/CodeGen/AArch64/cxx-tlscc.ll
  test/CodeGen/AArch64/vcvt-oversize.ll
  test/CodeGen/AArch64/vector-fcopysign.ll
  test/CodeGen/AMDGPU/and.ll
  test/CodeGen/AMDGPU/atomic_cmp_swap_local.ll
  test/CodeGen/AMDGPU/ctpop64.ll
  test/CodeGen/AMDGPU/ds_read2_offset_order.ll
  test/CodeGen/AMDGPU/ds_read2st64.ll
  test/CodeGen/AMDGPU/fneg-fabs.f64.ll
  test/CodeGen/AMDGPU/indirect-addressing-si.ll
  test/CodeGen/AMDGPU/insert_vector_elt.ll
  test/CodeGen/AMDGPU/llvm.AMDGPU.rsq.clamped.f64.ll
  test/CodeGen/AMDGPU/llvm.amdgcn.rsq.clamp.ll
  test/CodeGen/AMDGPU/local-memory-two-objects.ll
  test/CodeGen/AMDGPU/move-addr64-rsrc-dead-subreg-writes.ll
  test/CodeGen/AMDGPU/sra.ll
  test/CodeGen/PowerPC/ppc-shrink-wrapping.ll
  test/CodeGen/PowerPC/ppc64-byval-align.ll
