[PATCH] D38164: [MachineScheduler] Favor instructions that do not increase pressure.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 22 00:22:21 PDT 2017


fhahn created this revision.
Herald added subscribers: javed.absar, kristof.beyls, tpr, aemerson.

I think all other things being equal, it could be beneficial to prioritize
candidates that do not increase register pressure, as this may uncover
new candidates that reduce register pressure.

I have been experimenting with using the MachineScheduler on ARM with
Cortex-A57, and this change helps to reduce spilling in some cases.
For the LLVM test-suite this change helps to reduce spilling in some cases.For
the test-suite and SPEC2k, it improves the geomean of the execution times
compared to the old scheduler from -0.82% (MachinScheduler for ARM on
Cortex-A57) to -1.03% (same as before, but with this patch) and eliminates a
few big outliners. For SPEC2k6, it improves the geomean of execution
times from -0.24% to -0.29%.

For AArch64 I did some benchmark runs comparing master against this
patch.
Cortex-A57:

- -0.62% geomean exec time for test-suite and SPEC2k
- -0.29% for SPEC2k6
- +0.19% SPEC2017 score

Cortex-A72:

- -0.29% geomean exec time for test-suite and SPEC2k

The impact of this change on the unit tests is quite small. The
modified test case in this patch uses 1 register less now, with the
order of instructions slightly changed. The earliest fcvt gets still
scheduled after 2 instructions, so latency wise the schedule should not
have regressed.

The following ADMGPU tests also fail:

- CodeGen/AMDGPU/attr-amdgpu-num-sgpr.ll (with  -mtriple=amdgcn--amdhsa -mcpu=fiji -amdgpu-spill-sgpr-to-smem=1) Expected: NumSGPRsForWavesPerEU: 9 Actual:   NumSGPRsForWavesPerEU: 14
- CodeGen/AMDGPU/schedule-regpressure-limit2.ll (with -march=amdgcn -mcpu=fiji -misched=gcn-max-occupancy-experimental Expected:  NumSgprs: {{[1-5][0-9]$}} Actual:    NumSGPRsForWavesPerEU: 78

I do not know anything about AMDGPU, so I cannot really say if that is
good or bad?

@jonpa this might be interesting to you too, as I gathered that you are
working on switching SystemZ to the MachineScheduler.


https://reviews.llvm.org/D38164

Files:
  lib/CodeGen/MachineScheduler.cpp
  test/CodeGen/AArch64/vector-fcopysign.ll


Index: test/CodeGen/AArch64/vector-fcopysign.ll
===================================================================
--- test/CodeGen/AArch64/vector-fcopysign.ll
+++ test/CodeGen/AArch64/vector-fcopysign.ll
@@ -94,22 +94,22 @@
 define <4 x float> @test_copysign_v4f32_v4f64(<4 x float> %a, <4 x double> %b) #0 {
 ; CHECK-LABEL: test_copysign_v4f32_v4f64:
 ; CHECK-NEXT:    mov s3, v0[1]
-; CHECK-NEXT:    movi.4s v4, #128, lsl #24
-; CHECK-NEXT:    fcvt s5, d1
-; CHECK-NEXT:    mov s6, v0[2]
-; CHECK-NEXT:    mov s7, v0[3]
-; CHECK-NEXT:    bit.16b v0, v5, v4
-; CHECK-NEXT:    fcvt s5, d2
-; CHECK-NEXT:    bit.16b v6, v5, v4
-; CHECK-NEXT:    mov d1, v1[1]
+; CHECK-NEXT:    mov d4, v1[1]
+; CHECK-NEXT:    fcvt s4, d4
+; CHECK-NEXT:    movi.4s v5, #128, lsl #24
 ; CHECK-NEXT:    fcvt s1, d1
-; CHECK-NEXT:    bit.16b v3, v1, v4
+; CHECK-NEXT:    bit.16b v3, v4, v5
+; CHECK-NEXT:    mov s4, v0[2]
+; CHECK-NEXT:    mov s6, v0[3]
+; CHECK-NEXT:    bit.16b v0, v1, v5
+; CHECK-NEXT:    fcvt s1, d2
+; CHECK-NEXT:    bit.16b v4, v1, v5
 ; CHECK-NEXT:    mov d1, v2[1]
 ; CHECK-NEXT:    fcvt s1, d1
 ; CHECK-NEXT:    ins.s v0[1], v3[0]
-; CHECK-NEXT:    ins.s v0[2], v6[0]
-; CHECK-NEXT:    bit.16b v7, v1, v4
-; CHECK-NEXT:    ins.s v0[3], v7[0]
+; CHECK-NEXT:    ins.s v0[2], v4[0]
+; CHECK-NEXT:    bit.16b v6, v1, v5
+; CHECK-NEXT:    ins.s v0[3], v6[0]
 ; CHECK-NEXT:    ret
   %tmp0 = fptrunc <4 x double> %b to <4 x float>
   %r = call <4 x float> @llvm.copysign.v4f32(<4 x float> %a, <4 x float> %tmp0)
Index: lib/CodeGen/MachineScheduler.cpp
===================================================================
--- lib/CodeGen/MachineScheduler.cpp
+++ lib/CodeGen/MachineScheduler.cpp
@@ -2774,6 +2774,19 @@
                  Reason)) {
     return true;
   }
+
+  // If one of the candidates does neither increase or decrease pressure, but
+  // the other does increase the pressure, go with it.
+  if (TryP.getUnitInc() == 0 && CandP.getUnitInc() > 0) {
+    TryCand.Reason = Reason;
+    return true;
+  }
+  if (TryP.getUnitInc() > 0 && CandP.getUnitInc() == 0) {
+    if (Cand.Reason > Reason)
+      Cand.Reason = Reason;
+    return true;
+  }
+
   // Do not compare the magnitude of pressure changes between top and bottom
   // boundary.
   if (Cand.AtTop != TryCand.AtTop)


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D38164.116307.patch
Type: text/x-patch
Size: 2292 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170922/0828dd08/attachment.bin>


More information about the llvm-commits mailing list