[PATCH] D38164: [MachineScheduler] Favor instructions that do not increase pressure.
Florian Hahn via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Sep 22 00:22:21 PDT 2017
fhahn created this revision.
Herald added subscribers: javed.absar, kristof.beyls, tpr, aemerson.
I think all other things being equal, it could be beneficial to prioritize
candidates that do not increase register pressure, as this may uncover
new candidates that reduce register pressure.
I have been experimenting with using the MachineScheduler on ARM with
Cortex-A57, and this change helps to reduce spilling in some cases.
For the LLVM test-suite this change helps to reduce spilling in some cases.For
the test-suite and SPEC2k, it improves the geomean of the execution times
compared to the old scheduler from -0.82% (MachinScheduler for ARM on
Cortex-A57) to -1.03% (same as before, but with this patch) and eliminates a
few big outliners. For SPEC2k6, it improves the geomean of execution
times from -0.24% to -0.29%.
For AArch64 I did some benchmark runs comparing master against this
patch.
Cortex-A57:
- -0.62% geomean exec time for test-suite and SPEC2k
- -0.29% for SPEC2k6
- +0.19% SPEC2017 score
Cortex-A72:
- -0.29% geomean exec time for test-suite and SPEC2k
The impact of this change on the unit tests is quite small. The
modified test case in this patch uses 1 register less now, with the
order of instructions slightly changed. The earliest fcvt gets still
scheduled after 2 instructions, so latency wise the schedule should not
have regressed.
The following ADMGPU tests also fail:
- CodeGen/AMDGPU/attr-amdgpu-num-sgpr.ll (with -mtriple=amdgcn--amdhsa -mcpu=fiji -amdgpu-spill-sgpr-to-smem=1) Expected: NumSGPRsForWavesPerEU: 9 Actual: NumSGPRsForWavesPerEU: 14
- CodeGen/AMDGPU/schedule-regpressure-limit2.ll (with -march=amdgcn -mcpu=fiji -misched=gcn-max-occupancy-experimental Expected: NumSgprs: {{[1-5][0-9]$}} Actual: NumSGPRsForWavesPerEU: 78
I do not know anything about AMDGPU, so I cannot really say if that is
good or bad?
@jonpa this might be interesting to you too, as I gathered that you are
working on switching SystemZ to the MachineScheduler.
https://reviews.llvm.org/D38164
Files:
lib/CodeGen/MachineScheduler.cpp
test/CodeGen/AArch64/vector-fcopysign.ll
Index: test/CodeGen/AArch64/vector-fcopysign.ll
===================================================================
--- test/CodeGen/AArch64/vector-fcopysign.ll
+++ test/CodeGen/AArch64/vector-fcopysign.ll
@@ -94,22 +94,22 @@
define <4 x float> @test_copysign_v4f32_v4f64(<4 x float> %a, <4 x double> %b) #0 {
; CHECK-LABEL: test_copysign_v4f32_v4f64:
; CHECK-NEXT: mov s3, v0[1]
-; CHECK-NEXT: movi.4s v4, #128, lsl #24
-; CHECK-NEXT: fcvt s5, d1
-; CHECK-NEXT: mov s6, v0[2]
-; CHECK-NEXT: mov s7, v0[3]
-; CHECK-NEXT: bit.16b v0, v5, v4
-; CHECK-NEXT: fcvt s5, d2
-; CHECK-NEXT: bit.16b v6, v5, v4
-; CHECK-NEXT: mov d1, v1[1]
+; CHECK-NEXT: mov d4, v1[1]
+; CHECK-NEXT: fcvt s4, d4
+; CHECK-NEXT: movi.4s v5, #128, lsl #24
; CHECK-NEXT: fcvt s1, d1
-; CHECK-NEXT: bit.16b v3, v1, v4
+; CHECK-NEXT: bit.16b v3, v4, v5
+; CHECK-NEXT: mov s4, v0[2]
+; CHECK-NEXT: mov s6, v0[3]
+; CHECK-NEXT: bit.16b v0, v1, v5
+; CHECK-NEXT: fcvt s1, d2
+; CHECK-NEXT: bit.16b v4, v1, v5
; CHECK-NEXT: mov d1, v2[1]
; CHECK-NEXT: fcvt s1, d1
; CHECK-NEXT: ins.s v0[1], v3[0]
-; CHECK-NEXT: ins.s v0[2], v6[0]
-; CHECK-NEXT: bit.16b v7, v1, v4
-; CHECK-NEXT: ins.s v0[3], v7[0]
+; CHECK-NEXT: ins.s v0[2], v4[0]
+; CHECK-NEXT: bit.16b v6, v1, v5
+; CHECK-NEXT: ins.s v0[3], v6[0]
; CHECK-NEXT: ret
%tmp0 = fptrunc <4 x double> %b to <4 x float>
%r = call <4 x float> @llvm.copysign.v4f32(<4 x float> %a, <4 x float> %tmp0)
Index: lib/CodeGen/MachineScheduler.cpp
===================================================================
--- lib/CodeGen/MachineScheduler.cpp
+++ lib/CodeGen/MachineScheduler.cpp
@@ -2774,6 +2774,19 @@
Reason)) {
return true;
}
+
+ // If one of the candidates does neither increase or decrease pressure, but
+ // the other does increase the pressure, go with it.
+ if (TryP.getUnitInc() == 0 && CandP.getUnitInc() > 0) {
+ TryCand.Reason = Reason;
+ return true;
+ }
+ if (TryP.getUnitInc() > 0 && CandP.getUnitInc() == 0) {
+ if (Cand.Reason > Reason)
+ Cand.Reason = Reason;
+ return true;
+ }
+
// Do not compare the magnitude of pressure changes between top and bottom
// boundary.
if (Cand.AtTop != TryCand.AtTop)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: D38164.116307.patch
Type: text/x-patch
Size: 2292 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20170922/0828dd08/attachment.bin>
More information about the llvm-commits
mailing list