[LLVMdev] MISched: What does it mean when PressureChange objects are not valid?

Tom Stellard tom at stellard.net
Fri Apr 24 08:11:58 PDT 2015


Hi,

I've been trying to debug an issue where the scheduler does not hide
latency well and schedules an ALU instruction before a load.

v_add_i32_e32 v0, s10, v0     <-- This should be scheduled after the load.
s_load_dwordx4 s[0:3], s[8:9], 0x4
s_waitcnt lgkmcnt(0)
buffer_load_format_xyzw v[0:3], v0, s[0:3], 0 idxen
v_mov_b32_e32 v4, 0
s_waitcnt vmcnt(0)
exp 15, 32, 0, 0, 0, v0, v1, v4, v4
s_endpgm

v_add_i32_e32 is scheduled first because the tryPressure call checking
CriticalMax returns true.

The CriticalMax PressureChange for v_add_i32_e32 is invalid, which gives
it a higher rank than s_load_dwordx4.  I'm wondering what it means to
have an invalid PressureChange value for CriticalMax, and why a candidate
with an invalid PressureChange is always scheduled first.  For more
context, here is some debug output from the machine scheduler:

SU(3):   %vreg20<def> = V_ADD_I32_e32 %vreg5, %vreg7, %EXEC<imp-use>,
%VCC<imp-def,dead>; VGPR_32:%vreg20,%vreg7 SGPR_32:%vreg5
  # preds left       : 2
  # succs left       : 1
  # rdefs left       : 0
  Latency            : 1
  Depth              : 0
  Height             : 451
  Predecessors:
   val SU(1): Latency=0 Reg=%vreg5
   val SU(0): Latency=0 Reg=%vreg7
  Successors:
   val SU(5): Latency=1 Reg=%vreg20

SU(4):   %vreg13<def> = S_LOAD_DWORDX4_IMM %vreg4, 4;
mem:LD16[%3(addrspace=2)](tbaa=<0x1631310>) SReg_128:%vreg13
SReg_64:%vreg4
  # preds left       : 1
  # succs left       : 1
  # rdefs left       : 0
  Latency            : 10
  Depth              : 0
  Height             : 460
  Predecessors:
   val SU(2): Latency=0 Reg=%vreg4
  Successors:
   val SU(5): Latency=10 Reg=%vreg13

===tryCandidate( Cand = 3, tryCand = 4)===
biasPhysRegCopy:
tryPressure Execess:
+tryPressure()
TryCand = 4, Cand = 3
TryRank = 65535 CandRank = 65535
Both candidates affect the same set
tryPressure CriticalMax:
+tryPressure()
TryCand = 4, Cand = 3
TryRank = 65535 CandRank = 65535
Both candidates affect the same set
tryLatency:
tryLess, getLatencyStallCycles:
tryGreater, cluster:
tryLess, getWeakLeft:
tryPression CurrentMax:
+tryPressure()
TryCand = 4, Cand = 3
TryRank = 12 CandRank = 65535
tryGreater
Pick Top REG-MAX   
Scheduling SU(3) %vreg20<def> = V_ADD_I32_e32 %vreg5, %vreg7,
%EXEC<imp-use>, %VCC<imp-def,dead>; VGPR_32:%vreg20,%vreg7
SGPR_32:%vreg5
  Ready @0c
  HWVALU +1x3255u
  *** Max MOps 1 at cycle 0
Cycle: 1 TopQ.A
TopQ.A @1c
  Retired: 1
  Executed: 1c
  Critical: 1c, 1 MOps
  ExpectedLatency: 0c
  - Latency limited.
TopQ.A: 6 4 
  SU(6) ORDER                             
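
The ranks in the tryPressure output above (12 versus 65535) can be
reproduced with a small standalone model. This is only a sketch, not the
scheduler's code. It assumes that a pressure change stores its set as
ID + 1 with 0 meaning "invalid", and that the rank is that value minus one,
masked to 16 bits. The names PressureChangeModel, tryOutranksCand and
checkLoggedRanks are hypothetical, and the UnitInc value used for SU(4) is
made up, since the log does not print it.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of a pressure-change record (hypothetical, not LLVM's class).
struct PressureChangeModel {
  uint16_t PSetID;  // pressure set ID + 1; 0 marks the record as invalid
  int16_t UnitInc;  // change in pressure units

  PressureChangeModel() : PSetID(0), UnitInc(0) {}
  PressureChangeModel(unsigned PSet, int Inc)
      : PSetID(static_cast<uint16_t>(PSet + 1)),
        UnitInc(static_cast<int16_t>(Inc)) {}

  bool isValid() const { return PSetID > 0; }

  // Rank used for comparison: the set ID, or 65535 when the record is
  // invalid, because (0 - 1) wraps around under the 16-bit mask.
  unsigned getPSetOrMax() const { return (PSetID - 1u) & 0xFFFFu; }
};

// Models the "tryGreater" step from the log: returns true when the new
// candidate (Try) outranks the currently selected one (Cand).
bool tryOutranksCand(const PressureChangeModel &Try,
                     const PressureChangeModel &Cand) {
  return Try.getPSetOrMax() > Cand.getPSetOrMax();
}

// Replays the CurrentMax comparison from the debug output.
void checkLoggedRanks() {
  PressureChangeModel Load(12, 1);  // SU(4): valid, set 12 (UnitInc assumed)
  PressureChangeModel Add;          // SU(3): invalid pressure change

  assert(Load.getPSetOrMax() == 12u);    // TryRank  = 12
  assert(Add.getPSetOrMax() == 65535u);  // CandRank = 65535
  assert(!tryOutranksCand(Load, Add));   // SU(3) stays selected
}
```

Under these assumptions, an invalid record always compares as the largest
possible rank, so a greater-than comparison on rank alone keeps SU(3) ahead
of SU(4), matching the "Pick Top REG-MAX" line in the log.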

-Tom 


