[PATCH] D72487: [AMDGPU] Fix bundle scheduling

Jay Foad via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 10 03:34:26 PST 2020


foad added a comment.

This is only relevant for post-ra scheduling because we don't have any bundles when we do pre-ra scheduling, right?

Doing it as a DAG mutation seems a bit late. One problem is that you set the latency for the bundle's succ, but not for the corresponding pred, so you see odd mismatches like this:

  SU(5):   BUNDLE implicit-def $sgpr6_sgpr7, implicit-def $sgpr6, implicit-def $sgpr7, implicit-def $scc
    # preds left       : 2
    # succs left       : 1
    # rdefs left       : 0
    Latency            : 3
    Depth              : 0
    Height             : 1
    Predecessors:
      SU(0): Anti Latency=0
      SU(0): Anti Latency=0
    Successors:
      SU(11): Data Latency=1 Reg=$sgpr6_sgpr7
  ...
  SU(11):   S_NOP 0, implicit killed $sgpr6_sgpr7, implicit $sgpr0_sgpr1_sgpr2_sgpr3, implicit $sgpr4, implicit killed $vgpr0_vgpr1_vgpr2_vgpr3
    # preds left       : 6
    # succs left       : 0
    # rdefs left       : 0
    Latency            : 1
    Depth              : 2
    Height             : 0
    Predecessors:
      SU(10): Data Latency=0 Reg=$vgpr0_vgpr1_vgpr2_vgpr3
      SU(9): Data Latency=0 Reg=$vgpr0_vgpr1_vgpr2_vgpr3
      SU(8): Data Latency=0 Reg=$vgpr0_vgpr1_vgpr2_vgpr3
      SU(7): Data Latency=0 Reg=$vgpr0_vgpr1_vgpr2_vgpr3
      SU(6): Data Latency=0 Reg=$sgpr4
      SU(5): Data Latency=0 Reg=$sgpr6_sgpr7

(Latency for succ of SU(5) is 1, but latency for pred of SU(11) is 0.)

Instead how about doing this and implementing it in `adjustSchedDependency` for AMDGPU?

  diff --git a/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp b/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
  index 96a1f86c3e0..ef5926e4f8f 100644
  --- a/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
  +++ b/llvm/lib/CodeGen/ScheduleDAGInstrs.cpp
  @@ -269,9 +269,9 @@ void ScheduleDAGInstrs::addPhysRegDataDeps(SUnit *SU, unsigned OperIdx) {
         if (!ImplicitPseudoDef && !ImplicitPseudoUse) {
           Dep.setLatency(SchedModel.computeOperandLatency(SU->getInstr(), OperIdx,
                                                           RegUse, UseOp));
  -        ST.adjustSchedDependency(SU, UseSU, Dep);
         } else
           Dep.setLatency(0);
  +      ST.adjustSchedDependency(SU, UseSU, Dep);
   
         UseSU->addPred(Dep);
       }

Also, just curious, why did you update so many tests? I only found 4 that were failing with the patch:

  Failing Tests (4):
      LLVM :: CodeGen/AMDGPU/llvm.amdgcn.ds.gws.barrier.ll
      LLVM :: CodeGen/AMDGPU/llvm.amdgcn.ds.gws.init.ll
      LLVM :: CodeGen/AMDGPU/misched-killflags.mir
      LLVM :: CodeGen/AMDGPU/mul24-pass-ordering.ll


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D72487/new/

https://reviews.llvm.org/D72487





More information about the llvm-commits mailing list