[llvm] 5c7bbe3 - [MachinePipeliner] Refine the RecMII calculation
    Jinsong Ji via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Mon Apr 13 12:17:23 PDT 2020
    
    
  
Author: Lama
Date: 2020-04-13T19:17:15Z
New Revision: 5c7bbe3659a04c1d17deb3b50ab5b88204327842
URL: https://github.com/llvm/llvm-project/commit/5c7bbe3659a04c1d17deb3b50ab5b88204327842
DIFF: https://github.com/llvm/llvm-project/commit/5c7bbe3659a04c1d17deb3b50ab5b88204327842.diff
LOG: [MachinePipeliner] Refine the RecMII calculation
When there is more than one SDep between a node and one of its successor SUnits in the NodeSet, the current implementation sums the latencies of all those dependencies, which can produce a larger RecMII than necessary.
For example, consider the case where there is both a data dependency and an output dependency (with latency > 0) between successor nodes:
SU(1) inst1:
  successors:
    SU(2): out  latency = 1
    SU(2): data latency = 1
SU(2) inst2:
  successors:
    SU(3): out  latency = 1
    SU(3): data latency = 1
SU(3) inst3:
  successors:
    SU(1): out  latency = 1
    SU(1): data latency = 1
The NodeSet latency returned would be 6, whereas it could be 3 if we take the max latency for each successor SUnit.
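The difference between the two calculations can be sketched outside of LLVM. This is a minimal illustration, not the pipeliner's code: `Edge`, `Graph`, `sumAllEdges`, `sumMaxPerSuccessor` and `exampleRecurrence` are hypothetical names, and plain `std::map` stands in for the real `SUnit`/`SDep`/`NodeSet` types.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for an SDep: successor node id plus edge latency.
struct Edge {
  int Succ;
  unsigned Latency;
};

// Hypothetical stand-in for a NodeSet: node id -> outgoing dependency edges.
using Graph = std::map<int, std::vector<Edge>>;

// Previous behaviour: add up the latency of every edge that stays in the set.
unsigned sumAllEdges(const Graph &G) {
  unsigned Latency = 0;
  for (const auto &Entry : G)
    for (const Edge &E : Entry.second)
      if (G.count(E.Succ))
        Latency += E.Latency;
  return Latency;
}

// Refined behaviour: for each node, keep only the largest latency per
// distinct successor, then add those maxima.
unsigned sumMaxPerSuccessor(const Graph &G) {
  unsigned Latency = 0;
  for (const auto &Entry : G) {
    std::map<int, unsigned> MaxLatency;
    for (const Edge &E : Entry.second) {
      if (!G.count(E.Succ))
        continue; // edge leaves the set, ignore it
      unsigned &Slot = MaxLatency[E.Succ]; // value-initialized to 0
      Slot = std::max(Slot, E.Latency);
    }
    for (const auto &SuccLatency : MaxLatency)
      Latency += SuccLatency.second;
  }
  return Latency;
}

// The three-node recurrence from the example: each node has an output
// and a data dependency, both with latency 1, on the next node.
Graph exampleRecurrence() {
  Graph G;
  G[1] = {{2, 1}, {2, 1}};
  G[2] = {{3, 1}, {3, 1}};
  G[3] = {{1, 1}, {1, 1}};
  return G;
}
```

On `exampleRecurrence()`, `sumAllEdges` gives 6 and `sumMaxPerSuccessor` gives 3, matching the numbers above.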
In general this can be extended to finding the shortest path in the recurrence.
Thoughts?
Unfortunately I had a hard time creating a test for this in Hexagon/PowerPC, so help would be appreciated.
Reviewed By: bcahoon
Differential Revision: https://reviews.llvm.org/D75918
Added: 
    
Modified: 
    llvm/include/llvm/CodeGen/MachinePipeliner.h
Removed: 
    
################################################################################
diff --git a/llvm/include/llvm/CodeGen/MachinePipeliner.h b/llvm/include/llvm/CodeGen/MachinePipeliner.h
index 24e85a953d47..49276fb1a94d 100644
--- a/llvm/include/llvm/CodeGen/MachinePipeliner.h
+++ b/llvm/include/llvm/CodeGen/MachinePipeliner.h
@@ -330,10 +330,22 @@ class NodeSet {
   NodeSet() = default;
   NodeSet(iterator S, iterator E) : Nodes(S, E), HasRecurrence(true) {
     Latency = 0;
-    for (unsigned i = 0, e = Nodes.size(); i < e; ++i)
-      for (const SDep &Succ : Nodes[i]->Succs)
-        if (Nodes.count(Succ.getSUnit()))
-          Latency += Succ.getLatency();
+    for (unsigned i = 0, e = Nodes.size(); i < e; ++i) {
+      DenseMap<SUnit *, unsigned> SuccSUnitLatency;
+      for (const SDep &Succ : Nodes[i]->Succs) {
+        auto SuccSUnit = Succ.getSUnit();
+        if (!Nodes.count(SuccSUnit))
+          continue;
+        unsigned CurLatency = Succ.getLatency();
+        unsigned MaxLatency = 0;
+        if (SuccSUnitLatency.count(SuccSUnit))
+          MaxLatency = SuccSUnitLatency[SuccSUnit];
+        if (CurLatency > MaxLatency)
+          SuccSUnitLatency[SuccSUnit] = CurLatency;
+      }
+      for (auto SUnitLatency : SuccSUnitLatency)
+        Latency += SUnitLatency.second;
+    }
   }
 
   bool insert(SUnit *SU) { return Nodes.insert(SU); }
        
    
    