[llvm] r319951 - [MachineCombiner] Add up latencies of all instructions in new pattern.

Florian Hahn via llvm-commits llvm-commits at lists.llvm.org
Wed Dec 6 12:27:34 PST 2017


Author: fhahn
Date: Wed Dec  6 12:27:33 2017
New Revision: 319951

URL: http://llvm.org/viewvc/llvm-project?rev=319951&view=rev
Log:
[MachineCombiner] Add up latencies of all instructions in new pattern.

Summary:
When calculating the RootLatency, we add up all the latencies of the
deleted instructions. But for NewRootLatency we only add the latency of
the new root instruction, ignoring the latencies of the other
instructions inserted. This leads the combiner to underestimate the cost
of patterns which add multiple instructions. This patch fixes that by
summing up the latencies of all new instructions. For NewRootNode, the
more complex getLatency function is used.
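
As a rough illustration of the difference, here is a toy sketch. It is not
the MachineCombiner code: ToyInstr, rootOnlyLatency and summedLatency are
hypothetical names, and the new root is assumed to be the last inserted
instruction, as the loop bound in the patch suggests. The real code queries
TSchedModel.computeInstrLatency and getLatency rather than a stored field.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a machine instruction; only its latency matters.
struct ToyInstr {
  unsigned Latency;
};

// Simplified old behaviour: only the new root (assumed to be the last
// inserted instruction) contributes to the estimate.
unsigned rootOnlyLatency(const std::vector<ToyInstr> &InsInstrs) {
  assert(!InsInstrs.empty() && "pattern must insert at least one instruction");
  return InsInstrs.back().Latency;
}

// Simplified new behaviour: every inserted instruction contributes, which
// treats the inserted instructions as one dependent chain.
unsigned summedLatency(const std::vector<ToyInstr> &InsInstrs) {
  unsigned Total = 0;
  for (const ToyInstr &I : InsInstrs)
    Total += I.Latency;
  return Total;
}
```

For a pattern inserting three instructions with made-up latencies 1, 2 and 4,
rootOnlyLatency gives 4 while summedLatency gives 7, so the old estimate made
the new pattern look cheaper than it is.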

Note that we could be slightly more precise than just summing up
all latencies. For example, consider a pattern like

    r1 = INS1 ..
    r2 = INS2 ..
    r3 = INS3 r1, r2

I think in some other places, the total latency of the pattern would be
estimated as lat(INS3) + max(lat(INS1), lat(INS2)). If you consider
that worth changing, I think it would be best to do in a follow-up
patch.
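
The more precise estimate mentioned above could look roughly like the
following sketch, which is not part of this patch. ToyNode and
criticalPathLatency are hypothetical names, and the pattern is assumed to be
listed in dependency order with the root last. It computes the longest
dependency chain ending at the root instead of the plain sum.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction in the new pattern.
struct ToyNode {
  unsigned Latency;
  std::vector<unsigned> Operands; // indices of earlier nodes this one reads
};

// Longest dependency chain ending at the last node (assumed to be the root).
unsigned criticalPathLatency(const std::vector<ToyNode> &Pattern) {
  std::vector<unsigned> Finish(Pattern.size(), 0);
  for (std::size_t I = 0; I < Pattern.size(); ++I) {
    unsigned Start = 0;
    for (unsigned Op : Pattern[I].Operands) {
      assert(Op < I && "operands must refer to earlier nodes");
      // A node can only start once its slowest operand has finished.
      Start = std::max(Start, Finish[Op]);
    }
    Finish[I] = Start + Pattern[I].Latency;
  }
  return Pattern.empty() ? 0 : Finish.back();
}
```

With made-up latencies lat(INS1) = 2, lat(INS2) = 5 and lat(INS3) = 1, this
gives 1 + max(2, 5) = 6, where the plain sum gives 8. For a strictly
dependent chain the two estimates agree.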

Reviewers: Gerolf, sebpop, spop, fhahn

Reviewed By: fhahn

Subscribers: evandro, llvm-commits

Differential Revision: https://reviews.llvm.org/D40307

Modified:
    llvm/trunk/lib/CodeGen/MachineCombiner.cpp

Modified: llvm/trunk/lib/CodeGen/MachineCombiner.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineCombiner.cpp?rev=319951&r1=319950&r2=319951&view=diff
==============================================================================
--- llvm/trunk/lib/CodeGen/MachineCombiner.cpp (original)
+++ llvm/trunk/lib/CodeGen/MachineCombiner.cpp Wed Dec  6 12:27:33 2017
@@ -282,9 +282,16 @@ bool MachineCombiner::improvesCriticalPa
   // of the original code sequence. This may allow the transform to proceed
   // even if the instruction depths (data dependency cycles) become worse.
 
-  unsigned NewRootLatency = getLatency(Root, NewRoot, BlockTrace);
-  unsigned RootLatency = 0;
+  // Account for the latency of the inserted and deleted instructions by
+  // adding up their latencies. This assumes that the inserted and deleted
+  // instructions are dependent instruction chains, which might not hold
+  // in all cases.
+  unsigned NewRootLatency = 0;
+  for (unsigned i = 0; i < InsInstrs.size() - 1; i++)
+    NewRootLatency += TSchedModel.computeInstrLatency(InsInstrs[i]);
+  NewRootLatency += getLatency(Root, NewRoot, BlockTrace);
 
+  unsigned RootLatency = 0;
   for (auto I : DelInstrs)
     RootLatency += TSchedModel.computeInstrLatency(I);
 