[PATCH] D120232: [SelectOpti][4/5] Loop Heuristics

Sotiris Apostolakis via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 28 08:27:41 PDT 2022


apostolakis added inline comments.


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:458
+  //    reducing the loop's critical path by some threshold; and
+  // ii) the cost of the select group is cheaper (in terms of latency) compared
+  //     to the select version.
----------------
davidxl wrote:
> Not clear about what the second condition actually means. What is the cost of select group?
Yes that was unclear. Clarified the second condition. 


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:468
+                                     L->getHeader()->getFirstNonPHI());
+    ORmissL << "Did not convert any select in the loop to branch due to "
+               "loop-level heuristics.";
----------------
davidxl wrote:
> Should this be missed optimization or Analysis. The message probably also needs more context to be useful.
I would prefer to stick to miss and pass remarks and not add analysis. It seems to me more natural and easier to know what to query for.
Changed though the code to emit more fine-grained and informative remarks for these cases.


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:663
+  if (Diff[1] < Scaled64::get(GainCycleThreshold) &&
+      Diff[1] * Scaled64::get(8) < LoopCost[1].PredCost)
+    return false;
----------------
davidxl wrote:
> Add a parameter?
Added.


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:668
+  // of the gain needs to be at least LoopGradientGain% (defaults to 25%).
+  if (Diff[1] > Diff[0] && (Diff[1] - Diff[0]) * Scaled64::get(100) <
+                               (LoopCost[1].PredCost - LoopCost[0].PredCost) *
----------------
davidxl wrote:
> Add explanation of this heuristic.
Added.


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:678
+
+bool SelectOptimize::computeLoopCosts(
+    const Loop *L, const SelectGroups &SIGroups,
----------------
davidxl wrote:
> Document the return value.
Done.


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:681
+    DenseMap<const Instruction *, CostInfo> &InstCostMap,
+    CostInfo LoopCost[2]) {
+  const auto &SIset = getSIset(SIGroups);
----------------
davidxl wrote:
> Perhaps passing by pointer explicitly?
Sure. Changed it.


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:685
+  // both predicated and non-predicated version.
+  const unsigned Iterations = 2;
+  for (unsigned Iter = 0; Iter < Iterations; ++Iter) {
----------------
davidxl wrote:
> why only 2 iterations?
Two iterations suffice to determine if the the loop's critical path involves loop-carried dependences and compute the rate that the critical path increases from one iteration to another.  


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:713
+        IPredCost += Scaled64::get(ILatency.getValue());
+        INonPredCost += Scaled64::get(ILatency.getValue());
+
----------------
davidxl wrote:
> Should this exclude the selection instruction itself for non-pred case?
It would be correct to exclude the select instructions from here but for the select instructions the INonPredCost is overwritten later on.  So, we can skip adding a conditional here. 


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:741
+
+          INonPredCost = PredictedPathCost + MispredictCost;
+        }
----------------
davidxl wrote:
> should it weighted with misprediction rate?
Actually it is weighted with the misprediction rate: 
MispredictCost = max(MispredictPenalty, CondCost) * MispredictRate



================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:745
+        InstCostMap[&I] = {IPredCost, INonPredCost};
+        MaxCost.PredCost = std::max(MaxCost.PredCost, IPredCost);
+        MaxCost.NonPredCost = std::max(MaxCost.NonPredCost, INonPredCost);
----------------
davidxl wrote:
> Should Sum (of cost in a SI group) be used instead of max?
This computes the critical path which essentially is the cost of the last instruction of the critical path (and thus the instruction with the highest latency). This is regardless of how many select groups are included in the loop. These groups may or may not be part of the path. 


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:787
+  Scaled64 MispredictCost =
+      std::max(Scaled64::get(MispredictPenalty), CondCost) *
+      Scaled64::get(MispredictRate);
----------------
davidxl wrote:
> Should it be scaled by CondCost instead of taking the max?
I do not think it should be scaled. The cost we want to compute are the cycles lost due to flushing the pipeline after a mispredict. If the CondCost is small, the operands of the condition are ready when needed and the MispredictPenalty (which depends on the pipeline depth and where the branch condition is determined in the pipeline) captures the mispredict cost . If the CondCost is very big, then the cycles lost end up being the latency of determining the condition, which is essentially CondCost. For in-between cases, the correct answer is not really the max but there is still some overlap. Therefore, max seemed a reasonable choice and it also seemed to be experimentally the best option among other versions I tested.


================
Comment at: llvm/lib/CodeGen/SelectOptimize.cpp:807
+      PredPathCost /= Scaled64::get(SumWeight);
+    }
+  }
----------------
davidxl wrote:
> Should there be an early return here?
Good catch! Yes, erased by mistake the early return when I converted to Scaled64.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120232/new/

https://reviews.llvm.org/D120232



More information about the llvm-commits mailing list