[PATCH] D32783: [PartialInlining] Add frequency based cost analysis

Wed May 3 18:00:53 PDT 2017

eraman added a comment.

I am not fully sure if this cost model - first check if the function is a suitable candidate for outlining and then use inliner's cost analysis to inline the original function to its callsites - is the right one.  Specifically, I'm thinking if both should be done on a per-callsite basis.

================
Comment at: lib/Transforms/IPO/PartialInlining.cpp:396
+          continue;
+        OutlinedRegionSize += InlineConstants::InstrCost;
+      }
----------------
This should ideally be in InlineCost. Not all instructions have the same cost (eg. switch). At the least it should be documented here.

================
Comment at: lib/Transforms/IPO/PartialInlining.cpp:413
+  int SizeCost;
+  int OutlinedRegionSize;
+
----------------
nit: This is also in multiples of InstrCost like SizeCost above. Using a consistent naming will make this more readable.

================
Comment at: lib/Transforms/IPO/PartialInlining.cpp:443
+  // Now compare with weighted runtime cost:
+  LoopInfo LI{DominatorTree(*F)};
+  BranchProbabilityInfo BPI(*F, LI);
----------------
I wonder if it is better to first clone the function and then call this isPartialInliningBeneficial. We might end up cloning it unnecessarily, but could reuse all these analyses which are computed on the duplicate function later.

================
Comment at: lib/Transforms/IPO/PartialInlining.cpp:453
+  int NonWeightedSavings = getCallsiteCost(CS, DL);
+  int Denom = (NonWeightedSavings > NonWeightedRcost ? NonWeightedSavings
+                                                     : NonWeightedRcost);
----------------
std::max?

================
Comment at: lib/Transforms/IPO/PartialInlining.cpp:458
+  // Weighted saving is smaller than weighted cost, return false
+  if (EntryFreq * NormWeightedSavings < OutlinedCallFreq * NormWeightedRcost) {
+    ORE.emit(OptimizationRemarkAnalysis(DEBUG_TYPE, "OutliningCallcostTooHigh",
----------------
NonWeightedSavings above only includes the savings due to elimination of call and arg setup. Since inlining also can eliminate instructions in a given context, we are likely undercounting the savings here.

================
Comment at: lib/Transforms/Utils/CodeExtractor.cpp:701
+  // Branch cost:
+  RCost += NumExitBlocks;
+
----------------
This should be multiplied by InstrCost.

================
Comment at: lib/Transforms/Utils/CodeExtractor.cpp:705
+  // Now the cost of calling the outlined function itself:
+  RCost += InlineConstants::CallPenalty;
+  SCost += InlineConstants::InstrCost;
----------------
This is different from how it is used in InlineCost. In inline cost analysis, there is no separate static and runtime costs and even CallPenalty is meant to model static cost (saving and restoring registers etc). It is clear that they need to be modeled separately, but for now I prefer replicating what is in InlineCost.cpp and may be assign both the same value. 

https://reviews.llvm.org/D32783