[PATCH] D119880: [FuncSpec] Support function specialization across multiple arguments.

Alexandros Lamprineas via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 1 04:42:47 PST 2022


labrinea added a comment.

> In D119880#3337134 <https://reviews.llvm.org/D119880#3337134>, @SjoerdMeijer wrote:
>  These are not arbitrary restrictions, but were by design, as I mentioned in an inline comment. For the former, this was fundamental to how the cost model used to work. For the latter, the candidates were sorted first on profitability, then the candidates with the least gain were disregarded. Again, this was by design, to manage the number of specialisations and compile times. Also, this patch introduces an arbitrary heuristic: `MinGainThreshold`.

I'll give an example to make the problem clearer. Say function `foo` has four candidate specializations with bonuses 10, 15, 20, 25 and cost 5 (assume the penalty is zero at this point), and function `bar` has four candidates with bonuses 20, 25, 30, 35 and cost 20 (it would have been 5 without the penalty). The corresponding gains are 5, 10, 15, 20 for `foo` and 0, 5, 10, 15 for `bar`. With `MaxClonesThreshold = 2` we reject the first two candidates of each function. With the new cost model the gains become 5, 10, 15, 20 for `foo` (same as before) and 15, 20, 25, 30 for `bar`. With `MinGainThreshold = 20` we reject the first three candidates of `foo` and the first candidate of `bar`. As a result we end up with the same number of specializations as before, but we have kept the most profitable ones.

`MinGainThreshold` is a way to control the code size increase as well as the compilation times without having to sort all the specializations (of all functions, not per function). Its value was decided empirically, as I explained in my previous comment. The existing cost model considers everything with a gain above zero profitable; `MinGainThreshold` allows fine-tuning that cutoff. If we preferred sorting all the specializations by gain, we would then need to decide how many of them to keep based on some heuristic (a percentage maybe?), which is more or less the same problem as deciding a value for `MinGainThreshold`: both are somewhat "arbitrary".

> So why exactly do we specialise less?

We specialize less because the default value of `MinGainThreshold` has been set high.



================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:112
+//
+// testname         | % diff
+// -----------------+-------------------
----------------
SjoerdMeijer wrote:
> Like I mentioned before, not sure what the value is of recording compile time numbers here as things are still in flux; i.e. I don't think this information will age very well.
That's why I am mentioning percentages and not absolute values, but okay, I'll remove this comment.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:132
 // to the actual transform helper functions.
-struct ArgInfo {
-  Function *Fn;         // The function to perform specialisation on.
-  Argument *Formal;     // The Formal argument being analysed.
-  Constant *Actual;     // A corresponding actual constant argument.
-  InstructionCost Gain; // Profitability: Gain = Bonus - Cost.
-
-  // Flag if this will be a partial specialization, in which case we will need
-  // to keep the original function around in addition to the added
-  // specializations.
-  bool Partial = false;
-
-  ArgInfo(Function *F, Argument *A, Constant *C, InstructionCost G)
-      : Fn(F), Formal(A), Actual(C), Gain(G){};
+struct Specialization {
+  SmallVector<ArgInfo, 8> Args; // Stores the {formal,actual} argument pairs.
----------------
SjoerdMeijer wrote:
> Nit: perhaps `SpecializationInfo`, to be a tiny bit more consistent (with ArgInfo) and descriptive.
The name `SpecializationInfo` is being used below:

```
using SpecializationInfo = SmallMapVector<CallBase *, Specialization, 8>;
```

I will change that one to `SpecializationMap` if that's okay.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:321
 
-      auto ConstArgs = calculateGains(F, Cost);
-      if (ConstArgs.empty()) {
+      SpecializationInfo Info = calculateGains(F, Cost);
+      if (Info.empty()) {
----------------
SjoerdMeijer wrote:
> Nit: `Info` could be more descriptive.
Will change it to `Specializations`.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:400
+  SpecializationInfo calculateGains(Function *F, InstructionCost Cost) {
+    SpecializationInfo Info;
     // Determine if we should specialize the function based on the values the
----------------
SjoerdMeijer wrote:
> Nit: `Info` could be more descriptive.
Will change it to `Specializations`.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:550
+    // instructions in the function.
+    return Metrics.NumInsts * InlineConstants::InstrCost;
   }
----------------
SjoerdMeijer wrote:
> This was fundamental to how the cost model used to work: the more functions were specialised, the harder that became due to the Penalty. So I disagree with this statement:
> 
> >  I've also lifted two arbitrary limitations around the specialization selection: the penalty in cost estimation ....
> 
> This was not arbitrary, but by design. 
> 
> 
Maybe "arbitrary" was the wrong vocabulary here. Sorry. What I mean is that it is not fair to bias the calculation of the cost of a given function based on historical data as "how many specializations have happend so far". As I explained on the description, a potential specialization may never trigger even if it is more profitable from one that did, just because it was discovered first. I could separate this change to another patch if it makes the review easier.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119880/new/

https://reviews.llvm.org/D119880



More information about the llvm-commits mailing list