[PATCH] D119880: [FuncSpec] Support function specialization across multiple arguments.

Sjoerd Meijer via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 22 01:30:41 PST 2022


SjoerdMeijer added a comment.

Sorry for the delay. First, a question about refactoring: could the reshuffle of `ArgInfo` and `SpecialisationInfo` and the changes in the Solver functions perhaps be split out as an NFC change?

But more importantly, rereading the description, I disagree with these statements:

> I've also lifted two arbitrary limitations around the specialization selection: the penalty in cost estimation for newly discovered functions, and the truncation of clones for a given function.

These are not arbitrary restrictions; they were by design, as I mentioned in an inline comment. For the former, the penalty was fundamental to how the cost model worked. For the latter, the candidates were first sorted on profitability, and then the candidates with the least gain were discarded. Again, both were by design, to manage the number of specialisations and compile times. Also, this patch introduces an arbitrary heuristic of its own: `MinGainThreshold`.
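To spell out the "sort then truncate" design, here is a minimal sketch in LLVM-style C++ (not a quote of the actual pass code; `Candidates` is illustrative, `MaxClonesThreshold` is the existing clone cap):

```
// Keep only the most profitable candidates, so that the number of
// clones, and with it compile time, stays bounded.
llvm::sort(Candidates, [](const ArgInfo &L, const ArgInfo &R) {
  return L.Gain > R.Gain; // Most profitable first.
});
if (Candidates.size() > MaxClonesThreshold)
  Candidates.resize(MaxClonesThreshold); // Drop the least profitable ones.
```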

The real arbitrary restriction of the current implementation is that we only specialise the first argument. That is the restriction we should lift first, and it should of course be based on profitability. That's why I feel this patch is doing too much at the same time: some restructuring, changes to how the cost model works, and specialising multiple arguments. So my suggestion/question is whether it would be possible to split things up accordingly; see also the sketch below. Does that make sense?
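To illustrate the restriction I mean, the current implementation conceptually does something like this (a rough sketch, not the actual code, though `isArgumentInteresting` is an existing helper in the pass):

```
for (Argument &Formal : F->args()) {
  if (!isArgumentInteresting(&Formal))
    continue;
  // ... gather the constant actual arguments, compute the Gain ...
  break; // <- the real restriction: only one argument is ever specialised.
}
```

Lifting this means collecting candidates across all arguments and ranking them on profitability.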

In D119880#3327657 <https://reviews.llvm.org/D119880#3327657>, @labrinea wrote:

> Here are a few more statistics from comparing this patch to the current implementation of function specialization:
>
> This is from compiling the llvm-test-suite at -O3 under perf with a release build (no asserts) targeting x86. The metric is instruction count (average of three)
>
> | **test name**    | **%delta**         |
> | ---------------- | ------------------ |
> | ClamAV           | +0.009545546536911 |
> | 7zip             | -0.001629518928931 |
> | tramp3d-v4       | -0.046465647871192 |
> | kimwitu++        | +0.011940454030694 |
> | sqlite3          | -0.158695422048798 |
> | mafft            | -0.014463100189515 |
> | lencod           | -0.020921880121996 |
> | SPASS            | -0.047946880831827 |
> | Bullet           | -0.003464312699035 |
> | consumer-typeset | -0.008383706273952 |
>
> geomean = -0.0280598%
>
> This is from compiling/running the llvm-test-suite at -O3 targeting AArch64 with statistics.
>
> | **test name**                                 | **num specializations before** | **num specializations after** |
> | --------------------------------------------- | ------------------------------ | ----------------------------- |
> | MultiSource/Applications/ClamAV/clamscan.test | 3                              | 0                             |
> | MultiSource/Applications/d/make_dparser.test  | 1                              | 0                             |
> | MultiSource/Applications/oggenc/oggenc.test   | 2                              | 2                             |
> | MultiSource/Applications/sqlite3/sqlite3.test | 3                              | 0                             |

So why exactly do we specialise less?



================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:112
+//
+// testname         | % diff
+// -----------------+-------------------
----------------
As I mentioned before, I'm not sure what the value is of recording compile-time numbers here while things are still in flux; i.e. I don't think this information will age very well.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:132
 // to the actual transform helper functions.
-struct ArgInfo {
-  Function *Fn;         // The function to perform specialisation on.
-  Argument *Formal;     // The Formal argument being analysed.
-  Constant *Actual;     // A corresponding actual constant argument.
-  InstructionCost Gain; // Profitability: Gain = Bonus - Cost.
-
-  // Flag if this will be a partial specialization, in which case we will need
-  // to keep the original function around in addition to the added
-  // specializations.
-  bool Partial = false;
-
-  ArgInfo(Function *F, Argument *A, Constant *C, InstructionCost G)
-      : Fn(F), Formal(A), Actual(C), Gain(G){};
+struct Specialization {
+  SmallVector<ArgInfo, 8> Args; // Stores the {formal,actual} argument pairs.
----------------
Nit: perhaps `SpecializationInfo`, to be a tiny bit more consistent (with `ArgInfo`) and descriptive.
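I.e., something along these lines (a sketch, assuming the struct keeps a `Gain` field like the old `ArgInfo` did):

```
struct SpecializationInfo {
  SmallVector<ArgInfo, 8> Args; // Stores the {formal, actual} argument pairs.
  InstructionCost Gain = 0;     // Profitability: Gain = Bonus - Cost.
};
```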


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:321
 
-      auto ConstArgs = calculateGains(F, Cost);
-      if (ConstArgs.empty()) {
+      SpecializationInfo Info = calculateGains(F, Cost);
+      if (Info.empty()) {
----------------
Nit: `Info` could be more descriptive.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:400
+  SpecializationInfo calculateGains(Function *F, InstructionCost Cost) {
+    SpecializationInfo Info;
     // Determine if we should specialize the function based on the values the
----------------
Nit: `Info` could be more descriptive.


================
Comment at: llvm/lib/Transforms/IPO/FunctionSpecialization.cpp:550
+    // instructions in the function.
+    return Metrics.NumInsts * InlineConstants::InstrCost;
   }
----------------
This was fundamental to how the cost model used to work: the more functions were specialised, the harder further specialisation became, due to the Penalty. So I disagree with this statement:

>  I've also lifted two arbitrary limitations around the specialization selection: the penalty in cost estimation ....

This was not arbitrary, but by design. 
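For reference, the removed logic was along these lines (from memory, so only roughly):

```
// The cost of a specialisation grew with the number of functions already
// specialised, deliberately throttling further specialisation:
unsigned Penalty = NbFunctionsSpecialized + 1;
return Metrics.NumInsts * InlineConstants::InstrCost * Penalty;
```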




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119880/new/

https://reviews.llvm.org/D119880


