[llvm] [FuncSpec] Improve accounting of specialization codesize growth (PR #113448)

Mon Oct 28 04:58:14 PDT 2024

================
@@ -759,6 +771,14 @@ bool FunctionSpecializer::run() {
   SmallVector<Function *> Clones;
   for (unsigned I = 0; I < NSpecs; ++I) {
     Spec &S = AllSpecs[BestSpecs[I]];
+
+    // Check that creating this specialization doesn't exceed the maximum
+    // codesize growth.
+    unsigned FuncSize = getCostValue(FunctionMetrics[S.F].NumInsts);
+    if ((FunctionGrowth[S.F] + S.CodeSizeCost) / FuncSize > MaxCodeSizeGrowth)
+      continue;
----------------
hazzlim wrote:

Ah ok that makes a lot of sense, thank you for providing the additional context.

I have updated the code so that the codesize increase is only checked inside the `isProfitable` lambda, while the accumulation into `FunctionGrowth[S.F]` happens at the point where we actually create the specialization. The downside is that we can violate `MaxCodeSizeGrowth` within a single iteration: candidates that each pass the check in the `isProfitable` lambda individually may, in combination, exceed the threshold once they are all performed. This is why I have removed the second test case from the regression test, as it no longer makes sense.

However, I think this is an acceptable trade-off, because it still preserves the intention of the `MaxCodeSizeGrowth` threshold, which is to prevent linear codesize increase across multiple iterations. Even if the threshold is exceeded in one iteration, we would not consider further specializations in successive iterations once the growth has been accounted for, and we have `MaxClones` to limit the number of specializations per iteration, as you say.
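
To make the multi-iteration behaviour concrete, here is a minimal standalone sketch of the accounting described above. It is not the patch itself: `GrowthTracker` and `runIteration` are hypothetical names, and growth is modelled as a plain percentage of the original function size rather than via `InstructionCost`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the per-function growth bookkeeping; not LLVM's API.
struct GrowthTracker {
  unsigned MaxGrowthPct;     // stand-in for MaxCodeSizeGrowth, in percent
  unsigned CommittedPct = 0; // stand-in for FunctionGrowth[F], in percent

  explicit GrowthTracker(unsigned Max) : MaxGrowthPct(Max) {
    assert(Max > 0 && "growth budget must be positive");
  }

  // Mirrors the check in the isProfitable lambda: only growth committed by
  // earlier specializations is taken into account.
  bool isWithinBudget(unsigned CandidatePct) const {
    return CommittedPct + CandidatePct <= MaxGrowthPct;
  }

  // Called at the point where a specialization is actually created.
  void commit(unsigned CandidatePct) { CommittedPct += CandidatePct; }
};

// One iteration: every candidate is checked first, then all accepted
// candidates are created and their growth is committed.
std::size_t runIteration(GrowthTracker &T,
                         const std::vector<unsigned> &CandidatePcts) {
  std::vector<unsigned> Accepted;
  for (unsigned Pct : CandidatePcts)
    if (T.isWithinBudget(Pct))
      Accepted.push_back(Pct);
  for (unsigned Pct : Accepted)
    T.commit(Pct);
  return Accepted.size();
}
```

With a 100% budget, an iteration offering two 60% candidates accepts both and commits 120%, which overshoots the threshold once. Any later iteration then rejects every candidate, so the growth does not keep increasing.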

We could fix the potential violation of `MaxCodeSizeGrowth` within a single iteration by moving the accumulation into `FunctionGrowth[S.F]` to the end of the `isProfitable` lambda, as you suggested. However, I don't really like this, because we may end up discounting the most profitable specialization candidate for a function. To clarify my previous comment: by the Nth and (N+1)th candidates I meant candidates within a single iteration. If we accumulate the codesize growth while we are analyzing the candidates, we may use up `MaxCodeSizeGrowth` before encountering the best specialization and therefore fail to consider it:

```
// (Within a single iteration, MaxCodeSizeGrowth=100%)
spec1 [Score 500, Growth 50%]
spec2 [Score 500, Growth 50%]
// After examining the above, FunctionGrowth[S.F] = 100% so no more candidates will be considered profitable!
// Candidate with highest Score:
spec3 [Score 1000, Growth 10%]
```
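
The same scenario can be written as a small standalone comparison of the two strategies. This is only a sketch under simplifying assumptions: `Candidate`, `Selection`, `selectEager` and `selectDeferred` are hypothetical names, growth is a plain percentage, and ranking is a stable sort by score followed by a cut at a `MaxClones`-style limit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a specialization candidate (not LLVM's Spec).
struct Candidate {
  std::string Name;
  unsigned Score;
  unsigned GrowthPct; // growth as a percentage of the original function size
};

struct Selection {
  std::vector<std::string> Names;
  unsigned GrowthPct; // total growth contributed by the selected candidates
};

// Accumulate while analyzing: the budget is consumed in discovery order.
Selection selectEager(const std::vector<Candidate> &Cands,
                      unsigned MaxGrowthPct) {
  Selection S{{}, 0};
  for (const Candidate &C : Cands) {
    if (S.GrowthPct + C.GrowthPct > MaxGrowthPct)
      continue; // budget already used up by earlier candidates
    S.GrowthPct += C.GrowthPct;
    S.Names.push_back(C.Name);
  }
  return S;
}

// Check each candidate on its own, rank by score, keep the best MaxClones,
// and only then accumulate the growth of what is actually created.
Selection selectDeferred(const std::vector<Candidate> &Cands,
                         unsigned MaxGrowthPct, std::size_t MaxClones) {
  assert(MaxClones > 0 && "MaxClones must be positive");
  std::vector<Candidate> Profitable;
  for (const Candidate &C : Cands)
    if (C.GrowthPct <= MaxGrowthPct)
      Profitable.push_back(C);
  std::stable_sort(Profitable.begin(), Profitable.end(),
                   [](const Candidate &A, const Candidate &B) {
                     return A.Score > B.Score;
                   });
  if (Profitable.size() > MaxClones)
    Profitable.erase(Profitable.begin() + static_cast<std::ptrdiff_t>(MaxClones),
                     Profitable.end());
  Selection S{{}, 0};
  for (const Candidate &C : Profitable) {
    S.GrowthPct += C.GrowthPct;
    S.Names.push_back(C.Name);
  }
  return S;
}
```

For the three candidates above and a 100% budget, `selectEager` keeps spec1 and spec2 and never accepts spec3. `selectDeferred` with a limit of 2 keeps spec3 and spec1 for 60% growth, while a limit of 3 keeps all three for 110% growth, which is the single-iteration overshoot mentioned earlier.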

https://github.com/llvm/llvm-project/pull/113448