[PATCH] D93838: [SCCP] Add Function Specialization pass

Thu May 13 03:41:21 PDT 2021

SjoerdMeijer added a comment.

Re: compile-times, I finally managed to get the numbers out.

| Timing report (in seconds)           | LTO    | LTO + func spec | Diff                            |
| ------------------------------------ | ------ | --------------- | ------------------------------- |
| Pass execution 1                     | 0.5054 | 0.7222          | +43%                            |
| Pass execution 2                     | 0.4992 | 0.6274          | +26%                            |
| DWARF Emission                       | 0.0399 | 0.0304          | Not interesting, probably noise |
| Register Allocation                  | 0.0381 | 0.0433          | Not interesting, probably noise |
| Instruction Selection and Scheduling | 0.1256 | 0.0982          | Not interesting, probably noise |
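For reference, the `Diff` column can be reproduced from the two timing columns. This is only a small sketch of that arithmetic; the helper name `percent_change` is made up for illustration and is not part of the patch.

```python
# Timings in seconds, copied from the table above: (LTO, LTO + func spec).
timings = {
    "Pass execution 1": (0.5054, 0.7222),
    "Pass execution 2": (0.4992, 0.6274),
}

def percent_change(before, after):
    """Relative change of `after` with respect to `before`, in percent."""
    return (after - before) / before * 100.0

for name, (lto, lto_spec) in timings.items():
    delta = lto_spec - lto  # absolute increase in seconds
    print(f"{name}: {delta:+.4f} s ({percent_change(lto, lto_spec):+.0f}%)")
```

Printing the absolute delta next to the percentage makes it easier to see how small the increase is in wall-clock terms.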

These percentages look bad, but the absolute numbers are small (roughly 0.22 s and 0.13 s extra). Zooming in on the `Pass execution 1` regression, the top 3 compile-time consumers before were:

  0.1116 ( 23.0%)   0.0000 (  0.0%)   0.1116 ( 22.1%)   0.1116 ( 22.1%)  InstCombinePass
  0.0560 ( 11.5%)   0.0001 (  0.6%)   0.0561 ( 11.1%)   0.0560 ( 11.1%)  IndVarSimplifyPass
  0.0380 (  7.8%)   0.0037 ( 19.1%)   0.0417 (  8.3%)   0.0417 (  8.3%)  GVN

And after, with this patch applied:

  0.1590 ( 22.9%)   0.0000 (  0.1%)   0.1590 ( 22.0%)   0.1590 ( 22.0%)  InstCombinePass
  0.0699 ( 10.1%)   0.0006 (  2.1%)   0.0705 (  9.8%)   0.0705 (  9.8%)  GVN
  0.0670 (  9.6%)   0.0034 ( 12.2%)   0.0704 (  9.7%)   0.0703 (  9.7%)  IndVarSimplifyPass

Function specialisation by itself is not expensive; it comes in 7th place with:

  0.0346 (  5.0%)   0.0000 (  0.0%)   0.0346 (  4.8%)   0.0346 (  4.8%)  FunctionSpecializationPass

Looking at `Pass Execution 2`, the top 3 consumers before were:

  0.1270 ( 29.4%)   0.0115 ( 17.3%)   0.1386 ( 27.8%)   0.1386 ( 27.8%)  AArch64 Instruction Selection
  0.0241 (  5.6%)   0.0438 ( 65.8%)   0.0679 ( 13.6%)   0.0679 ( 13.6%)  AArch64 Assembly Printer
  0.0579 ( 13.4%)   0.0001 (  0.1%)   0.0580 ( 11.6%)   0.0579 ( 11.6%)  Loop Strength Reduction   

And after:

  0.1432 ( 27.3%)   0.0360 ( 34.9%)   0.1792 ( 28.6%)   0.1792 ( 28.6%)  AArch64 Instruction Selection
  0.0336 (  6.4%)   0.0556 ( 53.8%)   0.0891 ( 14.2%)   0.0891 ( 14.2%)  AArch64 Assembly Printer
  0.0736 ( 14.0%)   0.0024 (  2.3%)   0.0760 ( 12.1%)   0.0759 ( 12.1%)  Loop Strength Reduction

My conclusion from this is that it's not just one thing: function specialisation by itself is only a minor contributor, and the extra compile time is spent in existing passes without anything really standing out.
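The "minor contributor" claim can be checked against the numbers in this comment. The sketch below parses the `FunctionSpecializationPass` row and compares its wall time with the total `Pass execution 1` increase. The helper `parse_row` is hypothetical, and the column order (user, system, user+system, wall, name) is an assumption about the timing output.

```python
import re

# One row copied from the report. The column order (user, system,
# user+system, wall, name) is an assumption about the timing output.
ROW = "  0.0346 (  5.0%)   0.0000 (  0.0%)   0.0346 (  4.8%)   0.0346 (  4.8%)  FunctionSpecializationPass"

def parse_row(line):
    """Return (wall_seconds, wall_percent, pass_name) for one timing row."""
    # Each cell looks like "0.0346 (  4.8%)": seconds, then share of the total.
    cells = re.findall(r"(\d+\.\d+)\s*\(\s*(\d+\.\d+)%\)", line)
    # The pass name is whatever follows the last closing parenthesis.
    name = line.rsplit(")", 1)[1].strip()
    wall_seconds, wall_percent = (float(x) for x in cells[3])
    return wall_seconds, wall_percent, name

wall, percent, name = parse_row(ROW)
regression = 0.7222 - 0.5054  # "Pass execution 1": with func spec minus plain LTO
share = wall / regression * 100.0
print(f"{name}: {wall:.4f} s, about {share:.0f}% of the {regression:.4f} s increase")
```

Under these assumptions the pass itself accounts for only a small fraction of the increase, and the rest is spread over existing passes.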

So, how about this as an initial commit, @sanwou01, @dmgreen, @fhahn? Iterating on the current limitations is more convenient with this in tree.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93838/new/

https://reviews.llvm.org/D93838