[PATCH] D93838: [LLVM] [SCCP] : Add Function Specialization pass

Mon Apr 19 19:19:31 PDT 2021

ChuanqiXu added a comment.

In D93838#2695205 <https://reviews.llvm.org/D93838#2695205>, @SjoerdMeijer wrote:

> @ChuanqiXu: I have started looking at increased compile-times, and experimented with the biggest outlier you reported:
>
>   500.perlbench_r	27%
>
> In my experiment I think I see less than 1% compile time increase with this patch.
> Would you mind double checking this for me with the latest patch? Would be good to get a confirmation to make sure we are on the same page.

Yes, I will double-check this. Compile time is a key factor for this patch.

Here is my context information:
Hardware:

- Ampere
- Architecture:          aarch64
- CPU(s):                80
- Thread(s) per core:    1
- Core(s) per socket:    80
- Model:                 1
- CPU max MHz:           3000.0000
- CPU min MHz:           1000.0000
- L1d cache:             64K
- L1i cache:             64K
- L2 cache:              1024K
- Flags:                 fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs

Baseline options:

- COPTIMIZE: -flto=full
- CXXOPTIMIZE: -flto=full
- LDOPTIMIZE: -flto=full -fuse-ld=lld

Function Specialization:

- COPTIMIZE: -flto=full -mllvm -function-specialize-level=aggressive
- CXXOPTIMIZE: -flto=full -mllvm -function-specialize-level=aggressive
- LDOPTIMIZE: -flto=full -fuse-ld=lld -mllvm -function-specialize-level=aggressive

The compiler is a private version based on SPEC2017.

In D93838#2694124 <https://reviews.llvm.org/D93838#2694124>, @SjoerdMeijer wrote:

> I could have elaborated a bit more on my plans in my previous message, so let me add that here.
>
> I saw three parts to this work:
>
> 1. The first one was the SCCP refactoring that already went in,
> 2. This patch that adds the function specialisation framework,
> 3. The further fine-tuning and cost-modelling to get this enabled by default.
>
> In my opinion, the GCC results would justify getting 2 committed first, then work on 3. A few notes on this:
>
> - The requirement to get 2 committed I think is that we get compile-times under control,
> - We really would like to get this enabled by default. If we don't reach parity with GCC, this work has little value to us, and there is little point in doing this work.
>
> Like I said, that was my plan, but am interested to hear if you agree or you see things differently @fhahn, @ChuanqiXu. Please let me know.

Yes, I would prefer to turn this on by default if we can keep the compile time under control. I am looking into this pass because I saw this in `The present and future of Interprocedural Optimization`:
F16401663: image.png <https://reviews.llvm.org/F16401663>

It makes sense to me that function specialization has potential. I am also planning to enable function specialization in ThinLTO (by adjusting the logic we use to import functions) and to do function specialization based on value ranges (so we could do interprocedural value range propagation).

================
Comment at: llvm/lib/Transforms/IPO/SCCP.cpp:129
+  if (!runFunctionSpecialization(M, DL, GetTLI, GetTTI, GetAC, GetAnalysis,
+                                 true))
+    return PreservedAnalyses::all();
----------------
SjoerdMeijer wrote:
> @ChuanqiXu : please note the hard coded `true` here, which corresponds to `IsAggressive` boolean. So I think your timings were timing the aggressive mode. But anyway, I will upload one more intermediate diff with all sorts of pass manager stuff fixed, and then will start doing some timing too.
Yes, I was timing the aggressive mode only.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93838/new/

https://reviews.llvm.org/D93838