[PATCH] D93838: [SCCP] Add Function Specialization pass

Thu Apr 29 02:44:24 PDT 2021

SjoerdMeijer added a comment.

In D93838#2725047 <https://reviews.llvm.org/D93838#2725047>, @ChuanqiXu wrote:

> In D93838#2724968 <https://reviews.llvm.org/D93838#2724968>, @ChuanqiXu wrote:
>
>> I just ran the newest revision with SPEC2017 intrate (without 548.exchange2_r). The default iteration limit is one. The results are below, and overall they look good to me.
>>
>> Performance:
>> I observed that 505.mcf_r gets a 10% improvement, which is consistent with the previous experiment.
>> I found neither an improvement nor a regression for 520.omnetpp_r. We need to explore it further.
>> There are no observable changes for the other benchmarks.
>>
>> Compile-time:
>>
>> | benchmark       | compile-time change with limiting 1 iteration | Note                                                    |
>> | --------------- | --------------------------------------------- | ------------------------------------------------------- |
>> | 500.perlbench_r | 2%                                            |                                                         |
>> | 502.gcc_r       | 6%                                            |                                                         |
>> | 505.mcf_r       | 19%                                           | The total compile time for 505.mcf_r is relatively fast |
>> | 520.omnetpp_r   | 3%                                            |                                                         |
>> | 523.xalancbmk_r | 3%                                            |                                                         |
>>
>> No observable changes for other benchmarks.
>>
>> Code Sizes:
>>
>> | benchmark       | Code Size change with limiting 1 iteration |
>> | --------------- | ------------------------------------------ |
>> | 505.mcf_r       | 14%                                        |
>> | 523.xalancbmk_r | 2%                                         |
>
> Note for others who want to reproduce the results: I tried running 505.mcf_r on x86-64 and found no observable improvement. The earlier experiment was done on AArch64. Is it related to the hardware? Or is it related to the architecture?
>
> In D93838#2725038 <https://reviews.llvm.org/D93838#2725038>, @xbolva00 wrote:
>
>>>> 505.mcf_r
>>
>> Interesting, did you analyze it more? What is the reason for such an improvement? Additional vectorization?
>
> I haven't looked into the details. I will do that if possible.

Many thanks for the new numbers!
I was struggling with my setup yesterday (not with building it, but with running it and observing the uplift). I am going to give it another try today.

>> Interesting, did you analyze it more? What is the reason for such an improvement? Additional vectorization?
>
> I haven't looked into the details. I will do that if possible.

I would need to double check, but I think the motivating example is included as a regression test (llvm/test/Transforms/FunctionSpecialization/function-specialization.ll). It shows that specialisation eventually results in inlining and a lot of simplifications.
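
To make the "specialisation, then inlining, then simplification" chain concrete, here is a minimal hand-written C sketch. It is not the regression test itself, and all names (`compute`, `plus`, `minus`, `compute_plus`, `compute_minus`) are hypothetical. The two `compute_*` clones are written by hand to illustrate roughly what a specialised function could look like; they are not actual output of the pass.

```c
/* Small callees that are only ever passed around as function pointers. */
static int plus(int x)  { return x + 1; }
static int minus(int x) { return x - 1; }

/* Generic version: the indirect call through `op` cannot be inlined
 * while `op` is an unknown argument. */
static int compute(int x, int (*op)(int)) {
    return op(x) * 2;
}

/* Hand-written stand-ins for specialised clones (assumption: one clone per
 * constant function-pointer argument). The call is now direct, so it can be
 * inlined and folded to (x + 1) * 2 and (x - 1) * 2 respectively. */
static int compute_plus(int x)  { return plus(x) * 2; }
static int compute_minus(int x) { return minus(x) * 2; }

/* Before: two calls with different constant function-pointer arguments. */
int caller_generic(int x) {
    return compute(x, plus) + compute(x, minus);
}

/* After: calls are redirected to the clones. Once inlined, the body is
 * (x + 1) * 2 + (x - 1) * 2, which simplifies to 4 * x. */
int caller_specialised(int x) {
    return compute_plus(x) + compute_minus(x);
}
```

Both callers return the same value for any input that does not overflow, for example 12 for `x = 3`. The difference is that the specialised form gives the optimiser direct calls to work with.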

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D93838/new/

https://reviews.llvm.org/D93838