[LLVMdev] Performance regression in the LiveIntevals phase
Hal Finkel
hfinkel at anl.gov
Thu Oct 9 09:17:26 PDT 2014
----- Original Message -----
> From: "Vaidas Gasiunas" <vaidas.gasiunas at sap.com>
> To: "LLVM Dev (llvmdev at cs.uiuc.edu)" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, October 9, 2014 4:37:39 AM
> Subject: [LLVMdev] Performance regression in the LiveIntevals phase
>
> Some time ago we reported a compile-time performance regression in
> the LiveIntervals analysis pass (see
> http://llvm.org/bugs/show_bug.cgi?id=18580 ). We detected it at
> first after migrating from LLVM 3.1 to 3.3, but the problem persists
> also in 3.5. This regression is especially critical when compiling
> long functions. In one of our benchmarks compile time goes from 200s
> (in 3.1) up to 1500s (in 3.3).
>
>
>
> We also managed to reproduce the regression with a C++ example using
> clang. Here are the instructions:
>
>
>
> Generate the example C++ file with the Perl script attached to the
> Bug (5000 is a good size parameter to clearly see the regression):
>
> perl create.pl example.cpp 5000
>
> Generate LLVM IR code for the C++ file (the regression is in the
> backend functionality, so we don't want to measure frontend
> compilation)
>
> llvm31/clang++ example.cpp -c -emit-llvm -o example.ir-31
>
> llvm35/clang++ example.cpp -c -emit-llvm -o example.ir-35
>
>
>
> Now you can run the actual performance measurements using llc:
>
> llvm31/llc -cppgen=program -o=5000.asm example.ir-31
>
> llvm35/llc -cppgen=program -o=5000.asm example.ir-35
>
> or you can also compile example.ir-31 with llvm3.5:
>
> llvm35/llc -cppgen=program -o=5000.asm example.ir-31
>
>
>
> Our analysis had shown that the most of time is taken for generation
> of live intervals for physical registers. In the biggest example,
> the live interval of one of the physical registers consists of about
> 500.000 live ranges. Insertions in the middle of the array of live
> ranges cause quadratic algorithmic complexity, which is apparently
> the main reason for the slow-down. I changed the implementation of
> computeRegUnitInterval() so that it uses a set instead of the array,
> and in this way managed to reduce the execution time of the
> computeRegUnitInterval from 1200s down to about 1s. The fix is a bit
> ugly, however, because I cannot completely switch to the set, since
> further analyses are more efficient on the array. For that reason, I
> flush the contents of the set into the array at the end of
> computeRegUnitInterval. I also had to rewrite various operations on
> the live interval so that they use the set structure if it is
> available and the array otherwise.
>
> If there is an interest, I can port the patch to the dev version of
> llvm, but maybe someone has a better idea of how to deal with the
> regression.
I think this would be interesting, please do. There are a few places in LLVM where we currently keep dual data structures, and needing another one is not necessarily bad (we have SetVector, for example).
Thanks!
-Hal
>
>
>
> Regards,
>
> Vaidas
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
More information about the llvm-dev
mailing list