[cfe-dev] [llvm-dev] llvm and clang are getting slower
Hal Finkel via cfe-dev
cfe-dev at lists.llvm.org
Wed Mar 9 06:28:29 PST 2016
----- Original Message -----
> From: "Daniel Berenyi via cfe-dev" <cfe-dev at lists.llvm.org>
> To: cfe-dev at lists.llvm.org
> Sent: Wednesday, March 9, 2016 2:33:15 AM
> Subject: Re: [cfe-dev] [llvm-dev] llvm and clang are getting slower
> Dear All,
> One bit got my attention in this discussion:
> "if an optimization increases compile time by 5% or increases code
> size by 5% for a particular benchmark, that benchmark should also be
> one which sees a 5% runtime improvement"
> In some cases (e.g. distributed scientific computing, but I could name
> a few more) we'd be happy to trade 2-4x more compile time for a 5%
> runtime improvement, because the compile time is on the order of
> minutes to tens of minutes while the runtime is measured in days,
> weeks, or months. Multiply that by the number of cores in the cluster
> (tens, hundreds, ...), and it boils down to potentially wasted time
> and electricity (money and money).
> So, for us, we'd be glad to have a switch that enables even the more
> expensive optimization passes, basically expressing the following:
> "Dear compiler, I give you all the resources you may need, just
> optimize whatever you can in this code as much as you can".
Agreed. The metric you quoted is the best description we've come up with for -O2. -O3 is more aggressive.
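For what it's worth, some of this resource-for-runtime trade is already expressible with existing flags. A rough sketch (the -mllvm options are internal LLVM cutoffs whose names, defaults, and behavior can change between releases, and the threshold values here are illustrative, not recommendations):

```shell
# Standard aggressive optimization, plus link-time optimization and
# tuning for the build machine's CPU:
clang -O3 -flto -march=native -c compute.c

# Raising internal cutoffs makes the existing passes work harder, e.g.
# inlining and loop unrolling; this is exactly the "raise the cutoffs"
# approach discussed below, with its diminishing returns:
clang -O3 -mllvm -inline-threshold=1000 -mllvm -unroll-threshold=750 -c compute.c
```

Note that this only turns existing knobs; it does not enable any fundamentally different (more expensive) algorithms.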
There are, however, at least two complicating factors:
1. Even in HPC, we have code bases with many millions of lines that already take hours to compile. Significantly increasing compile time is a hard sell under these circumstances too.
2. Because some compiler algorithms are super-linear, even letting the compiler work harder (i.e., raising current cutoffs) runs into quickly diminishing returns. Not that we can't profitably do more here, but we might not realistically be able to do much more. Many of the things a compiler does come down to heuristically approximating the solution to some NP-complete problem (register allocation via graph coloring is a classic example). Thus, regardless of the resources provided, we'll never consistently produce an optimal answer (even if our cost modeling were completely accurate, which it is not).

IMHO, moving forward in this direction means using better algorithms, and/or an additional set of optimizations, for an "HPC" (e.g. -O4) mode. These would be algorithms and optimizations that consistently produce better results than our current ones, but with a static cost high enough to make them inappropriate in a general-purpose setting. I don't think anyone has yet produced a compelling argument that this is necessary or desirable for any specific set of thresholds/algorithms/optimizations.
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory