[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status

Jack Howarth howarth at bromo.med.uc.edu
Fri Jun 10 07:00:36 PDT 2011


On Thu, Jun 09, 2011 at 08:47:26PM -0400, Jack Howarth wrote:
> Duncan,
>     Here are the complete benchmarks rerun against gcc 4.5.4 built with...
> 
> Using built-in specs.
> COLLECT_GCC=gfortran-fsf-4.5
> COLLECT_LTO_WRAPPER=/sw/lib/gcc4.5/libexec/gcc/x86_64-apple-darwin11.0.0/4.5.4/lto-wrapper
> Target: x86_64-apple-darwin11.0.0
> Configured with: ../gcc-4.5.4/configure --prefix=/sw --prefix=/sw/lib/gcc4.5 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.5/info --enable-languages=c,c++,fortran,objc,obj-c++,java --with-gmp=/sw --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.5 --enable-lto --enable-checking=release
> Thread model: posix
> gcc version 4.5.4 20110608 (prerelease) (GCC) 
> 
> x86_64 darwin 
> 
> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize 
> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
> D) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns -fplugin-arg-dragonegg-llvm-ir-optimize=2
> E) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-llvm-ir-optimize=2
> 
> Run Time (seconds)
> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/   D) gcc 4.5.4/       E) gcc 4.5.4/ 
>               gcc 4.5.4   dragonegg/optzns    dragonegg    dragonegg/optzns/   dragonegg/optimize=2
>                                                            optimize=2
> 
> ac             9.58          9.11             12.28          9.12               12.73  
> aermod        20.99         16.18             17.86         16.30               17.89 
> air            6.06          6.58              7.69          6.51                7.64 
> capacita      35.76         39.86             46.10         39.58               45.89 
> channel        2.03          2.04              1.96          2.04                1.96
> doduc         28.16         28.50             30.34         28.53               30.42
> fatigue        8.12          7.09             10.34          7.06               10.25
> gas_dyn       10.16          9.92             11.67          9.96               11.81
> induct        20.14         20.76             48.75         20.78               48.75
> linpk         15.43         15.41             15.64         15.41               15.64
> mdbx          11.41         11.72             12.11         11.72               12.07 
> nf            27.90         28.52             29.26         28.42               29.13
> protein       38.65         38.72             41.31         38.75               39.49
> rnflow        27.22         28.18             31.81         28.15               31.98 
> test_fpu      11.49         11.23             11.57         11.17               11.52
> tfft           1.91          1.95              2.15          1.95                2.16
> 
> Mean          12.72         12.60             14.73         12.59               14.72
> 
> Compile Time (seconds)
> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/   D) gcc 4.5.4/       E) gcc 4.5.4/
>               gcc 4.5.4   dragonegg/optzns    dragonegg    dragonegg/optzns/   dragonegg/optimize=2
>                                                            optimize=2
> 
> ac             0.86          0.44             0.31          0.41                0.28
> aermod        31.13         25.81            20.94         25.44               20.87
> air            1.74          1.48             0.81          1.46                0.78 
> capacita       0.86          0.74             0.44          0.71                0.42
> channel        0.35          0.32             0.23          0.30                0.23
> doduc          3.08          2.63             1.63          2.60                1.58
> fatigue        1.04          1.05             0.89          0.90                0.70
> gas_dyn        0.94          0.94             0.75          0.84                0.62
> induct         3.30          2.52             1.84          2.36                1.66
> linpk          0.33          0.28             0.20          0.28                0.20
> mdbx           1.09          1.02             0.60          0.99                0.59
> nf             0.41          0.40             0.28          0.40                0.28
> protein        1.56          1.28             0.98          1.21                0.82
> rnflow         1.75          1.70             1.24          1.61                1.13 
> test_fpu       1.38          1.41             1.05          1.31                0.95
> tfft           0.31          0.28             0.19          0.28                0.19

mean             3.13          2.64             2.02          2.57                1.96

Duncan,
    These numbers were from release builds for both FSF gcc 4.5.4 and llvm. It seems that
-fplugin-arg-dragonegg-llvm-ir-optimize=2 gives a small reduction in compile time that partly
offsets the increased compile time from -fplugin-arg-dragonegg-enable-gcc-optzns
at -O3 -ffast-math. It also appears that with -fplugin-arg-dragonegg-llvm-ir-optimize=2,
the addition of -fplugin-arg-dragonegg-enable-gcc-optzns accounts for 24% of the mean compile time
(2.57 vs. 1.96 seconds) with -O3 -ffast-math (which is very close to the 23% seen
without -fplugin-arg-dragonegg-llvm-ir-optimize=2, 2.64 vs. 2.02 seconds). We should rebenchmark pb05 with -O2 -ffast-math
to see if -fplugin-arg-dragonegg-enable-gcc-optzns has the same impact on compile times.
IMHO, if -fplugin-arg-dragonegg-enable-gcc-optzns has less effect at -O2, it might make sense
to turn -fplugin-arg-dragonegg-enable-gcc-optzns on by default in dragonegg. That is, if the compile-time
regressions are mainly at -O3, that would be tolerable because the run time of the resulting binaries
should matter more there.
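
As a rough check of the percentages above, here is a minimal Python sketch that
recomputes the mean compile times from the table and expresses the optzns gap two
ways: as a share of the slower (optzns) build and as an increase over the faster
build. The function and variable names are illustrative only and are not part of
dragonegg or the pb05 harness; the timings are the only thing taken from the table.

```python
# Compile times in seconds, copied from the Compile Time table above.
# Column order: B (optzns), C (plain), D (optzns + optimize=2), E (optimize=2).
compile_times = {
    "ac":       (0.44, 0.31, 0.41, 0.28),
    "aermod":   (25.81, 20.94, 25.44, 20.87),
    "air":      (1.48, 0.81, 1.46, 0.78),
    "capacita": (0.74, 0.44, 0.71, 0.42),
    "channel":  (0.32, 0.23, 0.30, 0.23),
    "doduc":    (2.63, 1.63, 2.60, 1.58),
    "fatigue":  (1.05, 0.89, 0.90, 0.70),
    "gas_dyn":  (0.94, 0.75, 0.84, 0.62),
    "induct":   (2.52, 1.84, 2.36, 1.66),
    "linpk":    (0.28, 0.20, 0.28, 0.20),
    "mdbx":     (1.02, 0.60, 0.99, 0.59),
    "nf":       (0.40, 0.28, 0.40, 0.28),
    "protein":  (1.28, 0.98, 1.21, 0.82),
    "rnflow":   (1.70, 1.24, 1.61, 1.13),
    "test_fpu": (1.41, 1.05, 1.31, 0.95),
    "tfft":     (0.28, 0.19, 0.28, 0.19),
}


def column_means(rows):
    """Arithmetic mean of each column across all benchmarks."""
    columns = list(zip(*rows.values()))
    return [sum(col) / len(col) for col in columns]


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


mean_b, mean_c, mean_d, mean_e = column_means(compile_times)

# Gap caused by enable-gcc-optzns, as a share of the slower (optzns) build.
share_with_opt2 = percent(mean_d - mean_e, mean_d)
share_without_opt2 = percent(mean_b - mean_c, mean_b)

# The same gap, as an increase over the faster build (no optzns).
increase_with_opt2 = percent(mean_d - mean_e, mean_e)
increase_without_opt2 = percent(mean_b - mean_c, mean_c)

print(f"means: B={mean_b:.2f} C={mean_c:.2f} D={mean_d:.2f} E={mean_e:.2f}")
print(f"share of slower build:  {share_with_opt2:.0f}% / {share_without_opt2:.0f}%")
print(f"increase over faster:   {increase_with_opt2:.0f}% / {increase_without_opt2:.0f}%")
```

With these inputs the gap works out to roughly 24% and 23% of the optzns builds'
mean compile time, or about 31% in both cases when taken as an increase over the
builds without the flag. Note that these are arithmetic means, so aermod dominates
the result.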
            Jack


> 
> Executable Size (bytes)
> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/   D) gcc 4.5.4/       E) gcc 4.5.4/
>               gcc 4.5.4   dragonegg/optzns    dragonegg    dragonegg/optzns/   dragonegg/optimize=2
>                                                            optimize=2
> 
> ac              26344       30896             26704          30896              26824
> aermod        1145924     1043816           1052056        1027680            1031880 
> air             57404       57700             53532          53556              53532
> capacita        40864       41008             37064          41008              37064
> channel         22448       22664             22664          22664              22664 
> doduc          127340      124108            120124         124372             120484
> fatigue         61152       65352             65664          61256              61568
> gas_dyn        647864       58768             59024          54672              54960
> induct         162360      180440            175312         168304             163176
> linpk           18112       18848             18864          18848              18896
> mdbx            53464       57652             49516          57652              49516
> nf              22560       23784             24080          23784              24080
> protein         74320       74440             74816          70344              66624
> rnflow          66040       71488             71648          67416              67616
> test_fpu        52624       58224             58320          54128              54256
> tfft            18416       18456             18600          18456              18600
> 
> 


