[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status

Duncan Sands baldrick at free.fr
Fri Jun 10 07:30:33 PDT 2011


Hi Jack,

>      Here are the complete benchmarks rerun against gcc 4.5.4 built with...

thanks for these great numbers.  It is interesting to see that dropping the LLVM
IR optimization level to 2 makes no difference to the run-times.  As a radical
experiment I just committed a patch to dragonegg (commit 132846) that disables
all heavy LLVM optimizations when the GCC optimizers are enabled.  A few small
cleanups are run on each function, but otherwise only LLVM codegen (and codegen
optimizations) are done.  I did some measurements and this results in very fast
compile times.  But how does it impact run time?  Can you please benchmark
run times with -fplugin-arg-dragonegg-enable-gcc-optzns and this patch applied?
Please don't use the -fplugin-arg-dragonegg-llvm-ir-optimize option, since that
turns on heavy LLVM IR optimizations again.  If it has no impact on run times
then that would suggest that LLVM's IR level optimizers are not doing any useful
optimization: GCC already got everything.  If it does have an impact then that
suggests that LLVM is picking up stuff that GCC missed.  I can't wait to see!
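
The "no difference" observation can be checked directly against the quoted
run-time table.  Here is a minimal sketch in Python, with columns B and D copied
from the numbers below.  The 2% threshold is an arbitrary assumption about what
counts as measurement noise, not something that was measured.

```python
# Run times (seconds) copied from the quoted run-time table.
# B: dragonegg + GCC optimizers, no -fplugin-arg-dragonegg-llvm-ir-optimize
# D: dragonegg + GCC optimizers, -fplugin-arg-dragonegg-llvm-ir-optimize=2
run_b = {
    "ac": 9.11, "aermod": 16.18, "air": 6.58, "capacita": 39.86,
    "channel": 2.04, "doduc": 28.50, "fatigue": 7.09, "gas_dyn": 9.92,
    "induct": 20.76, "linpk": 15.41, "mdbx": 11.72, "nf": 28.52,
    "protein": 38.72, "rnflow": 28.18, "test_fpu": 11.23, "tfft": 1.95,
}
run_d = {
    "ac": 9.12, "aermod": 16.30, "air": 6.51, "capacita": 39.58,
    "channel": 2.04, "doduc": 28.53, "fatigue": 7.06, "gas_dyn": 9.96,
    "induct": 20.78, "linpk": 15.41, "mdbx": 11.72, "nf": 28.42,
    "protein": 38.75, "rnflow": 28.15, "test_fpu": 11.17, "tfft": 1.95,
}

NOISE = 0.02  # assumed noise threshold (hypothetical, not measured)


def relative_change(base, other):
    """Per-benchmark relative change of `other` with respect to `base`."""
    return {name: other[name] / base[name] - 1.0 for name in base}


changes = relative_change(run_b, run_d)
worst = max(changes, key=lambda name: abs(changes[name]))

for name, change in sorted(changes.items()):
    print(f"{name:10s} {change:+.2%}")
print("largest change:", worst, f"{changes[worst]:+.2%}")
print("within assumed noise:", all(abs(c) < NOISE for c in changes.values()))
```

The same comparison, run on the numbers from the new patch, would show whether
any benchmark moves by more than the chosen threshold.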

Thanks a lot, Duncan.

>
> Using built-in specs.
> COLLECT_GCC=gfortran-fsf-4.5
> COLLECT_LTO_WRAPPER=/sw/lib/gcc4.5/libexec/gcc/x86_64-apple-darwin11.0.0/4.5.4/lto-wrapper
> Target: x86_64-apple-darwin11.0.0
> Configured with: ../gcc-4.5.4/configure --prefix=/sw --prefix=/sw/lib/gcc4.5 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.5/info --enable-languages=c,c++,fortran,objc,obj-c++,java --with-gmp=/sw --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.5 --enable-lto --enable-checking=release
> Thread model: posix
> gcc version 4.5.4 20110608 (prerelease) (GCC)
>
> x86_64 darwin
>
> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
> D) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns -fplugin-arg-dragonegg-llvm-ir-optimize=2
> E) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-llvm-ir-optimize=2
>
> Run Time (seconds)
> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/   D) gcc 4.5.4/       E) gcc 4.5.4/
>                gcc 4.5.4   dragonegg/optzns    dragonegg    dragonegg/optzns/   dragonegg/optimize=2
>                                                             optimize=2
>
> ac             9.58          9.11             12.28          9.12               12.73
> aermod        20.99         16.18             17.86         16.30               17.89
> air            6.06          6.58              7.69          6.51                7.64
> capacita      35.76         39.86             46.10         39.58               45.89
> channel        2.03          2.04              1.96          2.04                1.96
> doduc         28.16         28.50             30.34         28.53               30.42
> fatigue        8.12          7.09             10.34          7.06               10.25
> gas_dyn       10.16          9.92             11.67          9.96               11.81
> induct        20.14         20.76             48.75         20.78               48.75
> linpk         15.43         15.41             15.64         15.41               15.64
> mdbx          11.41         11.72             12.11         11.72               12.07
> nf            27.90         28.52             29.26         28.42               29.13
> protein       38.65         38.72             41.31         38.75               39.49
> rnflow        27.22         28.18             31.81         28.15               31.98
> test_fpu      11.49         11.23             11.57         11.17               11.52
> tfft           1.91          1.95              2.15          1.95                2.16
>
> Mean          12.72         12.60             14.73         12.59               14.72
>
> Compile Time (seconds)
> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/   D) gcc 4.5.4/       E) gcc 4.5.4/
>                gcc 4.5.4   dragonegg/optzns    dragonegg    dragonegg/optzns/   dragonegg/optimize=2
>                                                             optimize=2
>
> ac             0.86          0.44             0.31          0.41                0.28
> aermod        31.13         25.81            20.94         25.44               20.87
> air            1.74          1.48             0.81          1.46                0.78
> capacita       0.86          0.74             0.44          0.71                0.42
> channel        0.35          0.32             0.23          0.30                0.23
> doduc          3.08          2.63             1.63          2.60                1.58
> fatigue        1.04          1.05             0.89          0.90                0.70
> gas_dyn        0.94          0.94             0.75          0.84                0.62
> induct         3.30          2.52             1.84          2.36                1.66
> linpk          0.33          0.28             0.20          0.28                0.20
> mdbx           1.09          1.02             0.60          0.99                0.59
> nf             0.41          0.40             0.28          0.40                0.28
> protein        1.56          1.28             0.98          1.21                0.82
> rnflow         1.75          1.70             1.24          1.61                1.13
> test_fpu       1.38          1.41             1.05          1.31                0.95
> tfft           0.31          0.28             0.19          0.28                0.19
>
> Executable Size (bytes)
> Benchmark     A) stock    B) gcc 4.5.4/    C) gcc 4.5.4/   D) gcc 4.5.4/       E) gcc 4.5.4/
>                gcc 4.5.4   dragonegg/optzns    dragonegg    dragonegg/optzns/   dragonegg/optimize=2
>                                                             optimize=2
>
> ac              26344       30896             26704          30896              26824
> aermod        1145924     1043816           1052056        1027680            1031880
> air             57404       57700             53532          53556              53532
> capacita        40864       41008             37064          41008              37064
> channel         22448       22664             22664          22664              22664
> doduc          127340      124108            120124         124372             120484
> fatigue         61152       65352             65664          61256              61568
> gas_dyn        647864       58768             59024          54672              54960
> induct         162360      180440            175312         168304             163176
> linpk           18112       18848             18864          18848              18896
> mdbx            53464       57652             49516          57652              49516
> nf              22560       23784             24080          23784              24080
> protein         74320       74440             74816          70344              66624
> rnflow          66040       71488             71648          67416              67616
> test_fpu        52624       58224             58320          54128              54256
> tfft            18416       18456             18600          18456              18600
>
>
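
The "Mean" row in the run-time table appears to be a geometric mean rather than
an arithmetic one.  The post does not say so; this is inferred from the numbers.
A short Python check for columns A and B, with the values copied from the table:

```python
import math

# Run times (seconds) copied from the quoted run-time table, in table order.
run_a = [9.58, 20.99, 6.06, 35.76, 2.03, 28.16, 8.12, 10.16,
         20.14, 15.43, 11.41, 27.90, 38.65, 27.22, 11.49, 1.91]
run_b = [9.11, 16.18, 6.58, 39.86, 2.04, 28.50, 7.09, 9.92,
         20.76, 15.41, 11.72, 28.52, 38.72, 28.18, 11.23, 1.95]


def geometric_mean(values):
    """Geometric mean computed via logarithms to avoid a huge product."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def arithmetic_mean(values):
    return sum(values) / len(values)


for label, times in (("A", run_a), ("B", run_b)):
    print(label,
          f"geometric={geometric_mean(times):.2f}",
          f"arithmetic={arithmetic_mean(times):.2f}")
```

The geometric means agree with the 12.72 and 12.60 reported for A and B, while
the arithmetic means come out near 17.2 for both columns.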



