[LLVMdev] -fplugin-arg-dragonegg-enable-gcc-optzns status
Jack Howarth
howarth at bromo.med.uc.edu
Fri Jun 10 07:00:36 PDT 2011
On Thu, Jun 09, 2011 at 08:47:26PM -0400, Jack Howarth wrote:
> Duncan,
> Here are the complete benchmarks rerun against gcc 4.5.4 built with...
>
> Using built-in specs.
> COLLECT_GCC=gfortran-fsf-4.5
> COLLECT_LTO_WRAPPER=/sw/lib/gcc4.5/libexec/gcc/x86_64-apple-darwin11.0.0/4.5.4/lto-wrapper
> Target: x86_64-apple-darwin11.0.0
> Configured with: ../gcc-4.5.4/configure --prefix=/sw --prefix=/sw/lib/gcc4.5 --mandir=/sw/share/man --infodir=/sw/lib/gcc4.5/info --enable-languages=c,c++,fortran,objc,obj-c++,java --with-gmp=/sw --with-libiconv-prefix=/sw --with-ppl=/sw --with-cloog=/sw --with-mpc=/sw --with-system-zlib --x-includes=/usr/X11R6/include --x-libraries=/usr/X11R6/lib --program-suffix=-fsf-4.5 --enable-lto --enable-checking=release
> Thread model: posix
> gcc version 4.5.4 20110608 (prerelease) (GCC)
>
> x86_64 darwin
>
> A) gcc 4.5.4svn using -msse3 -ffast-math -O3 -fno-tree-vectorize
> B) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns
> C) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize
> D) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns -fplugin-arg-dragonegg-llvm-ir-optimize=2
> E) gcc 4.5.4svn/dragonegg using -msse3 -ffast-math -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-llvm-ir-optimize=2
>
> Run Time (seconds)
> Benchmark   A) stock     B) gcc 4.5.4/     C) gcc 4.5.4/  D) gcc 4.5.4/      E) gcc 4.5.4/
>             gcc 4.5.4    dragonegg/optzns  dragonegg      dragonegg/optzns/  dragonegg/optimize=2
>                                                           optimize=2
>
> ac           9.58         9.11             12.28           9.12              12.73
> aermod      20.99        16.18             17.86          16.30              17.89
> air          6.06         6.58              7.69           6.51               7.64
> capacita    35.76        39.86             46.10          39.58              45.89
> channel      2.03         2.04              1.96           2.04               1.96
> doduc       28.16        28.50             30.34          28.53              30.42
> fatigue      8.12         7.09             10.34           7.06              10.25
> gas_dyn     10.16         9.92             11.67           9.96              11.81
> induct      20.14        20.76             48.75          20.78              48.75
> linpk       15.43        15.41             15.64          15.41              15.64
> mdbx        11.41        11.72             12.11          11.72              12.07
> nf          27.90        28.52             29.26          28.42              29.13
> protein     38.65        38.72             41.31          38.75              39.49
> rnflow      27.22        28.18             31.81          28.15              31.98
> test_fpu    11.49        11.23             11.57          11.17              11.52
> tfft         1.91         1.95              2.15           1.95               2.16
>
> Mean        12.72        12.60             14.73          12.59              14.72
>
> Compile Time (seconds)
> Benchmark   A) stock     B) gcc 4.5.4/     C) gcc 4.5.4/  D) gcc 4.5.4/      E) gcc 4.5.4/
>             gcc 4.5.4    dragonegg/optzns  dragonegg      dragonegg/optzns/  dragonegg/optimize=2
>                                                           optimize=2
>
> ac           0.86         0.44              0.31           0.41               0.28
> aermod      31.13        25.81             20.94          25.44              20.87
> air          1.74         1.48              0.81           1.46               0.78
> capacita     0.86         0.74              0.44           0.71               0.42
> channel      0.35         0.32              0.23           0.30               0.23
> doduc        3.08         2.63              1.63           2.60               1.58
> fatigue      1.04         1.05              0.89           0.90               0.70
> gas_dyn      0.94         0.94              0.75           0.84               0.62
> induct       3.30         2.52              1.84           2.36               1.66
> linpk        0.33         0.28              0.20           0.28               0.20
> mdbx         1.09         1.02              0.60           0.99               0.59
> nf           0.41         0.40              0.28           0.40               0.28
> protein      1.56         1.28              0.98           1.21               0.82
> rnflow       1.75         1.70              1.24           1.61               1.13
> test_fpu     1.38         1.41              1.05           1.31               0.95
> tfft         0.31         0.28              0.19           0.28               0.19
  mean         3.13         2.64              2.02           2.57               1.96
Duncan,
These numbers are from release builds of both FSF gcc 4.5.4 and llvm. It seems that
-fplugin-arg-dragonegg-llvm-ir-optimize=2 gives a small reduction in compile time (mean 2.64
down to 2.57 seconds with -fplugin-arg-dragonegg-enable-gcc-optzns) that partly offsets the
increase from -fplugin-arg-dragonegg-enable-gcc-optzns at -O3 -ffast-math. It also appears that
with -fplugin-arg-dragonegg-llvm-ir-optimize=2, adding -fplugin-arg-dragonegg-enable-gcc-optzns
raises the mean compile time from 1.96 to 2.57 seconds at -O3 -ffast-math, a 24% difference
measured against the slower build. That is very close to the 23% difference seen without
-fplugin-arg-dragonegg-llvm-ir-optimize=2 (2.02 versus 2.64 seconds). We should rebenchmark pb05
with -O2 -ffast-math to see whether -fplugin-arg-dragonegg-enable-gcc-optzns has the same impact
on compile times there. IMHO, if -fplugin-arg-dragonegg-enable-gcc-optzns has less effect at -O2,
it might make sense to enable it by default in dragonegg. That is, if the compile-time
regressions are mainly at -O3, they would be tolerable because the run time of the resulting
binaries should matter more there.
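
The 24% and 23% figures can be reproduced from the mean compile times in the table. They
seem to be the difference divided by the slower (optzns) time; that is an inference from the
numbers, not something stated in this thread. A minimal Python sketch, where the function
names are purely illustrative, shows both ways of expressing the change:

```python
# Mean compile times (seconds) copied from the "Compile Time" table.
mean_compile = {
    "A": 3.13,  # stock gcc 4.5.4
    "B": 2.64,  # dragonegg + enable-gcc-optzns
    "C": 2.02,  # dragonegg
    "D": 2.57,  # dragonegg + enable-gcc-optzns + llvm-ir-optimize=2
    "E": 1.96,  # dragonegg + llvm-ir-optimize=2
}


def pct_of_slower(slow, fast):
    """Difference expressed as a percentage of the slower (larger) time."""
    return 100.0 * (slow - fast) / slow


def pct_of_faster(slow, fast):
    """Difference expressed as a percentage of the faster (smaller) time."""
    return 100.0 * (slow - fast) / fast


# Compare each optzns build against its counterpart without optzns.
for label, with_optzns, without in (
    ("with llvm-ir-optimize=2", "D", "E"),
    ("without llvm-ir-optimize=2", "B", "C"),
):
    slow = mean_compile[with_optzns]
    fast = mean_compile[without]
    print(
        f"{label}: {pct_of_slower(slow, fast):.1f}% of the slower time, "
        f"{pct_of_faster(slow, fast):.1f}% of the faster time"
    )
```

Measured against the faster build instead, the same means give an increase of roughly 31% in
both cases, so the choice of baseline matters when quoting the cost of the option.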
Jack
>
> Executable Size (bytes)
> Benchmark   A) stock     B) gcc 4.5.4/     C) gcc 4.5.4/  D) gcc 4.5.4/      E) gcc 4.5.4/
>             gcc 4.5.4    dragonegg/optzns  dragonegg      dragonegg/optzns/  dragonegg/optimize=2
>                                                           optimize=2
>
> ac            26344        30896             26704          30896              26824
> aermod      1145924      1043816           1052056        1027680            1031880
> air           57404        57700             53532          53556              53532
> capacita      40864        41008             37064          41008              37064
> channel       22448        22664             22664          22664              22664
> doduc        127340       124108            120124         124372             120484
> fatigue       61152        65352             65664          61256              61568
> gas_dyn      647864        58768             59024          54672              54960
> induct       162360       180440            175312         168304             163176
> linpk         18112        18848             18864          18848              18896
> mdbx          53464        57652             49516          57652              49516
> nf            22560        23784             24080          23784              24080
> protein       74320        74440             74816          70344              66624
> rnflow        66040        71488             71648          67416              67616
> test_fpu      52624        58224             58320          54128              54256
> tfft          18416        18456             18600          18456              18600