[LLVMdev] Polyhedron 2005 results for dragonegg 3.3svn

Sat Jun 1 12:34:16 PDT 2013

On Sat, Jun 01, 2013 at 06:45:48AM +0200, Duncan Sands wrote:
>
> These results are very disappointing, I was hoping to see a big improvement
> somewhere instead of no real improvement anywhere (except for gas_dyn) or a
> regression (eg: mdbx).  I think LLVM now has a reasonable array of fast-math
> optimizations.  I will try to find time to poke at gas_dyn and induct: since
> turning on gcc's optimizations there halve the run-time, LLVM's IR optimizers
> are clearly missing something important.
>
> Ciao, Duncan.

Duncan,
   Appended is another set of benchmark runs in which I attempted to decouple the
fast math optimizations from the vectorization by passing -fno-tree-vectorize.
I am unsure whether dragonegg really honors -fno-tree-vectorize to disable the llvm
vectorization.

Tested on x86_apple-darwin12

Compile Flags: -ffast-math -funroll-loops -O3 -fno-tree-vectorize

de-gfc48: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs
de-gfc48+optzns: /sw/lib/gcc4.8/bin/gfortran -fplugin=/sw/lib/gcc4.8/lib/dragonegg.so -specs=/sw/lib/gcc4.8/lib/integrated-as.specs -fplugin-arg-dragonegg-enable-gcc-optzns
gfortran48: /sw/bin/gfortran-fsf-4.8

Run time (secs)

Benchmark     de-gfc48  de-gfc48   gfortran48
                        +optzns 

ac             11.33      8.10       8.02 
aermod         16.03     14.45      16.13
air             6.80      5.28       5.73
capacita       39.89     35.21      34.96
channel         2.06      2.29       2.69 
doduc          27.35     26.13      25.74
fatigue         8.83      4.82       4.67
gas_dyn        11.41      9.79       9.60
induct         23.95     21.75      21.14
linpk          15.49     15.48      15.69
mdbx           11.91     11.28      11.39
nf             29.92     29.57      27.99
protein        36.34     33.94      31.91
rnflow         25.97     25.27      22.78
test_fpu       11.48     10.91       9.64
tfft            1.92      1.91       1.91 

Geom. Mean     13.12     11.70      11.64
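
As a cross-check, the geometric means can be recomputed from the run times in the table. The following is only a minimal Python sketch; the names `times`, `geom_mean` and `means` are illustrative and are not part of the original benchmark scripts.

```python
import math

# Run times (secs) copied from the table above, in column order:
# (de-gfc48, de-gfc48+optzns, gfortran48)
times = {
    "ac":       (11.33,  8.10,  8.02),
    "aermod":   (16.03, 14.45, 16.13),
    "air":      ( 6.80,  5.28,  5.73),
    "capacita": (39.89, 35.21, 34.96),
    "channel":  ( 2.06,  2.29,  2.69),
    "doduc":    (27.35, 26.13, 25.74),
    "fatigue":  ( 8.83,  4.82,  4.67),
    "gas_dyn":  (11.41,  9.79,  9.60),
    "induct":   (23.95, 21.75, 21.14),
    "linpk":    (15.49, 15.48, 15.69),
    "mdbx":     (11.91, 11.28, 11.39),
    "nf":       (29.92, 29.57, 27.99),
    "protein":  (36.34, 33.94, 31.91),
    "rnflow":   (25.97, 25.27, 22.78),
    "test_fpu": (11.48, 10.91,  9.64),
    "tfft":     ( 1.92,  1.91,  1.91),
}

COLUMNS = ("de-gfc48", "de-gfc48+optzns", "gfortran48")


def geom_mean(values):
    """Geometric mean, computed as exp of the mean of the logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# One geometric mean per compiler configuration.
means = {
    name: geom_mean(row[i] for row in times.values())
    for i, name in enumerate(COLUMNS)
}

for name in COLUMNS:
    print(f"{name:16s} {means[name]:6.2f}")
```

Dividing `means["de-gfc48"]` by `means["gfortran48"]` gives a single overall slowdown ratio for stock dragonegg relative to gfortran, which may be easier to track across runs than the per-benchmark numbers.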

Assuming that the de-gfc48+optzns run really has disabled the llvm vectorization,
I am hoping that additional benchmarking of de-gfc48+optzns with individual
-ffast-math optimizations disabled (such as passing -fno-unsafe-math-optimizations)
may give us a clue as to the origin of the performance delta between the stock
dragonegg results with -ffast-math and those with -fplugin-arg-dragonegg-enable-gcc-optzns.
      Jack
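
The per-flag experiment proposed above could be driven by a small script that generates one set of compile flags per disabled -ffast-math component. This is only a sketch: the list of components follows the GCC manual's description of what -ffast-math sets, the names `BASE_FLAGS`, `FAST_MATH_REVERTS` and `flag_variants` are hypothetical, and whether dragonegg honors each of these options is an assumption that would need to be checked.

```python
# Flags used for the runs reported above.
BASE_FLAGS = ["-ffast-math", "-funroll-loops", "-O3", "-fno-tree-vectorize"]

# Options that revert, one at a time, the settings the GCC manual lists
# under -ffast-math. Appending one after -ffast-math on the command line
# should override just that component (assumption: dragonegg passes the
# resulting setting through to LLVM, which is unverified).
FAST_MATH_REVERTS = [
    "-fmath-errno",
    "-fno-unsafe-math-optimizations",
    "-fno-finite-math-only",
    "-frounding-math",
    "-fsignaling-nans",
    "-fno-cx-limited-range",
]


def flag_variants(base=BASE_FLAGS, reverts=FAST_MATH_REVERTS):
    """Return one flag list per reverted -ffast-math component."""
    return [list(base) + [revert] for revert in reverts]


for variant in flag_variants():
    # Each line is a candidate "Compile Flags" setting for one benchmark run.
    print(" ".join(variant))
```

Each printed line can be substituted for the compile flags of the de-gfc48 and de-gfc48+optzns configurations; the component whose removal closes (or widens) the gap between the two is the one worth examining first.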