[LLVMdev] pb05 results for current llvm/dragonegg

Jack Howarth howarth at bromo.med.uc.edu
Tue Apr 3 13:50:59 PDT 2012

  Attached are the Polyhedron 2005 benchmark results for current llvm/dragonegg svn
on x86_64-apple-darwin11, built against Xcode 4.3.2 and FSF gcc 4.6.3. The results
for -msse3 and -msse4 appear identical (at least for degg+optnz). This is fortunate,
since there seems to be a bug with -msse4 on the 2.33 GHz (T7600) Intel Core 2 Duo
Merom (http://llvm.org/bugs/show_bug.cgi?id=12434). I've added two entries to the
table. The first, degg+novect+optnz, should show the optimizations achieved by
-fplugin-arg-dragonegg-enable-gcc-optzns when autovectorization by FSF gcc is
disabled. This shows the optimization opportunities, other than autovectorization,
that are currently missed at the LLVM IR level. The second, degg+fullvect+optnz,
uses the new LLVM autovectorization option with all of its related options set.
It shows mixed results: some benchmarks improve over the simple
-fplugin-arg-dragonegg-llvm-option=-vectorize and others get slower.
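
As a rough illustration of the "mixed results" comparison, here is a minimal Python sketch that labels a few benchmarks as improved or worsened when going from degg+vectorize to degg+fullvect+optnz. The run times are copied from the Ave Run table in this message; the function names and the choice of rows are mine and purely illustrative.

```python
# Run times (seconds) copied from two columns of the "Ave Run" table:
# (degg+vectorize, degg+fullvect+optnz). Only a handful of rows are included.
run_times = {
    "ac": (12.45, 10.89),
    "air": (7.11, 8.15),
    "fatigue": (9.03, 10.49),
    "mdbx": (12.22, 11.50),
    "test_fpu": (11.35, 10.19),
}


def classify(simple, full):
    """Label a benchmark by whether the full option set ran faster."""
    if full < simple:
        return "improved"
    if full > simple:
        return "worsened"
    return "unchanged"


def summarize(times):
    """Map each benchmark name to (label, percent change in run time)."""
    summary = {}
    for name, (simple, full) in times.items():
        change = 100.0 * (full - simple) / simple
        summary[name] = (classify(simple, full), change)
    return summary


if __name__ == "__main__":
    for name, (label, change) in summarize(run_times).items():
        print(f"{name:10s} {label:10s} {change:+.1f}%")
```

A negative percentage means the full option set ran faster than the simple -vectorize build.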

llvm/dragonegg r153877

de-gfortran46 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n

de-gfortran46 -msse3 -ffast-math -funroll-loops -O3 -fplugin-arg-dragonegg-llvm-option=-vectorize %n.f90 -o %n

de-gfortran46 -msse3 -ffast-math -funroll-loops -O3 -fplugin-arg-dragonegg-enable-gcc-optzns %n.f90 -o %n

gfortran-fsf-4.6 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n

de-gfortran46 -msse3 -ffast-math -funroll-loops -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-enable-gcc-optzns %n.f90 -o %n

de-gfortran46 -msse3 -ffast-math -funroll-loops -O3 -fno-tree-vectorize -fplugin-arg-dragonegg-llvm-option=-vectorize -fplugin-arg-dragonegg-llvm-option=-unroll-allow-partial -fplugin-arg-dragonegg-llvm-option=-unroll-runtime -fplugin-arg-dragonegg-llvm-option=-bb-vectorize-aligned-only -fplugin-arg-dragonegg-llvm-option=-bb-vectorize-no-ints %n.f90 -o %n
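
For anyone scripting these builds outside the benchmark harness, the following is a minimal Python sketch of expanding such a template into an argument list. It assumes that %n stands for the benchmark name (so that ac becomes ac.f90 and the output ac); the dictionary keys and the expand() helper are hypothetical names, not part of any tool mentioned here.

```python
import shlex

# Two of the command templates from the list above, copied verbatim.
TEMPLATES = {
    "dragonegg": "de-gfortran46 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n",
    "gfortran": "gfortran-fsf-4.6 -msse3 -ffast-math -funroll-loops -O3 %n.f90 -o %n",
}


def expand(template, benchmark):
    """Replace every %n with the benchmark name and split into argv.

    Assumption: %n is a placeholder for the benchmark name.
    """
    return shlex.split(template.replace("%n", benchmark))


if __name__ == "__main__":
    for label, template in TEMPLATES.items():
        print(label, expand(template, "ac"))
```

The resulting list could be handed to subprocess.run() in a directory holding the benchmark sources.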

Ave Run (secs)
               dragonegg degg+vectorize degg+optnz  gfortran degg+novect+optnz degg+fullvect+optnz
ac               12.45       12.45         8.85       8.80       8.90              10.89
aermod           16.15       16.05        14.80      17.48      14.12              15.84
air               7.10        7.11         6.46       5.50       6.46               8.15
capacita         40.00       39.96        37.72      32.62      39.38              39.94
channel           2.16        2.15         1.99       1.84       2.15               2.56
doduc            29.13       28.41        27.48      26.74      28.27              29.05
fatigue           8.75        9.03         8.11       8.44       7.28              10.49
gas_dyn          11.72       11.80         4.47       4.26      10.02              11.63
induct           24.02       24.91        12.08      13.65      20.54              24.68
linpk            15.40       15.78        15.74      15.45      15.39              15.46
mdbx             11.80       12.22        11.86      11.20      11.82              11.50
nf               28.45       28.50        29.25      27.91      29.17              28.16
protein          38.15       39.26        37.87      32.49      39.08              38.62
rnflow           32.25       32.35        26.47      24.06      28.75              31.05 
test_fpu         11.34       11.35         9.31       8.04      10.88              10.19
tftt              1.91        1.92         1.93       1.87       1.94               1.90 

Geometric Mean   13.50       13.62        11.34      10.87      12.53              13.65
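
The Geometric Mean row can be rechecked with a short Python sketch. It assumes the row is the plain geometric mean (the n-th root of the product) of the 16 run times in a column; the values below are the dragonegg column copied from the table, and the result should come out close to the 13.50 reported there.

```python
import math


def geometric_mean(values):
    """Geometric mean computed via logs to avoid a very large product."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# dragonegg column of the "Ave Run (secs)" table, in row order (ac .. tftt).
dragonegg_run = [
    12.45, 16.15, 7.10, 40.00, 2.16, 29.13, 8.75, 11.72,
    24.02, 15.40, 11.80, 28.45, 38.15, 32.25, 11.34, 1.91,
]

if __name__ == "__main__":
    print(f"{geometric_mean(dragonegg_run):.2f}")
```

The geometric mean is used rather than the arithmetic mean so that long-running benchmarks such as capacita and protein do not dominate the summary.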

Compile (secs)
               dragonegg degg+vectorize degg+optnz  gfortran degg+novect+optnz degg+fullvect+optnz
ac                0.33        0.38         0.72       1.27       0.71               0.39
aermod           25.91       27.58        32.34      43.91      25.13              23.62
air               1.07        1.25         1.52       2.25       1.36               1.34
capacita          0.49        0.52         0.89       1.71       0.71               0.98
channel           0.29        0.36         0.50       0.62       0.42               0.49
doduc             1.71        4.50         3.25       5.34       2.75               5.42
fatigue           0.84        0.97         1.19       1.76       1.00               1.24
gas_dyn           0.67        0.68         1.20       3.02       0.90               1.81
induct            1.60        2.14         2.82       3.99       2.53               2.15
linpk             0.22        0.24         0.47       0.78       0.30               0.46
mdbx              0.63        0.77         1.16       1.85       0.99               1.12
nf                0.37        0.40         0.70       1.66       0.42               1.22
protein           0.93        1.02         1.75       4.01       1.40               2.73
rnflow            1.20        1.25         2.63       5.44       1.72               2.85 
test_fpu          0.88        0.92         2.13       4.39       1.26               2.38
tftt              0.21        0.24         0.34       0.56       0.30               0.27  

Executable (bytes)
               dragonegg degg+vectorize  degg+optnz  gfortran degg+novect+optnz degg+fullvect+optnz
ac                26856       26856        39120      50968      39120             35144
aermod          1043700     1055988      1046288    1265640    1013488           1146196
air               62004       62004        53740      73988      53740             78392
capacita          41416       41416        45552      73896      41416             70096
channel           22808       22808        26768      34784      22672             34984
doduc            128448      128448       136996     197240     128868            173512
fatigue           69824       69824        69840      86080      65712             78016
gas_dyn           59112       59112        67416     119744      59160             91952
induct           163152      167248       167344     174976     176696            179552
linpk             18752       18752        27056      38648      18904             31200 
mdbx              53692       53692        57884      82112      53788             70080
nf                23960       23960        32104      71800      23912             48568
protein           75032       75032        87208     132040      78912            132376 
rnflow            71896       71896        96632     181120      67928            137528 
test_fpu          54272       54272        78776     155072      50144            111640
tftt              18640       18640        18488      30768      18488             22744
