[llvm-commits] [test-suite] r49153 - /test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp

Owen Anderson resistor at mac.com
Thu Apr 3 00:25:34 PDT 2008


Author: resistor
Date: Thu Apr  3 02:25:34 2008
New Revision: 49153

URL: http://llvm.org/viewvc/llvm-project?rev=49153&view=rev
Log:
Add a version of the Livermore Loops benchmark.

Added:
    test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp   (with props)

Added: test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp
URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Benchmarks/Misc-C%2B%2B/llloops.cpp?rev=49153&view=auto

==============================================================================
--- test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp (added)
+++ test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp Thu Apr  3 02:25:34 2008
@@ -0,0 +1,2219 @@
+/************************************************************************
+ *                                                                      *
+ * L. L. N. L.  " C "  K E R N E L S:  M F L O P S  P C  V E R S I O N  *
+ *                                                                      *
+ *     These kernels measure   " C "   numerical computation            *
+ *     rates for  a  spectrum  of  cpu-limited computational            *
+ *     structures or benchmarks.   Mathematical  through-put            *
+ *     is measured  in  units  of millions of floating-point            *
+ *     operations executed per second, called Megaflops/sec.            *
+ *                                                                      *
+ ************************************************************************
+ *     Originally from  Greg Astfalk, AT&T, P.O.Box 900, Princeton,     *
+ *           NJ. 08540.    by way of Frank McMahon (LLNL).              *
+ *                                                                      *
+ *   Modifications by Tim Peters, Kendall Square Res. Corp. Oct 92.     *
+ *                                                                      *
+ *       This version by Roy Longbottom (retired, ex-CCTA UK)           *
+ *             Roy_Longbottom 101323.2241 at compuserve.com                *
+ *                          March 1996                                  *
+ *                                                                      *
+ *                               REFERENCE                              *
+ *                                                                      *
+ *              F.H.McMahon,   The Livermore Fortran Kernels:           *
+ *              A Computer Test Of The Numerical Performance Range,     *
+ *              Lawrence Livermore National Laboratory,                 *
+ *              Livermore, California, UCRL-53745, December 1986.       *
+ *                                                                      *
+ *       from:  National Technical Information Service                  *
+ *              U.S. Department of Commerce                             *
+ *              5285 Port Royal Road                                    *
+ *              Springfield, VA.  22161                                 *
+ *                                                                      *
+ ************************************************************************           
+ *  The standard "C" code accesses the FORTRAN version for data         *
+ *  generation and result analysis. These features have been merged     *
+ *  to produce a program more suitable to run on PCs. FORTRAN features  *
+ *  for detailed statistical analysis of the results have been omitted. *  
+ *                                                                      *
+ *        Changes to "C" code to produce correct results:               *
+ *                                                                      *
+ *        Kernel 2  change i = ipntp - 1; to i = ipntp;                 *
+ *        Kernel 7  third line of inner loop change r to q              *
+ ************************************************************************
+ *  Because of the inaccuracy of the PC clock, this version arranges    
+ *  for timing to be based on at least five seconds.
+ *
+ *  The kernels are executed as follows:
+ *
+ *                parameters(x);
+ *                do
+ *                 {
+ *                    execute kernel code
+ *
+ *                    endloop(x);
+ *                 }
+ *                while (count < loop);
+ *
+ *  Function parameters obtains the loop parameters, generates all the data
+ *  and makes a copy of it for use with extra loops. Timing is started at
+ *  the end of the function.
+ *
+ *  The variable loop has a defined number of passes (e.g. 7 for kernel 1,
+ *  long span - see Passes in table). This is multiplied by a further
+ *  constant for which checksums are defined - 200/400/1600 for long/medium
+ *  /short spans was chosen. The overhead of executing function endloop is
+ *  calculated as below. This is deducted from the total time but probably
+ *  could be ignored on PCs.
+ *
+ *  The running time for each loop is set to a minimum of five seconds via
+ *  repeating all loops until each has recorded at least 0.07 seconds (see
+ *  calibration below). The extra loops required are shown under E in the
+ *  tables. The data used in the loops is re-initialised from the copy in
+ *  function endloop for each of the extra loops. The worst case overhead
+ *  of this has been measured as less than 1% and is ignored. Note, the
+ *  alternative of summing the time for each set of count passes cannot
+ *  be relied upon when the time for one set is of the same order of
+ *  magnitude as the clock resolution (0.05 to 0.06 seconds). Calibration
+ *  also gives an indication of the linearity of timing. In the example
+ *  shown, the overhead of 24 occurrences of data generation, which is
+ *  excluded from the main timing, is about 0.6 seconds. 
+ *
+ *  The total floating point operations for the first kernel 1 results are
+ *  200 x 7 x 15 x 5 x 1001. For some other kernels, the total is not
+ *  proportional to the span.
+ *
+ *  The OK column in the tables indicates the number of correct significant
+ *  digits out of 16 compared with the defined checksums. 
+ *   
+ *                                                                      
+ *                  Example of Results
+ *
+ * L.L.N.L. 'C' KERNELS: MFLOPS   P.C.  VERSION
+ *
+ * Calculating outer loop overhead
+ *      1000 times   0.00 seconds
+ *     10000 times   0.00 seconds
+ *    100000 times   0.06 seconds
+ *   1000000 times   0.33 seconds
+ *   2000000 times   0.88 seconds
+ *   4000000 times   1.59 seconds
+ *   8000000 times   3.30 seconds
+ *  16000000 times   6.64 seconds
+ * Overhead for each loop  4.1500e-007 seconds
+ *
+ * Calibrating part 1 of 3
+ *
+ * Loop count          4  0.94 seconds
+ * Loop count         16  2.08 seconds
+ * Loop count         32  3.52 seconds
+ * Loop count         64  6.42 seconds
+ * Loop count        128 12.31 seconds
+ *
+ * Loops  200 x  1 x Passes
+ *
+ * Kernel       Floating Pt ops
+ * No  Passes E No    Total      Secs.  MFLOPS Span     Checksums          OK
+ * ------------ -- ------------- ----- ------- ---- ---------------------- --
+ *  1   7 x  15  5 1.051050e+008  5.10   20.60 1001 5.114652693224671e+004 16
+ *  2  67 x  21  4 1.091832e+008  5.20   20.98  101 1.539721811668384e+003 15
+ *  3   9 x  15  2 5.405400e+007  4.17   12.97 1001 1.000742883066364e+001 15
+ *  4  14 x  30  2 1.008000e+008  5.52   18.28 1001 5.999250595473891e-001 16
+ *  5  10 x  12  2 4.800000e+007  5.43    8.84 1001 4.548871642387267e+003 16
+ *  6   3 x  19  2 4.523520e+007  4.34   10.43   64 4.375116344729986e+003 16
+ *  7   4 x  10 16 1.273600e+008  4.45   28.64  995 6.104251075174761e+004 16
+ *  8  10 x   7 36 9.979200e+007  5.15   19.36  100 1.501268005625795e+005 15
+ *  9  36 x   6 17 7.417440e+007  5.20   14.26  101 1.189443609974981e+005 16
+ * 10  34 x   5  9 3.090600e+007  5.48    5.64  101 7.310369784325296e+004 16
+ * 11  11 x  15  1 3.300000e+007  5.65    5.84 1001 3.342910972650109e+007 16
+ * 12  12 x  30  1 7.200000e+007  6.50   11.08 1000 2.907141294167248e-005 16
+ * 13  36 x   4  7 1.290240e+007  6.41    2.01   64 1.202533961842804e+011 15
+ * 14   2 x   4 11 1.761760e+007  5.61    3.14 1001 3.165553044000335e+009 15
+ * 15   1 x  15 33 4.950000e+007  5.66    8.75  101 3.943816690352044e+004 15
+ * 16  25 x  30 10 7.950000e+007  6.14   12.95   75 5.650760000000000e+005 16
+ * 17  35 x   9  9 5.726700e+007  5.03   11.38  101 1.114641772902486e+003 16
+ * 18   2 x  11 44 9.583200e+007  5.76   16.64  100 1.015727037502299e+005 15
+ * 19  39 x  21  6 9.926280e+007  6.14   16.16  101 5.421816960147207e+002 16
+ * 20   1 x  15 26 7.800000e+007  5.93   13.16 1000 3.040644339351238e+007 16
+ * 21   1 x   1  2 2.525000e+007  6.37    3.96  101 1.597308280710200e+008 15
+ * 22  11 x  12 17 4.532880e+007  5.43    8.35  101 2.938604376566698e+002 15
+ * 23   8 x  12 11 1.045440e+008  5.10   20.49  100 3.549900501563624e+004 15
+ * 24   5 x  30  1 3.000000e+007  4.93    6.09 1001 5.000000000000000e+002 16
+ *
+ *                     Maximum   Rate   28.64 
+ *                     Average   Rate   12.50 
+ *                     Geometric Mean   10.50 
+ *                     Harmonic  Mean    8.25 
+ *                     Minimum   Rate    2.01 
+ *
+ *                     Do Span    471
+ *
+ * Calibrating part 2 of 3
+ *
+ * Loop count          8  0.88 seconds
+ * Loop count         32  1.86 seconds
+ * Loop count         64  3.19 seconds
+ * Loop count        128  5.77 seconds
+ * Loop count        256 10.93 seconds
+ *
+ * Loops  200 x  2 x Passes
+ *
+ * Kernel       Floating Pt ops
+ * No  Passes E No    Total      Secs.  MFLOPS Span     Checksums          OK
+ * ------------ -- ------------- ----- ------- ---- ---------------------- --
+ *  1  40 x  15  5 1.212000e+008  4.84   25.04  101 5.253344778937972e+002 16
+ *  2  40 x  20  4 1.241600e+008  5.91   21.02  101 1.539721811668384e+003 15
+ *  3  53 x  20  2 8.564800e+007  5.10   16.78  101 1.009741436578952e+000 16
+ *  4  70 x  32  2 1.075200e+008  4.29   25.07  101 5.999250595473891e-001 16
+ *  5  55 x  13  2 5.720000e+007  4.99   11.46  101 4.589031939600982e+001 16
+ *  6   7 x  19  2 5.107200e+007  4.70   10.87   32 8.631675645333210e+001 16
+ *  7  22 x  12 16 1.706496e+008  5.56   30.71  101 6.345586315784055e+002 16
+ *  8   6 x   6 36 1.026432e+008  5.26   19.50  100 1.501268005625795e+005 15
+ *  9  21 x   5 17 7.211400e+007  5.03   14.33  101 1.189443609974981e+005 16
+ * 10  19 x   5  9 3.454200e+007  6.13    5.63  101 7.310369784325296e+004 16
+ * 11  64 x  20  1 5.120000e+007  4.95   10.35  101 3.433560407475758e+004 16
+ * 12  68 x  20  1 5.440000e+007  5.16   10.53  100 7.127569130821465e-006 16
+ * 13  41 x   3  7 1.102080e+007  5.47    2.01   32 9.816387810944356e+010 15
+ * 14  10 x   4 11 1.777600e+007  5.49    3.24  101 3.039983465145392e+007 15
+ * 15   1 x   7 33 4.620000e+007  5.32    8.69  101 3.943816690352044e+004 15
+ * 16  27 x  21 10 6.350400e+007  5.02   12.66   40 6.480410000000000e+005 16
+ * 17  20 x   9  9 6.544800e+007  5.74   11.40  101 1.114641772902486e+003 16
+ * 18   1 x  10 44 8.712000e+007  5.22   16.69  100 1.015727037502299e+005 15
+ * 19  23 x  15  6 8.362800e+007  5.11   16.36  101 5.421816960147207e+002 16
+ * 20   8 x   9 26 7.488000e+007  5.43   13.80  100 3.126205178815432e+004 16
+ * 21   1 x   2  2 5.000000e+007  5.55    9.01   50 7.824524877232093e+007 16
+ * 22   7 x   9 17 4.326840e+007  5.21    8.31  101 2.938604376566698e+002 15
+ * 23   5 x   9 11 9.801000e+007  4.77   20.54  100 3.549900501563624e+004 15
+ * 24  31 x  30  1 3.720000e+007  6.06    6.14  101 5.000000000000000e+001 16
+ *
+ *                     Maximum   Rate   30.71 
+ *                     Average   Rate   13.76 
+ *                     Geometric Mean   11.69 
+ *                     Harmonic  Mean    9.19 
+ *                     Minimum   Rate    2.01 
+ *
+ *                     Do Span     90
+ *
+ * Calibrating part 3 of 3
+ *
+ * Loop count         32  0.77 seconds
+ * Loop count        128  1.54 seconds
+ * Loop count        256  2.47 seconds
+ * Loop count        512  4.34 seconds
+ * Loop count       1024  8.13 seconds
+ *
+ * Loops  200 x  8 x Passes
+ *
+ * Kernel       Floating Pt ops
+ * No  Passes E No    Total      Secs.  MFLOPS Span     Checksums          OK
+ * ------------ -- ------------- ----- ------- ---- ---------------------- --
+ *  1  28 x  22  5 1.330560e+008  5.31   25.05   27 3.855104502494961e+001 16
+ *  2  46 x  22  4 7.124480e+007  4.38   16.27   15 3.953296986903060e+001 16
+ *  3  37 x  23  2 7.352640e+007  4.26   17.24   27 2.699309089320672e-001 16
+ *  4  38 x  35  2 6.384000e+007  3.79   16.86   27 5.999250595473891e-001 16
+ *  5  40 x  23  2 7.654400e+007  4.45   17.20   27 3.182615248447483e+000 16
+ *  6  21 x  32  2 5.160960e+007  4.82   10.70    8 1.120309393467088e+000 15
+ *  7  20 x  12 16 1.290240e+008  4.24   30.43   21 2.845720217644024e+001 16
+ *  8   9 x   8 36 1.078272e+008  5.17   20.85   14 2.960543667875005e+003 15
+ *  9  26 x  16 17 1.697280e+008  5.33   31.82   15 2.623968460874250e+003 16
+ * 10  25 x  11  9 5.940000e+007  5.42   10.96   15 1.651291227698265e+003 16
+ * 11  46 x  22  1 4.209920e+007  3.67   11.48   27 6.551161335845770e+002 16
+ * 12  48 x  23  1 4.592640e+007  5.04    9.12   26 1.943435981130448e-006 16
+ * 13  31 x   4  7 1.111040e+007  5.57    2.00    8 3.847124199949431e+010 15
+ * 14   8 x   6 11 2.280960e+007  5.19    4.40   27 2.923540598672009e+006 15
+ * 15   1 x  15 33 5.544000e+007  6.14    9.03   15 1.108997288134785e+003 16
+ * 16  14 x  31 10 7.638400e+007  5.80   13.17   15 5.152160000000000e+005 16
+ * 17  26 x  11  9 6.177600e+007  5.14   12.02   15 2.947368618589360e+001 16
+ * 18   2 x  12 44 1.098240e+008  5.36   20.47   14 9.700646212337040e+002 16
+ * 19  28 x  21  6 8.467200e+007  5.38   15.74   15 1.268230698051004e+001 15
+ * 20   7 x  10 26 7.571200e+007  5.27   14.36   26 5.987713249475302e+002 16
+ * 21   1 x   2  2 8.000000e+007  5.50   14.55   20 5.009945671204667e+007 16
+ * 22   8 x  13 17 4.243200e+007  5.04    8.42   15 6.109968728263973e+000 16
+ * 23   7 x  15 11 1.201200e+008  4.38   27.42   14 4.850340602749970e+002 16
+ * 24  23 x  32  1 3.061760e+007  5.01    6.11   27 1.300000000000000e+001 16
+ *
+ *                     Maximum   Rate   31.82 
+ *                     Average   Rate   15.24 
+ *                     Geometric Mean   13.06 
+ *                     Harmonic  Mean   10.26 
+ *                     Minimum   Rate    2.00 
+ *
+ *                     Do Span     19
+ *
+ *                Overall
+ *
+ *                Part 1 weight 1
+ *                Part 2 weight 2
+ *                Part 3 weight 1
+ *
+ *                     Maximum   Rate   31.82 
+ *                     Average   Rate   13.81 
+ *                     Geometric Mean   11.70 
+ *                     Harmonic  Mean    9.17 
+ *                     Minimum   Rate    2.00 
+ *
+ *                     Do Span    167
+ *
+ * Enter the following data which will be filed with the results
+ *
+ * Month run       9/1996
+ * PC model        Escom
+ * CPU             Pentium
+ * Clock MHz       100
+ * Cache           256K
+ * Options         Neptune chipset
+ * OS/DOS          Windows 95
+ * Compiler        Watcom C/C++ Version 10.5
+ * OptLevel        Win386 -zp4 -otexan -om -fp5 -zc -5r
+ * Run by          Roy Longbottom
+ * From            UK
+ * Mail            101323.2241 at compuserve.com 
+ *
+ *  Note: the date, compiler and opt level are inserted by the program.
+ *
+ *  The tables of results and running details are appended to file
+ *  LLloops.txt.
+ *
+ *  When a single MFLOPS rating is claimed for this benchmark it is
+ *  usually the overall geometric mean result.
+ *
+ **********************************************************************
+ *
+ *  Pre-compiled codes were produced via a Watcom C/C++ 10.5 compiler. 
+ *  Versions are available for DOS, Windows 3/95 and NT/Win 95. Both
+ *  non-optimised and optimised programs are available. The latter have
+ *  options as in the above example.
+ *
+ *  In this source code, function prototypes are declared and function
+ *  headers have embedded parameter types to produce code for C and C++
+ *  at least suitable for compiling as such with the Watcom compiler.
+ *
+ ***********************************************************************  
+ */
+ 
+#include <stdio.h>
+#include <math.h>
+#include <stdlib.h>
+
+
+   struct Arrays
+   {
+     double U[1001];
+     double V[1001];
+     double W[1001];
+     double X[1001];
+     double Y[1001];
+     double Z[1001];
+     double G[1001];
+     double Du1[101];
+     double Du2[101];
+     double Du3[101];
+     double Grd[1001];
+     double Dex[1001];
+     double Xi[1001];
+     double Ex[1001];
+     double Ex1[1001];
+     double Dex1[1001];
+     double Vx[1001];
+     double Xx[1001];
+     double Rx[1001];
+     double Rh[2048];
+     double Vsp[101];
+     double Vstp[101];
+     double Vxne[101];
+     double Vxnd[101];
+     double Ve3[101];
+     double Vlr[101];
+     double Vlin[101];
+     double B5[101];
+     double Plan[300];
+     double D[300];
+     double Sa[101];
+     double Sb[101];     
+     double P[512][4];
+     double Px[101][25];
+     double Cx[101][25];
+     double Vy[25][101];
+     double Vh[7][101];
+     double Vf[7][101];
+     double Vg[7][101];
+     double Vs[7][101];
+     double Za[7][101];
+     double Zp[7][101];
+     double Zq[7][101];
+     double Zr[7][101];
+     double Zm[7][101];
+     double Zb[7][101];
+     double Zu[7][101];
+     double Zv[7][101];
+     double Zz[7][101];               
+     double B[64][64];
+     double C[64][64];
+     double H[64][64];     
+     double U1[2][101][5];
+     double U2[2][101][5];
+     double U3[2][101][5];
+     double Xtra[40];     
+     long   E[96];
+     long   F[96];
+     long   Ix[1001];
+     long   Ir[1001];
+     long   Zone[301];
+     double X0[1001];
+     double W0[1001];
+     double Px0[101][25];
+     double P0[512][4];
+     double H0[64][64];
+     double Rh0[2048];
+     double Vxne0[101];
+     double Zr0[7][101];
+     double Zu0[7][101];
+     double Zv0[7][101];
+     double Zz0[7][101];
+     double Za0[101][25];
+     double Stb50;               
+     double Xx0;
+
+     
+  }as1;
+   #define u        as1.U
+   #define v        as1.V
+   #define w        as1.W
+   #define x        as1.X
+   #define y        as1.Y
+   #define z        as1.Z
+   #define g        as1.G
+   #define du1      as1.Du1
+   #define du2      as1.Du2
+   #define du3      as1.Du3
+   #define grd      as1.Grd
+   #define dex      as1.Dex
+   #define xi       as1.Xi
+   #define ex       as1.Ex
+   #define ex1      as1.Ex1
+   #define dex1     as1.Dex1
+   #define vx       as1.Vx
+   #define xx       as1.Xx
+   #define rx       as1.Rx
+   #define rh       as1.Rh
+   #define vsp      as1.Vsp
+   #define vstp     as1.Vstp
+   #define vxne     as1.Vxne
+   #define vxnd     as1.Vxnd
+   #define ve3      as1.Ve3
+   #define vlr      as1.Vlr
+   #define vlin     as1.Vlin
+   #define b5       as1.B5
+   #define plan     as1.Plan
+   #define d        as1.D
+   #define sa       as1.Sa
+   #define sb       as1.Sb
+   #define p        as1.P
+   #define px       as1.Px
+   #define cx       as1.Cx
+   #define vy       as1.Vy
+   #define vh       as1.Vh
+   #define vf       as1.Vf
+   #define vg       as1.Vg
+   #define vs       as1.Vs
+   #define za       as1.Za
+   #define zb       as1.Zb
+   #define zp       as1.Zp
+   #define zq       as1.Zq
+   #define zr       as1.Zr
+   #define zm       as1.Zm
+   #define zz       as1.Zz
+   #define zu       as1.Zu
+   #define zv       as1.Zv
+   #define b        as1.B
+   #define c        as1.C
+   #define h        as1.H
+   #define u1       as1.U1
+   #define u2       as1.U2
+   #define u3       as1.U3
+   #define xtra     as1.Xtra
+   #define a11      as1.Xtra[1]
+   #define a12      as1.Xtra[2]
+   #define a13      as1.Xtra[3]
+   #define a21      as1.Xtra[4]
+   #define a22      as1.Xtra[5]
+   #define a23      as1.Xtra[6]
+   #define a31      as1.Xtra[7]
+   #define a32      as1.Xtra[8]
+   #define a33      as1.Xtra[9]
+   #define c0       as1.Xtra[12]
+   #define dk       as1.Xtra[15]
+   #define dm22     as1.Xtra[16]
+   #define dm23     as1.Xtra[17]
+   #define dm24     as1.Xtra[18]
+   #define dm25     as1.Xtra[19]
+   #define dm26     as1.Xtra[20]
+   #define dm27     as1.Xtra[21]
+   #define dm28     as1.Xtra[22]
+   #define expmax   as1.Xtra[26]
+   #define flx      as1.Xtra[27]
+   #define q        as1.Xtra[28]
+   #define r        as1.Xtra[30]
+   #define s        as1.Xtra[32]
+   #define sig      as1.Xtra[34]
+   #define stb5     as1.Xtra[35]
+   #define t        as1.Xtra[36]
+   #define xnm      as1.Xtra[39]   
+   #define e        as1.E  
+   #define f        as1.F
+   #define ix       as1.Ix
+   #define ir       as1.Ir
+   #define zone     as1.Zone
+   #define x0       as1.X0
+   #define w0       as1.W0
+   #define px0      as1.Px0
+   #define p0       as1.P0
+   #define h0       as1.H0
+   #define rh0      as1.Rh0
+   #define vxne0    as1.Vxne0
+   #define zr0      as1.Zr0
+   #define zu0      as1.Zu0
+   #define zv0      as1.Zv0
+   #define zz0      as1.Zz0
+   #define za0      as1.Za0
+   #define stb50    as1.Stb50
+   #define xx0      as1.Xx0              
+
+
+   struct Parameters
+   {
+       long   Inner_loops;
+       long   Outer_loops;
+       long   Loop_mult;
+       double Flops_per_loop;
+       double Sumcheck[3][25];
+       long   Accuracy[3][25];
+       double LoopTime[3][25];
+       double LoopSpeed[3][25];
+       double LoopFlos[3][25];
+       long   Xflops[25];
+       long   Xloops[3][25];
+       long   Nspan[3][25];       
+       double TimeStart;
+       double TimeEnd;
+       double Loopohead;
+       long   Count;
+       long   Count2;
+       long   Pass;
+       long   Extra_loops[3][25];
+       long   K2;
+       long   K3;
+       long   M16;
+       long   J5;
+       long   Section;
+       long   N16;
+       double Mastersum;
+       long   M24;
+  
+       
+   }as2;
+   
+   #define n            as2.Inner_loops
+   #define loop         as2.Outer_loops
+   #define mult         as2.Loop_mult
+   #define nflops       as2.Flops_per_loop
+   #define Checksum     as2.Sumcheck
+   #define accuracy     as2.Accuracy
+   #define RunTime      as2.LoopTime
+   #define Mflops       as2.LoopSpeed
+   #define FPops        as2.LoopFlos
+   #define nspan        as2.Nspan
+   #define xflops       as2.Xflops
+   #define xloops       as2.Xloops
+   #define StartTime    as2.TimeStart
+   #define EndTime      as2.TimeEnd
+   #define overhead_l   as2.Loopohead
+   #define count        as2.Count
+   #define count2       as2.Count2
+   #define pass         as2.Pass
+   #define extra_loops  as2.Extra_loops
+   #define k2           as2.K2
+   #define k3           as2.K3
+   #define m16          as2.M16
+   #define j5           as2.J5
+   #define section      as2.Section
+   #define n16          as2.N16
+   #define MasterSum    as2.Mastersum
+   #define m24          as2.M24
+
+
+   void init(long which);
+   
+        /* Initialises arrays and variables  */
+             
+   long endloop(long which);
+   
+        /* Controls outer loops and stores results */
+
+   long parameters(long which);
+   
+        /* Gets loop parameters and variables, starts timer */
+        
+   void kernels();
+   
+        /* The 24 kernels */
+        
+   void check(long which);
+   
+        /* Calculates checksum accuracy */
+             
+   void iqranf();
+   
+        /* Random number generator for Kernel 14 */
+                 
+main(int argc, char *argv[])
+{
+    double pass_time, least, lmult, now = 1.0, wt;
+    double time1, time2;
+    long   i, k, loop_passes;
+    long   mul[3] = {1, 2, 8};
+    double weight[3] = {1.0, 2.0, 1.0};
+    long   Endit, which;
+    double maximum[4];
+    double minimum[4];
+    double average[4];
+    double harmonic[4];
+    double geometric[4];
+    long   xspan[4];
+    char   general[9][80] = {" "};
+    FILE   *outfile;
+    int  getinput = 1;
+    
+    if (argc > 1)
+     {
+       switch (argv[1][0])
+        {
+            case 'N':
+               getinput = 0;
+               break;
+            case 'n':
+               getinput = 0;
+               break;
+        }
+     }
+    
+ 
+    printf ("L.L.N.L. 'C' KERNELS: MFLOPS   P.C.  VERSION 4.0\n\n");
+
+    if (getinput == 0)
+     {
+        printf ("***** No run time input data *****\n\n");
+     }
+    else
+     {
+        printf ("*** With run time input data ***\n\n");
+     }
+    
+/************************************************************************
+ *      Execute the kernels three times at different Do Spans           *
+ ************************************************************************/
+    
+    for ( section=0 ; section<3 ; section++ )
+    {
+        loop_passes = 200 * mul[section];
+        pass = -20;
+        mult = 2 * mul[section];
+    
+        for ( i=1; i<25; i++)
+        {
+            extra_loops[section][i] = 500;
+        }
+
+/************************************************************************
+ *                      Execute the kernels                             *
+ ************************************************************************/
+        
+        kernels();
+
+        maximum[section] = 0.0;
+        minimum[section] = Mflops[section][1];
+        average[section] = 0.0;
+        harmonic[section] = 0.0;
+        geometric[section] = 0.0;
+        xspan[section] = 0.0;
+    }
+
+/************************************************************************
+ *    End of executing the kernels three times at different Do Spans    *
+ ************************************************************************/
+}
+    
+/************************************************************************
+ *                          The Kernels                                 *
+ ************************************************************************/
+
+void kernels()
+ {
+
+   long   lw;
+   long   ipnt, ipntp, ii;
+   double temp;
+   long   nl1, nl2;
+   long   kx, ky;
+   double ar, br, cr;
+   long   i, j, k, m;
+   long   ip, i1, i2, j1, j2, j4, lb;
+   long   ng, nz;
+   double tmp;
+   double scale, xnei, xnc, e3,e6;
+   long   ink, jn, kn, kb5i;
+   double di, dn;
+   double qa;
+  
+   for ( k=0 ; k<25; k++)
+    {
+        Checksum[section][k] = 0.0;
+    }
+
+
+       
+    /*
+     *******************************************************************
+     *   Kernel 1 -- hydro fragment
+     *******************************************************************
+     */
+
+    parameters (1);
+    
+    do
+     {
+        for ( k=0 ; k<n ; k++ )
+        {
+            x[k] = q + y[k]*( r*z[k+10] + t*z[k+11] );
+        }
+      
+        endloop (1);
+     }
+     while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 2 -- ICCG excerpt (Incomplete Cholesky Conjugate Gradient)
+     *******************************************************************
+    */
+
+    parameters (2); 
+
+    do
+     {
+        ii = n;
+        ipntp = 0;
+        do
+         {
+            ipnt = ipntp;
+            ipntp += ii;
+            ii /= 2;
+            i = ipntp;
+            for ( k=ipnt+1 ; k<ipntp ; k=k+2 )
+             {
+                i++;
+                x[i] = x[k] - v[k]*x[k-1] - v[k+1]*x[k+1];
+             }
+         } while ( ii>0 );
+     
+         endloop (2);
+     }
+     while (count < loop);
+  
+    /*
+     *******************************************************************
+     *   Kernel 3 -- inner product
+     *******************************************************************
+      */
+
+    parameters (3);
+
+    do
+     {
+        q = 0.0;
+        for ( k=0 ; k<n ; k++ )
+        {
+            q += z[k]*x[k];
+        }     
+      
+        endloop (3);
+     }
+     while (count < loop);
+
+
+    /*
+     *******************************************************************
+     *   Kernel 4 -- banded linear equations
+     *******************************************************************
+      */
+
+    parameters (4);
+
+    m = ( 1001-7 )/2;
+    do
+     {
+        for ( k=6 ; k<1001 ; k=k+m )
+         {
+            lw = k - 6;
+            temp = x[k-1];
+
+            for ( j=4 ; j<n ; j=j+5 )
+             {
+                temp -= x[lw]*y[j];
+                lw++;
+             }
+            x[k-1] = y[4]*temp;
+         }
+         
+         endloop (4);
+     }
+     while (count < loop);
+ 
+    /*
+     *******************************************************************
+     *   Kernel 5 -- tri-diagonal elimination, below diagonal
+     *******************************************************************
+     */
+
+    parameters (5);
+
+    do
+     {
+        for ( i=1 ; i<n ; i++ )
+         {
+            x[i] = z[i]*( y[i] - x[i-1] );
+         }
+
+         endloop (5);
+     }
+     while (count < loop);
+ 
+    /*
+     *******************************************************************
+     *   Kernel 6 -- general linear recurrence equations
+     *******************************************************************
+     */
+          
+    parameters (6); 
+
+
+    do
+     {
+        for ( i=1 ; i<n ; i++ )
+         {
+            w[i] = 0.01;
+            for ( k=0 ; k<i ; k++ )
+             {
+                w[i] += b[k][i] * w[(i-k)-1];
+             }
+         }
+
+        endloop (6);
+     }
+     while (count < loop);
+ 
+    /*
+     *******************************************************************
+     *   Kernel 7 -- equation of state fragment
+     *******************************************************************
+     */
+     
+    parameters (7);
+    
+    do
+     {
+
+        for ( k=0 ; k<n ; k++ )
+         {
+            x[k] = u[k] + r*( z[k] + r*y[k] ) +
+                   t*( u[k+3] + r*( u[k+2] + r*u[k+1] ) +
+                      t*( u[k+6] + q*( u[k+5] + q*u[k+4] ) ) );
+         }
+
+         endloop (7);
+     }
+     while (count < loop);
+ 
+    /*
+     *******************************************************************
+     *   Kernel 8 -- ADI integration
+     *******************************************************************
+     */
+
+    nl1 = 0;
+    nl2 = 1;
+    
+    parameters (8);
+
+    do
+    {
+        for ( kx=1 ; kx<3 ; kx++ )
+        {
+
+           for ( ky=1 ; ky<n ; ky++ )
+           {
+              du1[ky] = u1[nl1][ky+1][kx] - u1[nl1][ky-1][kx];
+              du2[ky] = u2[nl1][ky+1][kx] - u2[nl1][ky-1][kx];
+              du3[ky] = u3[nl1][ky+1][kx] - u3[nl1][ky-1][kx];
+              u1[nl2][ky][kx]=
+                 u1[nl1][ky][kx]+a11*du1[ky]+a12*du2[ky]+a13*du3[ky] + sig*
+                  (u1[nl1][ky][kx+1]-2.0*u1[nl1][ky][kx]+u1[nl1][ky][kx-1]);
+              u2[nl2][ky][kx]=
+                 u2[nl1][ky][kx]+a21*du1[ky]+a22*du2[ky]+a23*du3[ky] + sig*
+                  (u2[nl1][ky][kx+1]-2.0*u2[nl1][ky][kx]+u2[nl1][ky][kx-1]);
+              u3[nl2][ky][kx]=
+                 u3[nl1][ky][kx]+a31*du1[ky]+a32*du2[ky]+a33*du3[ky] + sig*
+                  (u3[nl1][ky][kx+1]-2.0*u3[nl1][ky][kx]+u3[nl1][ky][kx-1]);
+           }
+        }
+    
+        endloop (8);
+   }
+    while (count < loop);
+ 
+    /*
+     *******************************************************************
+     *   Kernel 9 -- integrate predictors
+     *******************************************************************
+     */
+
+    parameters (9);
+    
+    do
+    {
+        for ( i=0 ; i<n ; i++ )
+        {
+            px[i][0] = dm28*px[i][12] + dm27*px[i][11] + dm26*px[i][10] +
+                       dm25*px[i][ 9] + dm24*px[i][ 8] + dm23*px[i][ 7] +
+                       dm22*px[i][ 6] + c0*( px[i][ 4] + px[i][ 5])
+                                                       + px[i][ 2];
+        }
+
+        endloop (9);
+   }
+    while (count < loop);
+    
+    /*
+     *******************************************************************
+     *   Kernel 10 -- difference predictors
+     *******************************************************************
+     */
+     
+    parameters (10); 
+    
+    do
+    {
+        for ( i=0 ; i<n ; i++ )
+        {
+            ar        =      cx[i][ 4];
+            br        = ar - px[i][ 4];
+            px[i][ 4] = ar;
+            cr        = br - px[i][ 5];
+            px[i][ 5] = br;
+            ar        = cr - px[i][ 6];
+            px[i][ 6] = cr;
+            br        = ar - px[i][ 7];
+            px[i][ 7] = ar;
+            cr        = br - px[i][ 8];
+            px[i][ 8] = br;
+            ar        = cr - px[i][ 9];
+            px[i][ 9] = cr;
+            br        = ar - px[i][10];
+            px[i][10] = ar;
+            cr        = br - px[i][11];
+            px[i][11] = br;
+            px[i][13] = cr - px[i][12];
+            px[i][12] = cr;
+        }
+
+        endloop (10);
+   }
+   while (count < loop);
+     
+    /*
+     *******************************************************************
+     *   Kernel 11 -- first sum
+     *******************************************************************
+     */
+     
+    parameters (11); 
+
+    do
+    {
+        x[0] = y[0];
+        for ( k=1 ; k<n ; k++ )
+        {
+            x[k] = x[k-1] + y[k];
+        }
+   
+        endloop (11);
+   }
+   while (count < loop);
+ 
+    /*
+     *******************************************************************
+     *   Kernel 12 -- first difference
+     *******************************************************************
+     */
+     
+    parameters (12);
+
+    do
+    {
+        for ( k=0 ; k<n ; k++ )
+        {
+            x[k] = y[k+1] - y[k];
+        }
+
+        endloop (12);
+   }
+   while (count < loop);
+ 
+
+    /*
+     *******************************************************************
+     *   Kernel 13 -- 2-D PIC (Particle In Cell)
+     *******************************************************************
+     */
+
+   parameters (13); 
+  
+   do
+    {
+        for ( ip=0; ip<n; ip++)
+        {
+            i1 = p[ip][0];
+            j1 = p[ip][1];
+            i1 &= 64-1;
+            j1 &= 64-1;
+            p[ip][2] += b[j1][i1];
+            p[ip][3] += c[j1][i1];
+            p[ip][0] += p[ip][2];
+            p[ip][1] += p[ip][3];
+            i2 = p[ip][0];
+            j2 = p[ip][1];
+            i2 = ( i2 & 64-1 ) - 1 ;
+            j2 = ( j2 & 64-1 ) - 1 ;
+            p[ip][0] += y[i2+32];
+            p[ip][1] += z[j2+32];
+            i2 += e[i2+32];
+            j2 += f[j2+32];
+            h[j2][i2] += 1.0;
+        }
+        endloop (13);
+   }
+   while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 14 -- 1-D PIC (Particle In Cell)
+     *******************************************************************
+     */
+
+    parameters (14);
+
+    do
+    {
+        for ( k=0 ; k<n ; k++ )
+        {
+            vx[k] = 0.0;
+            xx[k] = 0.0;
+            ix[k] = (long) grd[k];
+            xi[k] = (double) ix[k];
+            ex1[k] = ex[ ix[k] - 1 ];
+            dex1[k] = dex[ ix[k] - 1 ];
+        }
+        for ( k=0 ; k<n ; k++ )
+        {
+            vx[k] = vx[k] + ex1[k] + ( xx[k] - xi[k] )*dex1[k];
+            xx[k] = xx[k] + vx[k]  + flx;
+            ir[k] = xx[k];
+            rx[k] = xx[k] - ir[k];
+            ir[k] = ( ir[k] & 2048-1 ) + 1;
+            xx[k] = rx[k] + ir[k];
+        }
+        for ( k=0 ; k<n ; k++ )
+        {
+            rh[ ir[k]-1 ] += 1.0 - rx[k];
+            rh[ ir[k]   ] += rx[k];
+        }
+        endloop (14);
+   }
+   while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 15 -- Casual Fortran.  Development version
+     *******************************************************************
+    */
+    
+    parameters (15);
+
+    do
+    {
+        ng = 7;
+        nz = n;
+        ar = 0.053;
+        br = 0.073;
+        for ( j=1 ; j<ng ; j++ )
+        {
+            for ( k=1 ; k<nz ; k++ )
+            {
+                if ( (j+1) >= ng )
+                {
+                    vy[j][k] = 0.0;
+                    continue;
+                }
+                if ( vh[j+1][k] > vh[j][k] )
+                {
+                    t = ar;
+                }
+                else
+                {
+                    t = br;
+                }
+                if ( vf[j][k] < vf[j][k-1] )
+                {
+                    if ( vh[j][k-1] > vh[j+1][k-1] )
+                        r = vh[j][k-1];
+                    else
+                        r = vh[j+1][k-1];
+                    s = vf[j][k-1];
+                }
+                else
+                {
+                    if ( vh[j][k] > vh[j+1][k] )
+                        r = vh[j][k];
+                    else
+                        r = vh[j+1][k];
+                    s = vf[j][k];
+                }
+                vy[j][k] = sqrt( vg[j][k]*vg[j][k] + r*r )* t/s;
+                if ( (k+1) >= nz )
+                {
+                    vs[j][k] = 0.0;
+                    continue;
+                }
+                if ( vf[j][k] < vf[j-1][k] )
+                {
+                    if ( vg[j-1][k] > vg[j-1][k+1] )
+                        r = vg[j-1][k];
+                    else
+                        r = vg[j-1][k+1];
+                    s = vf[j-1][k];
+                    t = br;
+                }
+                else
+                {
+                    if ( vg[j][k] > vg[j][k+1] )
+                        r = vg[j][k];
+                    else
+                        r = vg[j][k+1];
+                    s = vf[j][k];
+                    t = ar;
+                }
+                vs[j][k] = sqrt( vh[j][k]*vh[j][k] + r*r )* t / s;
+            }
+        }
+        endloop (15);
+   }
+   while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 16 -- Monte Carlo search loop
+     *******************************************************************
+     */
+
+    parameters (16);
+    
+
+    ii = n / 3;
+    lb = ii + ii;
+    k3 = k2 = 0;
+    do
+    {
+        i1 = m16 = 1;
+        label410:
+        j2 = ( n + n )*( m16 - 1 ) + 1;
+        for ( k=1 ; k<=n ; k++ )
+        {
+            k2++;
+            j4 = j2 + k + k;
+            j5 = zone[j4-1];
+            if ( j5 < n )
+            {
+                if ( j5+lb < n )
+                {                             /* 420 */
+                    tmp = plan[j5-1] - t;       /* 435 */
+                }
+                else
+                {
+                    if ( j5+ii < n )
+                    {                           /* 415 */
+                        tmp = plan[j5-1] - s;   /* 430 */
+                    }
+                    else
+                    {
+                        tmp = plan[j5-1] - r;   /* 425 */
+                    }
+                }
+            }
+            else if( j5 == n )
+            {
+                break;                          /* 475 */
+            }
+            else
+            {
+                k3++;                           /* 450 */
+                tmp=(d[j5-1]-(d[j5-2]*(t-d[j5-3])*(t-d[j5-3])+(s-d[j5-4])*
+                              (s-d[j5-4])+(r-d[j5-5])*(r-d[j5-5])));
+            }
+            if ( tmp < 0.0 )
+            {
+                if ( zone[j4-2] < 0 )            /* 445 */
+                    continue;                   /* 470 */
+                else if ( !zone[j4-2] )
+                    break;                      /* 480 */
+            }
+            else if ( tmp )
+            {
+                if ( zone[j4-2] > 0 )           /* 440 */
+                    continue;                   /* 470 */
+                else if ( !zone[j4-2] )
+                    break;                      /* 480 */
+            }
+            else break;                       /* 485 */
+            m16++;                              /* 455 */
+            if ( m16 > zone[0] )
+                m16 = 1;                          /* 460 */
+            if ( i1-m16 )                         /* 465 */
+                goto label410;
+            else
+                break;
+        }
+        endloop (16);
+   }
+   while (count < loop);
+   
+    /*
+     *******************************************************************
+     *   Kernel 17 -- implicit, conditional computation
+     *******************************************************************
+     */
+
+    parameters (17);  
+
+    do
+    {
+        i = n-1;
+        j = 0;
+        ink = -1;
+        scale = 5.0 / 3.0;
+        xnm = 1.0 / 3.0;
+        e6 = 1.03 / 3.07;
+        goto l61;
+l60:    e6 = xnm*vsp[i] + vstp[i];
+        vxne[i] = e6;
+        xnm = e6;
+        ve3[i] = e6;
+        i += ink;
+        if ( i==j ) goto l62;
+l61:    e3 = xnm*vlr[i] + vlin[i];
+        xnei = vxne[i];
+        vxnd[i] = e6;
+        xnc = scale*e3;
+        if ( xnm > xnc ) goto l60;
+        if ( xnei > xnc ) goto l60;
+        ve3[i] = e3;
+        e6 = e3 + e3 - xnm;
+        vxne[i] = e3 + e3 - xnei;
+        xnm = e6;
+        i += ink;
+        if ( i != j ) goto l61;
+l62:;        
+        endloop (17);
+    }
+    while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 18 - 2-D explicit hydrodynamics fragment
+     *******************************************************************
+     */
+
+    parameters (18);  
+
+    do
+    {
+       t = 0.0037;
+       s = 0.0041;
+       kn = 6;
+       jn = n;
+       for ( k=1 ; k<kn ; k++ )
+       {
+
+         for ( j=1 ; j<jn ; j++ )
+         {
+           za[k][j] = ( zp[k+1][j-1] +zq[k+1][j-1] -zp[k][j-1] -zq[k][j-1] )*
+                      ( zr[k][j] +zr[k][j-1] ) / ( zm[k][j-1] +zm[k+1][j-1]);
+           zb[k][j] = ( zp[k][j-1] +zq[k][j-1] -zp[k][j] -zq[k][j] ) *
+                      ( zr[k][j] +zr[k-1][j] ) / ( zm[k][j] +zm[k][j-1]);
+         }
+       }
+        for ( k=1 ; k<kn ; k++ )
+        {
+
+            for ( j=1 ; j<jn ; j++ )
+            {
+                zu[k][j] += s*( za[k][j]   *( zz[k][j] - zz[k][j+1] ) -
+                                za[k][j-1] *( zz[k][j] - zz[k][j-1] ) -
+                                zb[k][j]   *( zz[k][j] - zz[k-1][j] ) +
+                                zb[k+1][j] *( zz[k][j] - zz[k+1][j] ) );
+                zv[k][j] += s*( za[k][j]   *( zr[k][j] - zr[k][j+1] ) -
+                                za[k][j-1] *( zr[k][j] - zr[k][j-1] ) -
+                                zb[k][j]   *( zr[k][j] - zr[k-1][j] ) +
+                                zb[k+1][j] *( zr[k][j] - zr[k+1][j] ) );
+            }
+        }
+        for ( k=1 ; k<kn ; k++ )
+        {
+
+            for ( j=1 ; j<jn ; j++ )
+            {
+                zr[k][j] = zr[k][j] + t*zu[k][j];
+                zz[k][j] = zz[k][j] + t*zv[k][j];
+            }
+        }
+        endloop (18);
+    }
+    while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 19 -- general linear recurrence equations
+     *******************************************************************
+     */
+
+    parameters (19);
+    
+    kb5i = 0;
+    
+    do
+    {
+        for ( k=0 ; k<n ; k++ )
+        {
+            b5[k+kb5i] = sa[k] + stb5*sb[k];
+            stb5 = b5[k+kb5i] - stb5;
+        }
+        for ( i=1 ; i<=n ; i++ )
+        {
+            k = n - i;
+            b5[k+kb5i] = sa[k] + stb5*sb[k];
+            stb5 = b5[k+kb5i] - stb5;
+        }
+        endloop (19);
+    }
+    while (count < loop);
+
+    /*
+     *******************************************************************
+     * Kernel 20 - Discrete ordinates transport, conditional recurrence on xx
+     *******************************************************************
+    */
+
+    parameters (20);
+    
+    do
+    {
+        for ( k=0 ; k<n ; k++ )
+        {
+           di = y[k] - g[k] / ( xx[k] + dk );
+           dn = 0.2;
+           if ( di )
+           {
+               dn = z[k]/di ;
+               if ( t < dn ) dn = t;
+               if ( s > dn ) dn = s;
+           }
+           x[k] = ( ( w[k] + v[k]*dn )* xx[k] + u[k] ) / ( vx[k] + v[k]*dn );
+           xx[k+1] = ( x[k] - xx[k] )* dn + xx[k];
+        }
+        endloop (20);
+    }
+    while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 21 -- matrix*matrix product
+     *******************************************************************
+     */
+
+    parameters (21);
+
+    do
+    {
+        for ( k=0 ; k<25 ; k++ )
+        {
+            for ( i=0 ; i<25 ; i++ )
+            {
+                for ( j=0 ; j<n ; j++ )
+                {
+                    px[j][i] += vy[k][i] * cx[j][k];
+                }
+            }
+        }
+        endloop (21);
+    }
+    while (count < loop);
+    
+    /*
+     *******************************************************************
+     *   Kernel 22 -- Planckian distribution
+     *******************************************************************
+     */
+
+    parameters (22);
+
+    expmax = 20.0;
+    u[n-1] = 0.99*expmax*v[n-1];
+    do
+    {
+        for ( k=0 ; k<n ; k++ )
+        {
+            y[k] = u[k] / v[k];
+            w[k] = x[k] / ( exp( y[k] ) -1.0 );
+        }
+        endloop (22);
+    }
+    while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 23 -- 2-D implicit hydrodynamics fragment
+     *******************************************************************
+     */
+     
+    parameters (23);
+
+    do
+    {
+        for ( j=1 ; j<6 ; j++ )
+        {
+            for ( k=1 ; k<n ; k++ )
+            {
+                qa = za[j+1][k]*zr[j][k] + za[j-1][k]*zb[j][k] +
+                     za[j][k+1]*zu[j][k] + za[j][k-1]*zv[j][k] + zz[j][k];
+                za[j][k] += 0.175*( qa - za[j][k] );
+            }
+        }
+        endloop (23);
+    }
+    while (count < loop);
+
+    /*
+     *******************************************************************
+     *   Kernel 24 -- find location of first minimum in array
+     *******************************************************************
+     */
+
+    parameters (24);
+     
+    x[n/2] = -1.0e+10;
+    do
+    {
+        m24 = 0;
+        for ( k=1 ; k<n ; k++ )
+        {
+            if ( x[k] < x[m24] ) m24 = k;
+        }
+        endloop (24);
+    }
+    while (count < loop);
+   
+   return;
+ }
+
+/************************************************************************
+ *        endloop procedure - calculate checksums and MFLOPS            *
+ ************************************************************************/
+ 
+long endloop(long which)
+{
+  double now = 1.0, useflops;
+  long   i, j, k, m;
+  double Scale = 1000000.0;
+    
+  count = count + 1;
+  if (count >= loop)   /* else return */
+  {
+
+/************************************************************************
+ *               End of standard set of loops for one kernel            *
+ ************************************************************************/
+      
+     count2 = count2 + 1;
+     if (count2 == extra_loops[section][which])
+                          /* else re-initialise parameters if required */
+     {
+
+/************************************************************************
+ *           End of extra loops for 5 seconds execution time            *
+ ************************************************************************/
+                        
+       count2 = 0;
+       if (which == 1)
+       {     
+           for ( k=0 ; k<n ; k++ )
+           {
+                Checksum[section][1] = Checksum[section][1] + x[k]
+                                          * (double)(k+1);
+           }
+           useflops = nflops * (double)(n * loop);
+       }
+       if (which == 2)
+       {
+          for ( k=0 ; k<n*2 ; k++ )
+          {
+               Checksum[section][2] = Checksum[section][2] + x[k]
+                                         * (double)(k+1);
+          }
+          useflops = nflops * (double)((n-4) * loop);
+       }
+       if (which == 3)
+       {
+           Checksum[section][3] = q;
+           useflops = nflops * (double)(n * loop);
+       }
+       if (which == 4)
+       {
+          for ( k=0 ; k<3 ; k++ )
+          {
+                Checksum[section][4] = Checksum[section][4] + v[k]
+                                          * (double)(k+1);
+          }
+          useflops = nflops * (double) ((((n-5)/5)+1) * 3 * loop); 
+       }
+       if (which == 5)
+       {
+          for ( k=1 ; k<n ; k++ )
+          {
+              Checksum[section][5] = Checksum[section][5] + x[k]
+                                        * (double)(k);
+          }
+          useflops = nflops * (double)((n-1) * loop);
+       }
+       if (which == 6)
+       {
+          for ( k=0 ; k<n ; k++ )
+          {
+         
+             Checksum[section][6] = Checksum[section][6] + w[k]
+                                       * (double)(k+1);
+         
+          }
+          useflops = nflops * (double)(n * ((n - 1) / 2) * loop);
+       } 
+       if (which == 7)
+       {      
+          for ( k=0 ; k<n ; k++ )
+          {
+              Checksum[section][7] = Checksum[section][7] + x[k]
+                                        * (double)(k+1);
+          }
+          useflops = nflops * (double)(n * loop);
+       }
+       if (which == 8)
+       {
+          for ( i=0 ; i<2 ; i++ )
+          {        
+              for ( j=0 ; j<101 ; j++ )
+              {
+                  for ( k=0 ; k<5 ; k++ )
+                  {
+                      m = 101 * 5 * i + 5 * j + k + 1;
+                      if (m < 10 * n + 1)
+                      {
+                          Checksum[section][8] = Checksum[section][8]
+                                  + u1[i][j][k] * m
+                                  + u2[i][j][k] * m + u3[i][j][k] * m;
+                      }
+                  }
+              }
+          }
+          useflops = nflops * (double)(2 * (n - 1) * loop);
+       }
+       if (which == 9)
+       {
+           for ( i=0 ; i<n  ; i++ )
+           {
+               for ( j=0 ; j<25 ; j++ )
+               {
+                   m = 25 * i + j + 1;
+                   if (m < 15 * n + 1)
+                   {
+                       Checksum[section][9] = Checksum[section][9]
+                                             + px[i][j] * (double)(m);
+                   }
+               }
+           }
+           useflops = nflops * (double)(n * loop);
+       }
+       if (which == 10)
+       {
+           for ( i=0 ; i<n ; i++ )
+           {
+               for (j=0 ; j<25 ; j++ )
+              {
+                   m = 25 * i + j + 1;
+                   if (m < 15 * n + 1)
+                   {
+                       Checksum[section][10] = Checksum[section][10]
+                                              + px[i][j] * (double)(m);
+                   }                  
+              }
+           }
+           useflops = nflops * (double)(n * loop);
+       }
+       if (which == 11)
+       { 
+           for ( k=1 ; k<n ; k++ )
+           {
+                Checksum[section][11] = Checksum[section][11]
+                                           + x[k] * (double)(k);
+           }
+           useflops = nflops * (double)((n - 1) * loop);
+       }
+       if (which == 12)
+       { 
+           for ( k=0 ; k<n-1 ; k++ )
+           {
+                Checksum[section][12] = Checksum[section][12] + x[k]
+                                           * (double)(k+1);
+           }
+           useflops = nflops * (double)(n * loop);
+       }
+       if (which == 13)
+       {
+          for ( k=0 ; k<2*n ; k++ )                  
+          {
+             for ( j=0 ; j<4 ; j++ )    
+              {
+                  m = 4 * k + j + 1;
+                  Checksum[section][13] = Checksum[section][13]
+                                             + p[k][j]* (double)(m);
+              }
+          }
+          for ( i=0 ; i<8*n/64 ; i++ )
+          {
+              for ( j=0 ; j<64 ; j++ )
+              {
+                  m = 64 * i + j + 1;
+                  if (m < 8 * n + 1)
+                  {
+                      Checksum[section][13] = Checksum[section][13]
+                                                  + h[i][j] * (double)(m);
+                  }
+              }
+         }
+         useflops = nflops * (double)(n * loop);  
+       }
+       if (which == 14)
+       {
+          for ( k=0 ; k<n ; k++ )
+          {
+                Checksum[section][14] = Checksum[section][14]
+                                           + (xx[k] + vx[k]) * (double)(k+1);
+          }
+          for ( k=0 ; k<67 ; k++ )
+          {
+              Checksum[section][14] = Checksum[section][14] + rh[k]
+                                         * (double)(k+1);
+          }
+          useflops = nflops * (double)(n * loop);
+       }
+       if (which == 15)
+       {
+           for ( j=0 ; j<7 ; j++ )
+           {
+               for ( k=0 ; k<101 ; k++ )
+               {
+                  m = 101 * j + k + 1;
+                  if (m < n * 7 + 1)
+                  {
+                      Checksum[section][15] = Checksum[section][15]
+                                       + (vs[j][k] + vy[j][k]) * (double)(m);
+                  }
+               }
+           }
+           useflops = nflops * (double)((n - 1) * 5 * loop);
+       }
+       if (which == 16)
+       {
+           Checksum[section][16] =  (double)(k3 + k2 + j5 + m16);
+           useflops = (k2 + k2 + 10 * k3);
+       }
+       if (which == 17)
+       {
+           Checksum[section][17] = xnm;
+           for ( k=0 ; k<n ; k++ )
+           {
+               Checksum[section][17] = Checksum[section][17]
+                                       + (vxne[k] + vxnd[k]) * (double)(k+1);
+           }
+           useflops = nflops * (double)(n * loop); 
+       }
+       if (which == 18)
+       {
+          for ( k=0 ; k<7 ; k++ )    
+           {
+               for ( j=0 ; j<101 ; j++ )
+               {
+                   m = 101 * k + j + 1;
+                   if (m < 7 * n + 1)
+                   {
+                       Checksum[section][18] = Checksum[section][18]
+                                        + (zz[k][j] + zr[k][j]) * (double)(m);
+                   }
+               }
+           }
+           useflops = nflops * (double)((n - 1) * 5 * loop);
+       }
+       if (which == 19)
+       {
+          Checksum[section][19] = stb5;
+          for ( k=0 ; k<n ; k++ )
+          {
+              Checksum[section][19] = Checksum[section][19] + b5[k]
+                                         * (double)(k+1);
+          }             
+          useflops = nflops * (double)(n * loop);
+       } 
+       if (which == 20)
+       {
+            for ( k=1 ; k<n+1 ; k++ )
+            {
+                Checksum[section][20] = Checksum[section][20] + xx[k]
+                                           * (double)(k);
+            }
+            useflops = nflops * (double)(n * loop);
+       }
+       if (which == 21)
+       {
+           for ( k=0 ; k<n ; k++ )          
+           {
+               for ( i=0 ; i<25 ; i++ )
+               {
+                  m = 25 * k + i + 1;
+                  Checksum[section][21] = Checksum[section][21]
+                                             + px[k][i] * (double)(m);
+               }
+           }
+           useflops = nflops * (double)(n * 625 * loop);      
+
+       }
+       if (which == 22)
+       {
+           for ( k=0 ; k<n ; k++ )
+           {
+                Checksum[section][22] = Checksum[section][22] + w[k]
+                                           * (double)(k+1);
+           }
+           useflops = nflops * (double)(n * loop);      
+       }
+       if (which == 23)
+       {
+           for ( j=0 ; j<7 ; j++ )
+           {        
+                for ( k=0 ; k<101 ; k++ )
+                {
+                    m = 101 * j + k + 1;
+                    if (m < 7 * n + 1)
+                    {
+                         Checksum[section][23] = Checksum[section][23]
+                                                + za[j][k] * (double)(m);
+                    }
+                }
+           }
+           useflops = nflops * (double)((n-1) * 5 * loop);       
+       }
+       if (which == 24)
+       {
+           Checksum[section][24] =  (double)(m24);
+           useflops = nflops * (double)((n - 1) * loop); 
+       }
+
+
+/************************************************************************
+ *     Deduct overheads from time, calculate MFLOPS, display results    *
+ ************************************************************************/
+
+       RunTime[section][which] = RunTime[section][which]
+                       - (loop * extra_loops[section][which]) * overhead_l;
+       FPops[section][which] =  useflops * extra_loops[section][which];   
+       Mflops[section][which] = FPops[section][which] / Scale
+                                            / RunTime[section][which];
+       
+
+/************************************************************************
+ *      Compare sumcheck with standard result, calculate accuracy       *
+ ************************************************************************/
+           
+      printf("%10.3f\n", Checksum[section][which]);
+
+     }
+     else
+     {
+/************************************************************************
+ *                     Re-initialise data if reqired                    *
+ ************************************************************************/
+
+       count = 0;  
+       if (which == 2)
+       {
+          for ( k=0 ; k<n ; k++ )
+          {
+              x[k] = x0[k];
+          }
+       }
+       if (which == 4)
+       {
+          m = (1001-7)/2;
+          for ( k=6 ; k<1001 ; k=k+m )
+          {
+              x[k] = x0[k];
+          }
+       }
+       if (which == 5)
+       {
+          for ( k=0 ; k<n ; k++ )
+          {
+              x[k] = x0[k];
+          }
+       }
+       if (which == 6)
+       {
+          for ( k=0 ; k<n ; k++ )
+          {
+              w[k] = w0[k];
+          }
+       }
+       if (which == 10)
+       {
+           for ( i=0 ; i<n ; i++ )
+           {
+               for (j=4 ; j<13 ; j++ )
+              {
+                  px[i][j] = px0[i][j];
+              }
+           }
+       }
+       if (which == 13)
+       {           
+           for ( i=0 ; i<n ; i++ )
+           {
+               for (j=0 ; j<4 ; j++ )
+               {
+                   p[i][j] = p0[i][j];
+               }
+           }
+           for ( i=0 ; i<64 ; i++ )
+           {
+               for (j=0 ; j<64 ; j++ )
+               {
+                   h[i][j] = h0[i][j];
+               }
+           }
+       }
+       if (which == 14)
+       {
+           for ( i=0; i<n ; i++ )
+           {
+               rh[ir[i] - 1] = rh0[ir[i] - 1];
+               rh[ir[i] ] = rh0[ir[i] ];
+           }
+       }
+       if (which == 17)
+       {
+           for ( i=0; i<n ; i++ )
+           {
+               vxne[i] = vxne0[i];
+           }
+       }
+       if (which == 18)
+       {
+          for ( i=1 ; i<6 ; i++ )
+          {
+              for (j=1 ; j<n ; j++ )
+              {
+                  zr[i][j] = zr0[i][j];
+                  zu[i][j] = zu0[i][j];
+                  zv[i][j] = zv0[i][j];
+                  zz[i][j] = zz0[i][j];  
+              }
+          }
+       }
+       if (which == 21)
+       {
+           for ( i=0 ; i<n ; i++ )
+           {
+               for (j=0 ; j<25 ; j++ )
+              {
+                  px[i][j] = px0[i][j];
+              }
+           }
+       }
+       if (which == 23)
+       {
+          for ( i=1 ; i<6 ; i++ )
+          {
+              for (j=1 ; j<n ; j++ )
+              {
+                  za[i][j] = za0[i][j];
+              }
+          }
+       }
+       k3 = k2 = 0;
+       stb5 = stb50;
+       xx[0] = xx0;
+          
+     }
+  }
+  return 0;
+}
+
+/************************************************************************
+ *          init procedure - initialises data for all loops             *
+ ************************************************************************/
+
+ void init(long which)
+ {
+    long   i, j, k, l, m, nn;
+    double ds, dw, rr, ss;
+    double fuzz, fizz, buzz, scaled, one;  
+
+     scaled =  (double)(10.0);
+     scaled =  (double)(1.0) / scaled;
+     fuzz =    (double)(0.0012345);
+     buzz =    (double)(1.0) + fuzz;
+     fizz =    (double)(1.1) * fuzz;
+     one =     (double)(1.0);
+     
+     for ( k=0 ; k<19977 + 34132 ; k++)
+     {
+         if (k == 19977)
+         {
+                fuzz = (double)(0.0012345);
+                buzz = (double) (1.0) + fuzz;
+                fizz = (double) (1.1) * fuzz;
+         }         
+         buzz = (one - fuzz) * buzz + fuzz;
+         fuzz = - fuzz;
+         u[k] = (buzz - fizz) * scaled;
+     }
+     
+     fuzz = (double)(0.0012345);
+     buzz = (double) (1.0) + fuzz;
+     fizz = (double) (1.1) * fuzz;
+     
+     for ( k=1 ; k<40 ; k++)
+     {
+         buzz = (one - fuzz) * buzz + fuzz;
+         fuzz = - fuzz;
+         xtra[k] = (buzz - fizz) * scaled;
+     }
+
+    ds = 1.0;
+    dw = 0.5;
+    for ( l=0 ; l<4 ; l++ )   
+    {
+         for ( i=0 ; i<512 ; i++ )
+        {
+            p[i][l] = ds;
+            ds = ds + dw;
+        }
+    }
+     for ( i=0 ; i<96 ; i++ )
+     {
+         e[i] = 1;
+         f[i] = 1;
+     }    
+
+     
+         iqranf();
+         dw = -100.0;
+         for ( i=0; i<1001 ; i++ )
+         {
+             dex[i] = dw * dex[i];
+             grd[i] = ix[i];
+         }     
+         flx = 0.001;
+
+                  
+         d[0]= 1.01980486428764;
+         nn = n16;
+    
+         for ( l=1 ; l<300 ; l++ )
+         {
+              d[l] = d[l-1] + 1.000e-4 / d[l-1];
+         }
+         rr = d[nn-1];
+         for ( l=1 ; l<=2 ; l++ )
+         {
+             m = (nn+nn)*(l-1);
+             for ( j=1 ; j<=2 ; j++ )
+             {
+                 for ( k=1 ; k<=nn ; k++ )
+                 {
+                     m = m + 1;
+                     ss = (double)(k);
+                     plan[m-1] = rr * ((ss + 1.0) / ss);
+                     zone[m-1] = k + k;
+                 }
+            }
+        }
+        k = nn + nn + 1;
+        zone[k-1] = nn;
+        
+        if (which == 16)
+        {
+             r = d[n-1];
+             s = d[n-2];
+             t = d[n-3];
+             k3 = k2 = 0;
+        }
+        expmax = 20.0;
+        if (which == 22)
+        {
+             u[n-1] = 0.99*expmax*v[n-1];
+        }
+        if (which == 24)
+        {
+             x[n/2] = -1.0e+10;
+        }
+
+/************************************************************************
+ *                 Make copies of data for extra loops                  *
+ ************************************************************************/
+ 
+        for ( i=0; i<1001 ; i++ )
+        {
+            x0[i] = x[i];
+            w0[i] = w[i];
+        }
+        for ( i=0 ; i<101 ; i++ )
+        {
+            for (j=0 ; j<25 ; j++ )
+            {
+                px0[i][j] = px[i][j];
+            }
+        }
+        for ( i=0 ; i<512 ; i++ )
+        {
+            for (j=0 ; j<4 ; j++ )
+            {
+                p0[i][j] = p[i][j];
+            }
+        }
+        for ( i=0 ; i<64 ; i++ )
+        {
+            for (j=0 ; j<64 ; j++ )
+            {
+                h0[i][j] = h[i][j];
+            }
+        }
+        for ( i=0; i<2048 ; i++ )
+        {
+            rh0[i] = rh[i];
+        }
+        for ( i=0; i<101 ; i++ )
+        {
+            vxne0[i] = vxne[i];
+        }
+        for ( i=0 ; i<7 ; i++ )
+        {
+            for (j=0 ; j<101 ; j++ )
+            {
+                zr0[i][j] = zr[i][j];
+                zu0[i][j] = zu[i][j];
+                zv0[i][j] = zv[i][j];
+                zz0[i][j] = zz[i][j];
+                za0[i][j] = za[i][j];
+            }
+        }
+        stb50 = stb5;
+        xx0 = xx[0];
+                 
+    return;
+ }
+
+/************************************************************************
+ *   parameters procedure for loop counts, Do spans, sumchecks, FLOPS   *
+ ************************************************************************/
+
+   long parameters(long which)
+   {
+
+       long nloops[3][25] =
+            { {0, 1001, 101, 1001, 1001, 1001, 64, 995, 100,
+                   101, 101, 1001, 1000, 64, 1001, 101, 75,
+                   101, 100, 101, 1000, 101, 101, 100, 1001  },
+              {0,  101, 101, 101, 101, 101,  32, 101, 100,
+                   101, 101, 101, 100,  32, 101, 101,  40,
+                   101, 100, 101, 100,  50, 101, 100, 101    },
+              {0,   27, 15, 27, 27, 27,  8, 21, 14,
+                    15, 15, 27, 26,  8, 27, 15, 15,
+                    15, 14, 15, 26, 20, 15, 14, 27           }  };
+
+
+                             
+       long lpass[3][25] =
+             { {0, 7, 67,  9, 14, 10,  3,  4, 10, 36, 34, 11, 12,
+                  36, 2,  1,  25, 35,  2, 39,  1,  1, 11,  8,  5  },
+               {0, 40, 40, 53, 70, 55,  7, 22,  6, 21, 19, 64, 68,
+                   41, 10,  1, 27, 20,  1, 23,  8,  1,  7,  5, 31   },
+               {0, 28, 46, 37, 38, 40, 21, 20,  9, 26, 25, 46, 48,
+                   31,  8,  1, 14, 26,  2, 28,  7,  1,  8,  7, 23 } };
+
+       double sums[3][25] = 
+        {
+         { 0.0,
+         5.114652693224671e+04, 1.539721811668385e+03, 1.000742883066363e+01,
+         5.999250595473891e-01, 4.548871642387267e+03, 4.375116344729986e+03,
+         6.104251075174761e+04, 1.501268005625798e+05, 1.189443609974981e+05,
+         7.310369784325296e+04, 3.342910972650109e+07, 2.907141294167248e-05,
+         1.202533961842803e+11, 3.165553044000334e+09, 3.943816690352042e+04,
+         5.650760000000000e+05, 1.114641772902486e+03, 1.015727037502300e+05,
+         5.421816960147207e+02, 3.040644339351239e+07, 1.597308280710199e+08,
+         2.938604376566697e+02, 3.549900501563623e+04, 5.000000000000000e+02
+                                                                           },
+
+         { 0.0,
+         5.253344778937972e+02, 1.539721811668385e+03, 1.009741436578952e+00,
+         5.999250595473891e-01, 4.589031939600982e+01, 8.631675645333210e+01,
+         6.345586315784055e+02, 1.501268005625798e+05, 1.189443609974981e+05,
+         7.310369784325296e+04, 3.433560407475758e+04, 7.127569130821465e-06,
+         9.816387810944345e+10, 3.039983465145393e+07, 3.943816690352042e+04,
+         6.480410000000000e+05, 1.114641772902486e+03, 1.015727037502300e+05,
+         5.421816960147207e+02, 3.126205178815431e+04, 7.824524877232093e+07,
+         2.938604376566697e+02, 3.549900501563623e+04, 5.000000000000000e+01
+                                                                          },
+     
+         { 0.0,
+         3.855104502494961e+01, 3.953296986903059e+01, 2.699309089320672e-01,
+         5.999250595473891e-01, 3.182615248447483e+00, 1.120309393467088e+00,
+         2.845720217644024e+01, 2.960543667875003e+03, 2.623968460874250e+03,
+         1.651291227698265e+03, 6.551161335845770e+02, 1.943435981130448e-06,
+         3.847124199949426e+10, 2.923540598672011e+06, 1.108997288134785e+03,
+         5.152160000000000e+05, 2.947368618589360e+01, 9.700646212337040e+02,
+         1.268230698051003e+01, 5.987713249475302e+02, 5.009945671204667e+07,
+         6.109968728263972e+00, 4.850340602749970e+02, 1.300000000000000e+01
+                                                                         } };
+                               
+ 
+     
+       double number_flops[25] = {0, 5., 4., 2., 2., 2., 2., 16., 36., 17.,
+                                      9., 1., 1., 7., 11., 33.,10., 9., 44.,
+                                      6., 26., 2., 17., 11., 1.};
+       double now = 1.0;
+      
+                           
+       n = nloops[section][which];
+       nspan[section][which] = n;
+       n16 = nloops[section][16];
+       nflops = number_flops[which];
+       xflops[which] = nflops;
+       loop = lpass[section][which];
+       xloops[section][which] = loop;
+       loop = loop * mult;
+       MasterSum = sums[section][which];
+       count = 0;
+
+       init(which);
+
+              
+       return 0;
+   }
+
+/************************************************************************
+ *          check procedure to check accuracy of calculations           *
+ ************************************************************************/
+   
+   void check(long which)
+   {
+        long maxs = 16;
+        double xm, ym, re, min1, max1;
+
+        xm = MasterSum;
+        ym = Checksum[section][which];
+      
+       if (xm * ym < 0.0)
+       {
+           accuracy[section][which] = 0;
+       }
+       else
+       {
+           if ( xm == ym)
+           {
+               accuracy[section][which] = maxs;
+           }
+           else
+           {
+               xm = fabs(xm);
+               ym = fabs(ym);
+               min1 = xm;
+               max1 = ym;
+               if (ym < xm)
+               {
+                   min1 = ym;
+                   max1 = xm;
+               }
+               re = 1.0 - min1 / max1;
+               accuracy[section][which] =
+                                        (long)( fabs(log10(fabs(re))) + 0.5);
+           }
+       }
+
+       return;
+   } 
+   
+/************************************************************************
+ *      iqranf procedure - random number generator for Kernel 14        *
+ ************************************************************************/
+  
+    void iqranf()
+      {
+        
+        long   inset, Mmin, Mmax, nn, i, kk;
+        double span, spin, realn, per, scale1, qq, dkk, dp, dq;
+        long   seed[3] = { 256, 12491249, 1499352848 };
+
+        nn = 1001;
+        Mmin = 1;
+        Mmax = 1001;
+        kk = seed[section];
+        
+        inset= Mmin;
+        span= Mmax - Mmin;
+        spin= 16807;
+        per= 2147483647;
+        realn= nn;
+        scale1= 1.00001;
+        qq= scale1 * (span / realn);
+        dkk= kk;
+        
+        for ( i=0 ; i<nn ; i++)
+        { 
+            dp= dkk*spin;
+            dkk= dp - (long)( dp/per)*per;
+            dq= dkk*span;
+            ix[i] = inset + ( dq/ per);
+            if (ix[i] < Mmin | ix[i] > Mmax)
+            {
+                ix[i] = inset + i + 1 * qq;
+            }
+        }
+        
+        return;         
+      }
+

Propchange: test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp

------------------------------------------------------------------------------
    svn:executable = *





More information about the llvm-commits mailing list