[llvm-commits] [test-suite] r60375 - /test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp
Owen Anderson
resistor at mac.com
Mon Dec 1 17:07:11 PST 2008
I have no particular love of it.
--Owen
On Dec 1, 2008, at 2:23 PM, Chris Lattner wrote:
> Author: lattner
> Date: Mon Dec 1 16:23:16 2008
> New Revision: 60375
>
> URL: http://llvm.org/viewvc/llvm-project?rev=60375&view=rev
> Log:
> Remove this benchmark, it is using undefined behavior (accessing
> at least the 'u' array beyond its bounds in 'init') and is too
> brain twisting to fix. Owen, if you really really really want this
> benchmark, feel free to fix it and recommit.
>
> Removed:
> test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp
>
> Removed: test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp
> URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Benchmarks/Misc-C%2B%2B/llloops.cpp?rev=60374&view=auto
>
> =
> =
> =
> =
> =
> =
> =
> =
> ======================================================================
> --- test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp
> (original)
> +++ test-suite/trunk/SingleSource/Benchmarks/Misc-C++/llloops.cpp
> (removed)
> @@ -1,2217 +0,0 @@
> -/
> ************************************************************************
> -
> * *
> - * L. L. N. L. " C " K E R N E L S: M F L O P S P C V E R S I
> O N *
> -
> * *
> - * These kernels measure " C " numerical
> computation *
> - * rates for a spectrum of cpu-limited
> computational *
> - * structures or benchmarks. Mathematical through-
> put *
> - * is measured in units of millions of floating-
> point *
> - * operations executed per second, called Megaflops/
> sec. *
> -
> * *
> -
> ************************************************************************
> - * Originally from Greg Astfalk, AT&T, P.O.Box 900,
> Princeton, *
> - * NJ. 08540. by way of Frank McMahon
> (LLNL). *
> -
> * *
> - * Modifications by Tim Peters, Kendall Square Res. Corp. Oct
> 92. *
> -
> * *
> - * This version by Roy Longbottom (retired, ex-CCTA
> UK) *
> - * Roy_Longbottom
> 101323.2241 at compuserve.com *
> - * March
> 1996 *
> -
> * *
> - *
> REFERENCE *
> -
> * *
> - * F.H.McMahon, The Livermore Fortran
> Kernels: *
> - * A Computer Test Of The Numerical Performance
> Range, *
> - * Lawrence Livermore National
> Laboratory, *
> - * Livermore, California, UCRL-53745, December
> 1986. *
> -
> * *
> - * from: National Technical Information
> Service *
> - * U.S. Department of
> Commerce *
> - * 5285 Port Royal
> Road *
> - * Springfield, VA.
> 22161 *
> -
> * *
> -
> ************************************************************************
> - * The standard "C" code accesses the FORTRAN version for
> data *
> - * generation and result analysis. These features have been
> merged *
> - * to produce a program more suitable to run on PCs. FORTRAN
> features *
> - * for detailed statistical analysis of the results have been
> omitted. *
> -
> * *
> - * Changes to "C" code to produce correct
> results: *
> -
> * *
> - * Kernel 2 change i = ipntp - 1; to i =
> ipntp; *
> - * Kernel 7 third line of inner loop change r to
> q *
> -
> ************************************************************************
> - *
> - * The kernels are executed as follows:
> - *
> - * parameters(x);
> - * do
> - * {
> - * execute kernel code
> - *
> - * endloop(x);
> - * }
> - * while (count < loop);
> - *
> - * Function parameters obtains the loop parameters, generates all
> the data
> - * and makes a copy of it for use with extra loops. Timing is
> started at
> - * the end of the function.
> - *
> - * The variable loop has a defined number of passes (e.g. 7 for
> kernel 1,
> - * long span - see Passes in table). This is multiplied by a further
> - * constant for which checksums are defined - 200/400/1600 for
> long/medium
> - * /short spans was chosen. The overhead of executing function
> endloop is
> - * calculated as below. This is deducted from the total time but
> probably
> - * could be ignored on PCs.
> - *
> - * The running time for each loop is set to a minimum of five
> seconds via
> - * repeating all loops until each has recorded at least 0.07
> seconds (see
> - * calibration below). The extra loops required are shown under E
> in the
> - * tables. The data used in the loops is re-initialised from the
> copy in
> - * function endloop for each of the extra loops. The worst case
> overhead
> - * of this has been measured as less than 1% and is ignored. Note,
> the
> - * alternative of summing the time for each set of count passes
> cannot
> - * be relied upon when the time for one set is of the same order of
> - * magnitude as the clock resolution (0.05 to 0.06 seconds).
> Calibration
> - * also gives an indication of the linearity of timing. In the
> example
> - * shown, the overhead of 24 occurrences of data generation, which
> is
> - * excluded from the main timing, is about 0.6 seconds.
> - *
> - * The total floating point operations for the first kernel 1
> results are
> - * 200 x 7 x 15 x 5 x 1001. For some other kernels, the total is not
> - * proportional to the span.
> - *
> - * The OK column in the tables indicates the number of correct
> significant
> - * digits out of 16 compared with the defined checksums.
> - *
> - *
> - * Example of Results
> - *
> - * L.L.N.L. 'C' KERNELS: MFLOPS P.C. VERSION
> - *
> - * Calculating outer loop overhead
> - * 1000 times 0.00 seconds
> - * 10000 times 0.00 seconds
> - * 100000 times 0.06 seconds
> - * 1000000 times 0.33 seconds
> - * 2000000 times 0.88 seconds
> - * 4000000 times 1.59 seconds
> - * 8000000 times 3.30 seconds
> - * 16000000 times 6.64 seconds
> - * Overhead for each loop 4.1500e-007 seconds
> - *
> - * Calibrating part 1 of 3
> - *
> - * Loop count 4 0.94 seconds
> - * Loop count 16 2.08 seconds
> - * Loop count 32 3.52 seconds
> - * Loop count 64 6.42 seconds
> - * Loop count 128 12.31 seconds
> - *
> - * Loops 200 x 1 x Passes
> - *
> - * Kernel Floating Pt ops
> - * No Passes E No Total Secs. MFLOPS Span
> Checksums OK
> - * ------------ -- ------------- ----- ------- ----
> ---------------------- --
> - * 1 7 x 15 5 1.051050e+008 5.10 20.60 1001
> 5.114652693224671e+004 16
> - * 2 67 x 21 4 1.091832e+008 5.20 20.98 101
> 1.539721811668384e+003 15
> - * 3 9 x 15 2 5.405400e+007 4.17 12.97 1001
> 1.000742883066364e+001 15
> - * 4 14 x 30 2 1.008000e+008 5.52 18.28 1001
> 5.999250595473891e-001 16
> - * 5 10 x 12 2 4.800000e+007 5.43 8.84 1001
> 4.548871642387267e+003 16
> - * 6 3 x 19 2 4.523520e+007 4.34 10.43 64
> 4.375116344729986e+003 16
> - * 7 4 x 10 16 1.273600e+008 4.45 28.64 995
> 6.104251075174761e+004 16
> - * 8 10 x 7 36 9.979200e+007 5.15 19.36 100
> 1.501268005625795e+005 15
> - * 9 36 x 6 17 7.417440e+007 5.20 14.26 101
> 1.189443609974981e+005 16
> - * 10 34 x 5 9 3.090600e+007 5.48 5.64 101
> 7.310369784325296e+004 16
> - * 11 11 x 15 1 3.300000e+007 5.65 5.84 1001
> 3.342910972650109e+007 16
> - * 12 12 x 30 1 7.200000e+007 6.50 11.08 1000
> 2.907141294167248e-005 16
> - * 13 36 x 4 7 1.290240e+007 6.41 2.01 64
> 1.202533961842804e+011 15
> - * 14 2 x 4 11 1.761760e+007 5.61 3.14 1001
> 3.165553044000335e+009 15
> - * 15 1 x 15 33 4.950000e+007 5.66 8.75 101
> 3.943816690352044e+004 15
> - * 16 25 x 30 10 7.950000e+007 6.14 12.95 75
> 5.650760000000000e+005 16
> - * 17 35 x 9 9 5.726700e+007 5.03 11.38 101
> 1.114641772902486e+003 16
> - * 18 2 x 11 44 9.583200e+007 5.76 16.64 100
> 1.015727037502299e+005 15
> - * 19 39 x 21 6 9.926280e+007 6.14 16.16 101
> 5.421816960147207e+002 16
> - * 20 1 x 15 26 7.800000e+007 5.93 13.16 1000
> 3.040644339351238e+007 16
> - * 21 1 x 1 2 2.525000e+007 6.37 3.96 101
> 1.597308280710200e+008 15
> - * 22 11 x 12 17 4.532880e+007 5.43 8.35 101
> 2.938604376566698e+002 15
> - * 23 8 x 12 11 1.045440e+008 5.10 20.49 100
> 3.549900501563624e+004 15
> - * 24 5 x 30 1 3.000000e+007 4.93 6.09 1001
> 5.000000000000000e+002 16
> - *
> - * Maximum Rate 28.64
> - * Average Rate 12.50
> - * Geometric Mean 10.50
> - * Harmonic Mean 8.25
> - * Minimum Rate 2.01
> - *
> - * Do Span 471
> - *
> - * Calibrating part 2 of 3
> - *
> - * Loop count 8 0.88 seconds
> - * Loop count 32 1.86 seconds
> - * Loop count 64 3.19 seconds
> - * Loop count 128 5.77 seconds
> - * Loop count 256 10.93 seconds
> - *
> - * Loops 200 x 2 x Passes
> - *
> - * Kernel Floating Pt ops
> - * No Passes E No Total Secs. MFLOPS Span
> Checksums OK
> - * ------------ -- ------------- ----- ------- ----
> ---------------------- --
> - * 1 40 x 15 5 1.212000e+008 4.84 25.04 101
> 5.253344778937972e+002 16
> - * 2 40 x 20 4 1.241600e+008 5.91 21.02 101
> 1.539721811668384e+003 15
> - * 3 53 x 20 2 8.564800e+007 5.10 16.78 101
> 1.009741436578952e+000 16
> - * 4 70 x 32 2 1.075200e+008 4.29 25.07 101
> 5.999250595473891e-001 16
> - * 5 55 x 13 2 5.720000e+007 4.99 11.46 101
> 4.589031939600982e+001 16
> - * 6 7 x 19 2 5.107200e+007 4.70 10.87 32
> 8.631675645333210e+001 16
> - * 7 22 x 12 16 1.706496e+008 5.56 30.71 101
> 6.345586315784055e+002 16
> - * 8 6 x 6 36 1.026432e+008 5.26 19.50 100
> 1.501268005625795e+005 15
> - * 9 21 x 5 17 7.211400e+007 5.03 14.33 101
> 1.189443609974981e+005 16
> - * 10 19 x 5 9 3.454200e+007 6.13 5.63 101
> 7.310369784325296e+004 16
> - * 11 64 x 20 1 5.120000e+007 4.95 10.35 101
> 3.433560407475758e+004 16
> - * 12 68 x 20 1 5.440000e+007 5.16 10.53 100
> 7.127569130821465e-006 16
> - * 13 41 x 3 7 1.102080e+007 5.47 2.01 32
> 9.816387810944356e+010 15
> - * 14 10 x 4 11 1.777600e+007 5.49 3.24 101
> 3.039983465145392e+007 15
> - * 15 1 x 7 33 4.620000e+007 5.32 8.69 101
> 3.943816690352044e+004 15
> - * 16 27 x 21 10 6.350400e+007 5.02 12.66 40
> 6.480410000000000e+005 16
> - * 17 20 x 9 9 6.544800e+007 5.74 11.40 101
> 1.114641772902486e+003 16
> - * 18 1 x 10 44 8.712000e+007 5.22 16.69 100
> 1.015727037502299e+005 15
> - * 19 23 x 15 6 8.362800e+007 5.11 16.36 101
> 5.421816960147207e+002 16
> - * 20 8 x 9 26 7.488000e+007 5.43 13.80 100
> 3.126205178815432e+004 16
> - * 21 1 x 2 2 5.000000e+007 5.55 9.01 50
> 7.824524877232093e+007 16
> - * 22 7 x 9 17 4.326840e+007 5.21 8.31 101
> 2.938604376566698e+002 15
> - * 23 5 x 9 11 9.801000e+007 4.77 20.54 100
> 3.549900501563624e+004 15
> - * 24 31 x 30 1 3.720000e+007 6.06 6.14 101
> 5.000000000000000e+001 16
> - *
> - * Maximum Rate 30.71
> - * Average Rate 13.76
> - * Geometric Mean 11.69
> - * Harmonic Mean 9.19
> - * Minimum Rate 2.01
> - *
> - * Do Span 90
> - *
> - * Calibrating part 3 of 3
> - *
> - * Loop count 32 0.77 seconds
> - * Loop count 128 1.54 seconds
> - * Loop count 256 2.47 seconds
> - * Loop count 512 4.34 seconds
> - * Loop count 1024 8.13 seconds
> - *
> - * Loops 200 x 8 x Passes
> - *
> - * Kernel Floating Pt ops
> - * No Passes E No Total Secs. MFLOPS Span
> Checksums OK
> - * ------------ -- ------------- ----- ------- ----
> ---------------------- --
> - * 1 28 x 22 5 1.330560e+008 5.31 25.05 27
> 3.855104502494961e+001 16
> - * 2 46 x 22 4 7.124480e+007 4.38 16.27 15
> 3.953296986903060e+001 16
> - * 3 37 x 23 2 7.352640e+007 4.26 17.24 27
> 2.699309089320672e-001 16
> - * 4 38 x 35 2 6.384000e+007 3.79 16.86 27
> 5.999250595473891e-001 16
> - * 5 40 x 23 2 7.654400e+007 4.45 17.20 27
> 3.182615248447483e+000 16
> - * 6 21 x 32 2 5.160960e+007 4.82 10.70 8
> 1.120309393467088e+000 15
> - * 7 20 x 12 16 1.290240e+008 4.24 30.43 21
> 2.845720217644024e+001 16
> - * 8 9 x 8 36 1.078272e+008 5.17 20.85 14
> 2.960543667875005e+003 15
> - * 9 26 x 16 17 1.697280e+008 5.33 31.82 15
> 2.623968460874250e+003 16
> - * 10 25 x 11 9 5.940000e+007 5.42 10.96 15
> 1.651291227698265e+003 16
> - * 11 46 x 22 1 4.209920e+007 3.67 11.48 27
> 6.551161335845770e+002 16
> - * 12 48 x 23 1 4.592640e+007 5.04 9.12 26
> 1.943435981130448e-006 16
> - * 13 31 x 4 7 1.111040e+007 5.57 2.00 8
> 3.847124199949431e+010 15
> - * 14 8 x 6 11 2.280960e+007 5.19 4.40 27
> 2.923540598672009e+006 15
> - * 15 1 x 15 33 5.544000e+007 6.14 9.03 15
> 1.108997288134785e+003 16
> - * 16 14 x 31 10 7.638400e+007 5.80 13.17 15
> 5.152160000000000e+005 16
> - * 17 26 x 11 9 6.177600e+007 5.14 12.02 15
> 2.947368618589360e+001 16
> - * 18 2 x 12 44 1.098240e+008 5.36 20.47 14
> 9.700646212337040e+002 16
> - * 19 28 x 21 6 8.467200e+007 5.38 15.74 15
> 1.268230698051004e+001 15
> - * 20 7 x 10 26 7.571200e+007 5.27 14.36 26
> 5.987713249475302e+002 16
> - * 21 1 x 2 2 8.000000e+007 5.50 14.55 20
> 5.009945671204667e+007 16
> - * 22 8 x 13 17 4.243200e+007 5.04 8.42 15
> 6.109968728263973e+000 16
> - * 23 7 x 15 11 1.201200e+008 4.38 27.42 14
> 4.850340602749970e+002 16
> - * 24 23 x 32 1 3.061760e+007 5.01 6.11 27
> 1.300000000000000e+001 16
> - *
> - * Maximum Rate 31.82
> - * Average Rate 15.24
> - * Geometric Mean 13.06
> - * Harmonic Mean 10.26
> - * Minimum Rate 2.00
> - *
> - * Do Span 19
> - *
> - * Overall
> - *
> - * Part 1 weight 1
> - * Part 2 weight 2
> - * Part 3 weight 1
> - *
> - * Maximum Rate 31.82
> - * Average Rate 13.81
> - * Geometric Mean 11.70
> - * Harmonic Mean 9.17
> - * Minimum Rate 2.00
> - *
> - * Do Span 167
> - *
> - * Enter the following data which will be filed with the results
> - *
> - * Month run 9/1996
> - * PC model Escom
> - * CPU Pentium
> - * Clock MHz 100
> - * Cache 256K
> - * Options Neptune chipset
> - * OS/DOS Windows 95
> - * Compiler Watcom C/C++ Version 10.5
> - * OptLevel Win386 -zp4 -otexan -om -fp5 -zc -5r
> - * Run by Roy Longbottom
> - * From UK
> - * Mail 101323.2241 at compuserve.com
> - *
> - * Note: the date, compiler and opt level are inserted by the
> program.
> - *
> - * The tables of results and running details are appended to file
> - * LLloops.txt.
> - *
> - * When a single MFLOPS rating is claimed for this benchmark it is
> - * usually the overall geometric mean result.
> - *
> -
> **********************************************************************
> - *
> - * Pre-compiled codes were produced via a Watcom C/C++ 10.5
> compiler.
> - * Versions are available for DOS, Windows 3/95 and NT/Win 95. Both
> - * non-optimised and optimised programs are available. The latter
> have
> - * options as in the above example.
> - *
> - * In this source code, function prototypes are declared and
> function
> - * headers have embedded parameter types to produce code for C and
> C++
> - * at least suitable for compiling as such with the Watcom compiler.
> - *
> -
> ***********************************************************************
> - */
> -
> -#include <stdio.h>
> -#include <math.h>
> -#include <stdlib.h>
> -
> -
> - struct Arrays
> - {
> - double U[1001];
> - double V[1001];
> - double W[1001];
> - double X[1001];
> - double Y[1001];
> - double Z[1001];
> - double G[1001];
> - double Du1[101];
> - double Du2[101];
> - double Du3[101];
> - double Grd[1001];
> - double Dex[1001];
> - double Xi[1001];
> - double Ex[1001];
> - double Ex1[1001];
> - double Dex1[1001];
> - double Vx[1001];
> - double Xx[1001];
> - double Rx[1001];
> - double Rh[2048];
> - double Vsp[101];
> - double Vstp[101];
> - double Vxne[101];
> - double Vxnd[101];
> - double Ve3[101];
> - double Vlr[101];
> - double Vlin[101];
> - double B5[101];
> - double Plan[300];
> - double D[300];
> - double Sa[101];
> - double Sb[101];
> - double P[512][4];
> - double Px[101][25];
> - double Cx[101][25];
> - double Vy[25][101];
> - double Vh[7][101];
> - double Vf[7][101];
> - double Vg[7][101];
> - double Vs[7][101];
> - double Za[7][101];
> - double Zp[7][101];
> - double Zq[7][101];
> - double Zr[7][101];
> - double Zm[7][101];
> - double Zb[7][101];
> - double Zu[7][101];
> - double Zv[7][101];
> - double Zz[7][101];
> - double B[64][64];
> - double C[64][64];
> - double H[64][64];
> - double U1[2][101][5];
> - double U2[2][101][5];
> - double U3[2][101][5];
> - double Xtra[40];
> - long E[96];
> - long F[96];
> - long Ix[1001];
> - long Ir[1001];
> - long Zone[301];
> - double X0[1001];
> - double W0[1001];
> - double Px0[101][25];
> - double P0[512][4];
> - double H0[64][64];
> - double Rh0[2048];
> - double Vxne0[101];
> - double Zr0[7][101];
> - double Zu0[7][101];
> - double Zv0[7][101];
> - double Zz0[7][101];
> - double Za0[101][25];
> - double Stb50;
> - double Xx0;
> -
> -
> - }as1;
> - #define u as1.U
> - #define v as1.V
> - #define w as1.W
> - #define x as1.X
> - #define y as1.Y
> - #define z as1.Z
> - #define g as1.G
> - #define du1 as1.Du1
> - #define du2 as1.Du2
> - #define du3 as1.Du3
> - #define grd as1.Grd
> - #define dex as1.Dex
> - #define xi as1.Xi
> - #define ex as1.Ex
> - #define ex1 as1.Ex1
> - #define dex1 as1.Dex1
> - #define vx as1.Vx
> - #define xx as1.Xx
> - #define rx as1.Rx
> - #define rh as1.Rh
> - #define vsp as1.Vsp
> - #define vstp as1.Vstp
> - #define vxne as1.Vxne
> - #define vxnd as1.Vxnd
> - #define ve3 as1.Ve3
> - #define vlr as1.Vlr
> - #define vlin as1.Vlin
> - #define b5 as1.B5
> - #define plan as1.Plan
> - #define d as1.D
> - #define sa as1.Sa
> - #define sb as1.Sb
> - #define p as1.P
> - #define px as1.Px
> - #define cx as1.Cx
> - #define vy as1.Vy
> - #define vh as1.Vh
> - #define vf as1.Vf
> - #define vg as1.Vg
> - #define vs as1.Vs
> - #define za as1.Za
> - #define zb as1.Zb
> - #define zp as1.Zp
> - #define zq as1.Zq
> - #define zr as1.Zr
> - #define zm as1.Zm
> - #define zz as1.Zz
> - #define zu as1.Zu
> - #define zv as1.Zv
> - #define b as1.B
> - #define c as1.C
> - #define h as1.H
> - #define u1 as1.U1
> - #define u2 as1.U2
> - #define u3 as1.U3
> - #define xtra as1.Xtra
> - #define a11 as1.Xtra[1]
> - #define a12 as1.Xtra[2]
> - #define a13 as1.Xtra[3]
> - #define a21 as1.Xtra[4]
> - #define a22 as1.Xtra[5]
> - #define a23 as1.Xtra[6]
> - #define a31 as1.Xtra[7]
> - #define a32 as1.Xtra[8]
> - #define a33 as1.Xtra[9]
> - #define c0 as1.Xtra[12]
> - #define dk as1.Xtra[15]
> - #define dm22 as1.Xtra[16]
> - #define dm23 as1.Xtra[17]
> - #define dm24 as1.Xtra[18]
> - #define dm25 as1.Xtra[19]
> - #define dm26 as1.Xtra[20]
> - #define dm27 as1.Xtra[21]
> - #define dm28 as1.Xtra[22]
> - #define expmax as1.Xtra[26]
> - #define flx as1.Xtra[27]
> - #define q as1.Xtra[28]
> - #define r as1.Xtra[30]
> - #define s as1.Xtra[32]
> - #define sig as1.Xtra[34]
> - #define stb5 as1.Xtra[35]
> - #define t as1.Xtra[36]
> - #define xnm as1.Xtra[39]
> - #define e as1.E
> - #define f as1.F
> - #define ix as1.Ix
> - #define ir as1.Ir
> - #define zone as1.Zone
> - #define x0 as1.X0
> - #define w0 as1.W0
> - #define px0 as1.Px0
> - #define p0 as1.P0
> - #define h0 as1.H0
> - #define rh0 as1.Rh0
> - #define vxne0 as1.Vxne0
> - #define zr0 as1.Zr0
> - #define zu0 as1.Zu0
> - #define zv0 as1.Zv0
> - #define zz0 as1.Zz0
> - #define za0 as1.Za0
> - #define stb50 as1.Stb50
> - #define xx0 as1.Xx0
> -
> -
> - struct Parameters
> - {
> - long Inner_loops;
> - long Outer_loops;
> - long Loop_mult;
> - double Flops_per_loop;
> - double Sumcheck[3][25];
> - long Accuracy[3][25];
> - double LoopTime[3][25];
> - double LoopSpeed[3][25];
> - double LoopFlos[3][25];
> - long Xflops[25];
> - long Xloops[3][25];
> - long Nspan[3][25];
> - double TimeStart;
> - double TimeEnd;
> - double Loopohead;
> - long Count;
> - long Count2;
> - long Pass;
> - long Extra_loops[3][25];
> - long K2;
> - long K3;
> - long M16;
> - long J5;
> - long Section;
> - long N16;
> - double Mastersum;
> - long M24;
> -
> -
> - }as2;
> -
> - #define n as2.Inner_loops
> - #define loop as2.Outer_loops
> - #define mult as2.Loop_mult
> - #define nflops as2.Flops_per_loop
> - #define Checksum as2.Sumcheck
> - #define accuracy as2.Accuracy
> - #define RunTime as2.LoopTime
> - #define Mflops as2.LoopSpeed
> - #define FPops as2.LoopFlos
> - #define nspan as2.Nspan
> - #define xflops as2.Xflops
> - #define xloops as2.Xloops
> - #define StartTime as2.TimeStart
> - #define EndTime as2.TimeEnd
> - #define overhead_l as2.Loopohead
> - #define count as2.Count
> - #define count2 as2.Count2
> - #define pass as2.Pass
> - #define extra_loops as2.Extra_loops
> - #define k2 as2.K2
> - #define k3 as2.K3
> - #define m16 as2.M16
> - #define j5 as2.J5
> - #define section as2.Section
> - #define n16 as2.N16
> - #define MasterSum as2.Mastersum
> - #define m24 as2.M24
> -
> -
> - void init(long which);
> -
> - /* Initialises arrays and variables */
> -
> - long endloop(long which);
> -
> - /* Controls outer loops and stores results */
> -
> - long parameters(long which);
> -
> - /* Gets loop parameters and variables, starts timer */
> -
> - void kernels();
> -
> - /* The 24 kernels */
> -
> - void check(long which);
> -
> - /* Calculates checksum accuracy */
> -
> - void iqranf();
> -
> - /* Random number generator for Kernel 14 */
> -
> -main(int argc, char *argv[])
> -{
> - double pass_time, least, lmult, now = 1.0, wt;
> - double time1, time2;
> - long i, k, loop_passes;
> - long mul[3] = {1, 2, 8};
> - double weight[3] = {1.0, 2.0, 1.0};
> - long Endit, which;
> - double maximum[4];
> - double minimum[4];
> - double average[4];
> - double harmonic[4];
> - double geometric[4];
> - long xspan[4];
> - char general[9][80] = {" "};
> - FILE *outfile;
> - int getinput = 1;
> -
> - if (argc > 1)
> - {
> - switch (argv[1][0])
> - {
> - case 'N':
> - getinput = 0;
> - break;
> - case 'n':
> - getinput = 0;
> - break;
> - }
> - }
> -
> -
> - printf ("L.L.N.L. 'C' KERNELS: MFLOPS P.C. VERSION 4.0\n\n");
> -
> - if (getinput == 0)
> - {
> - printf ("***** No run time input data *****\n\n");
> - }
> - else
> - {
> - printf ("*** With run time input data ***\n\n");
> - }
> -
> -/
> ************************************************************************
> - * Execute the kernels three times at different Do
> Spans *
> -
> ************************************************************************/
> -
> - for ( section=0 ; section<3 ; section++ )
> - {
> - loop_passes = 200 * mul[section];
> - pass = -20;
> - mult = 2 * mul[section];
> -
> - for ( i=1; i<25; i++)
> - {
> - extra_loops[section][i] = 500;
> - }
> -
> -/
> ************************************************************************
> - * Execute the
> kernels *
> -
> ************************************************************************/
> -
> - kernels();
> -
> - maximum[section] = 0.0;
> - minimum[section] = Mflops[section][1];
> - average[section] = 0.0;
> - harmonic[section] = 0.0;
> - geometric[section] = 0.0;
> - xspan[section] = 0.0;
> - }
> -
> -/
> ************************************************************************
> - * End of executing the kernels three times at different Do
> Spans *
> -
> ************************************************************************/
> -}
> -
> -/
> ************************************************************************
> - * The
> Kernels *
> -
> ************************************************************************/
> -
> -void kernels()
> - {
> -
> - long lw;
> - long ipnt, ipntp, ii;
> - double temp;
> - long nl1, nl2;
> - long kx, ky;
> - double ar, br, cr;
> - long i, j, k, m;
> - long ip, i1, i2, j1, j2, j4, lb;
> - long ng, nz;
> - double tmp;
> - double scale, xnei, xnc, e3,e6;
> - long ink, jn, kn, kb5i;
> - double di, dn;
> - double qa;
> -
> - for ( k=0 ; k<25; k++)
> - {
> - Checksum[section][k] = 0.0;
> - }
> -
> -
> -
> - /*
> -
> *******************************************************************
> - * Kernel 1 -- hydro fragment
> -
> *******************************************************************
> - */
> -
> - parameters (1);
> -
> - do
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - x[k] = q + y[k]*( r*z[k+10] + t*z[k+11] );
> - }
> -
> - endloop (1);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 2 -- ICCG excerpt (Incomplete Cholesky Conjugate
> Gradient)
> -
> *******************************************************************
> - */
> -
> - parameters (2);
> -
> - do
> - {
> - ii = n;
> - ipntp = 0;
> - do
> - {
> - ipnt = ipntp;
> - ipntp += ii;
> - ii /= 2;
> - i = ipntp;
> - for ( k=ipnt+1 ; k<ipntp ; k=k+2 )
> - {
> - i++;
> - x[i] = x[k] - v[k]*x[k-1] - v[k+1]*x[k+1];
> - }
> - } while ( ii>0 );
> -
> - endloop (2);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 3 -- inner product
> -
> *******************************************************************
> - */
> -
> - parameters (3);
> -
> - do
> - {
> - q = 0.0;
> - for ( k=0 ; k<n ; k++ )
> - {
> - q += z[k]*x[k];
> - }
> -
> - endloop (3);
> - }
> - while (count < loop);
> -
> -
> - /*
> -
> *******************************************************************
> - * Kernel 4 -- banded linear equations
> -
> *******************************************************************
> - */
> -
> - parameters (4);
> -
> - m = ( 1001-7 )/2;
> - do
> - {
> - for ( k=6 ; k<1001 ; k=k+m )
> - {
> - lw = k - 6;
> - temp = x[k-1];
> -
> - for ( j=4 ; j<n ; j=j+5 )
> - {
> - temp -= x[lw]*y[j];
> - lw++;
> - }
> - x[k-1] = y[4]*temp;
> - }
> -
> - endloop (4);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 5 -- tri-diagonal elimination, below diagonal
> -
> *******************************************************************
> - */
> -
> - parameters (5);
> -
> - do
> - {
> - for ( i=1 ; i<n ; i++ )
> - {
> - x[i] = z[i]*( y[i] - x[i-1] );
> - }
> -
> - endloop (5);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 6 -- general linear recurrence equations
> -
> *******************************************************************
> - */
> -
> - parameters (6);
> -
> -
> - do
> - {
> - for ( i=1 ; i<n ; i++ )
> - {
> - w[i] = 0.01;
> - for ( k=0 ; k<i ; k++ )
> - {
> - w[i] += b[k][i] * w[(i-k)-1];
> - }
> - }
> -
> - endloop (6);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 7 -- equation of state fragment
> -
> *******************************************************************
> - */
> -
> - parameters (7);
> -
> - do
> - {
> -
> - for ( k=0 ; k<n ; k++ )
> - {
> - x[k] = u[k] + r*( z[k] + r*y[k] ) +
> - t*( u[k+3] + r*( u[k+2] + r*u[k+1] ) +
> - t*( u[k+6] + q*( u[k+5] + q*u[k+4] ) ) );
> - }
> -
> - endloop (7);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 8 -- ADI integration
> -
> *******************************************************************
> - */
> -
> - nl1 = 0;
> - nl2 = 1;
> -
> - parameters (8);
> -
> - do
> - {
> - for ( kx=1 ; kx<3 ; kx++ )
> - {
> -
> - for ( ky=1 ; ky<n ; ky++ )
> - {
> - du1[ky] = u1[nl1][ky+1][kx] - u1[nl1][ky-1][kx];
> - du2[ky] = u2[nl1][ky+1][kx] - u2[nl1][ky-1][kx];
> - du3[ky] = u3[nl1][ky+1][kx] - u3[nl1][ky-1][kx];
> - u1[nl2][ky][kx]=
> - u1[nl1][ky][kx]+a11*du1[ky]+a12*du2[ky]
> +a13*du3[ky] + sig*
> - (u1[nl1][ky][kx+1]-2.0*u1[nl1][ky][kx]+u1[nl1][ky]
> [kx-1]);
> - u2[nl2][ky][kx]=
> - u2[nl1][ky][kx]+a21*du1[ky]+a22*du2[ky]
> +a23*du3[ky] + sig*
> - (u2[nl1][ky][kx+1]-2.0*u2[nl1][ky][kx]+u2[nl1][ky]
> [kx-1]);
> - u3[nl2][ky][kx]=
> - u3[nl1][ky][kx]+a31*du1[ky]+a32*du2[ky]
> +a33*du3[ky] + sig*
> - (u3[nl1][ky][kx+1]-2.0*u3[nl1][ky][kx]+u3[nl1][ky]
> [kx-1]);
> - }
> - }
> -
> - endloop (8);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 9 -- integrate predictors
> -
> *******************************************************************
> - */
> -
> - parameters (9);
> -
> - do
> - {
> - for ( i=0 ; i<n ; i++ )
> - {
> - px[i][0] = dm28*px[i][12] + dm27*px[i][11] + dm26*px[i]
> [10] +
> - dm25*px[i][ 9] + dm24*px[i][ 8] + dm23*px[i]
> [ 7] +
> - dm22*px[i][ 6] + c0*( px[i][ 4] + px[i][ 5])
> - + px[i][ 2];
> - }
> -
> - endloop (9);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 10 -- difference predictors
> -
> *******************************************************************
> - */
> -
> - parameters (10);
> -
> - do
> - {
> - for ( i=0 ; i<n ; i++ )
> - {
> - ar = cx[i][ 4];
> - br = ar - px[i][ 4];
> - px[i][ 4] = ar;
> - cr = br - px[i][ 5];
> - px[i][ 5] = br;
> - ar = cr - px[i][ 6];
> - px[i][ 6] = cr;
> - br = ar - px[i][ 7];
> - px[i][ 7] = ar;
> - cr = br - px[i][ 8];
> - px[i][ 8] = br;
> - ar = cr - px[i][ 9];
> - px[i][ 9] = cr;
> - br = ar - px[i][10];
> - px[i][10] = ar;
> - cr = br - px[i][11];
> - px[i][11] = br;
> - px[i][13] = cr - px[i][12];
> - px[i][12] = cr;
> - }
> -
> - endloop (10);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 11 -- first sum
> -
> *******************************************************************
> - */
> -
> - parameters (11);
> -
> - do
> - {
> - x[0] = y[0];
> - for ( k=1 ; k<n ; k++ )
> - {
> - x[k] = x[k-1] + y[k];
> - }
> -
> - endloop (11);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 12 -- first difference
> -
> *******************************************************************
> - */
> -
> - parameters (12);
> -
> - do
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - x[k] = y[k+1] - y[k];
> - }
> -
> - endloop (12);
> - }
> - while (count < loop);
> -
> -
> - /*
> -
> *******************************************************************
> - * Kernel 13 -- 2-D PIC (Particle In Cell)
> -
> *******************************************************************
> - */
> -
> - parameters (13);
> -
> - do
> - {
> - for ( ip=0; ip<n; ip++)
> - {
> - i1 = p[ip][0];
> - j1 = p[ip][1];
> - i1 &= 64-1;
> - j1 &= 64-1;
> - p[ip][2] += b[j1][i1];
> - p[ip][3] += c[j1][i1];
> - p[ip][0] += p[ip][2];
> - p[ip][1] += p[ip][3];
> - i2 = p[ip][0];
> - j2 = p[ip][1];
> - i2 = ( i2 & 64-1 ) - 1 ;
> - j2 = ( j2 & 64-1 ) - 1 ;
> - p[ip][0] += y[i2+32];
> - p[ip][1] += z[j2+32];
> - i2 += e[i2+32];
> - j2 += f[j2+32];
> - h[j2][i2] += 1.0;
> - }
> - endloop (13);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 14 -- 1-D PIC (Particle In Cell)
> -
> *******************************************************************
> - */
> -
> - parameters (14);
> -
> - do
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - vx[k] = 0.0;
> - xx[k] = 0.0;
> - ix[k] = (long) grd[k];
> - xi[k] = (double) ix[k];
> - ex1[k] = ex[ ix[k] - 1 ];
> - dex1[k] = dex[ ix[k] - 1 ];
> - }
> - for ( k=0 ; k<n ; k++ )
> - {
> - vx[k] = vx[k] + ex1[k] + ( xx[k] - xi[k] )*dex1[k];
> - xx[k] = xx[k] + vx[k] + flx;
> - ir[k] = xx[k];
> - rx[k] = xx[k] - ir[k];
> - ir[k] = ( ir[k] & 2048-1 ) + 1;
> - xx[k] = rx[k] + ir[k];
> - }
> - for ( k=0 ; k<n ; k++ )
> - {
> - rh[ ir[k]-1 ] += 1.0 - rx[k];
> - rh[ ir[k] ] += rx[k];
> - }
> - endloop (14);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 15 -- Casual Fortran. Development version
> -
> *******************************************************************
> - */
> -
> - parameters (15);
> -
> - do
> - {
> - ng = 7;
> - nz = n;
> - ar = 0.053;
> - br = 0.073;
> - for ( j=1 ; j<ng ; j++ )
> - {
> - for ( k=1 ; k<nz ; k++ )
> - {
> - if ( (j+1) >= ng )
> - {
> - vy[j][k] = 0.0;
> - continue;
> - }
> - if ( vh[j+1][k] > vh[j][k] )
> - {
> - t = ar;
> - }
> - else
> - {
> - t = br;
> - }
> - if ( vf[j][k] < vf[j][k-1] )
> - {
> - if ( vh[j][k-1] > vh[j+1][k-1] )
> - r = vh[j][k-1];
> - else
> - r = vh[j+1][k-1];
> - s = vf[j][k-1];
> - }
> - else
> - {
> - if ( vh[j][k] > vh[j+1][k] )
> - r = vh[j][k];
> - else
> - r = vh[j+1][k];
> - s = vf[j][k];
> - }
> - vy[j][k] = sqrt( vg[j][k]*vg[j][k] + r*r )* t/s;
> - if ( (k+1) >= nz )
> - {
> - vs[j][k] = 0.0;
> - continue;
> - }
> - if ( vf[j][k] < vf[j-1][k] )
> - {
> - if ( vg[j-1][k] > vg[j-1][k+1] )
> - r = vg[j-1][k];
> - else
> - r = vg[j-1][k+1];
> - s = vf[j-1][k];
> - t = br;
> - }
> - else
> - {
> - if ( vg[j][k] > vg[j][k+1] )
> - r = vg[j][k];
> - else
> - r = vg[j][k+1];
> - s = vf[j][k];
> - t = ar;
> - }
> - vs[j][k] = sqrt( vh[j][k]*vh[j][k] + r*r )* t / s;
> - }
> - }
> - endloop (15);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 16 -- Monte Carlo search loop
> -
> *******************************************************************
> - */
> -
> - parameters (16);
> -
> -
> - ii = n / 3;
> - lb = ii + ii;
> - k3 = k2 = 0;
> - do
> - {
> - i1 = m16 = 1;
> - label410:
> - j2 = ( n + n )*( m16 - 1 ) + 1;
> - for ( k=1 ; k<=n ; k++ )
> - {
> - k2++;
> - j4 = j2 + k + k;
> - j5 = zone[j4-1];
> - if ( j5 < n )
> - {
> - if ( j5+lb < n )
> - { /* 420 */
> - tmp = plan[j5-1] - t; /* 435 */
> - }
> - else
> - {
> - if ( j5+ii < n )
> - { /* 415 */
> - tmp = plan[j5-1] - s; /* 430 */
> - }
> - else
> - {
> - tmp = plan[j5-1] - r; /* 425 */
> - }
> - }
> - }
> - else if( j5 == n )
> - {
> - break; /* 475 */
> - }
> - else
> - {
> - k3++; /* 450 */
> - tmp=(d[j5-1]-(d[j5-2]*(t-d[j5-3])*(t-d[j5-3])+(s-
> d[j5-4])*
> - (s-d[j5-4])+(r-d[j5-5])*(r-d[j5-5])));
> - }
> - if ( tmp < 0.0 )
> - {
> - if ( zone[j4-2] < 0 ) /* 445 */
> - continue; /* 470 */
> - else if ( !zone[j4-2] )
> - break; /* 480 */
> - }
> - else if ( tmp )
> - {
> - if ( zone[j4-2] > 0 ) /* 440 */
> - continue; /* 470 */
> - else if ( !zone[j4-2] )
> - break; /* 480 */
> - }
> - else break; /* 485 */
> - m16++; /* 455 */
> - if ( m16 > zone[0] )
> - m16 = 1; /* 460 */
> - if ( i1-m16 ) /* 465 */
> - goto label410;
> - else
> - break;
> - }
> - endloop (16);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 17 -- implicit, conditional computation
> -
> *******************************************************************
> - */
> -
> - parameters (17);
> -
> - do
> - {
> - i = n-1;
> - j = 0;
> - ink = -1;
> - scale = 5.0 / 3.0;
> - xnm = 1.0 / 3.0;
> - e6 = 1.03 / 3.07;
> - goto l61;
> -l60: e6 = xnm*vsp[i] + vstp[i];
> - vxne[i] = e6;
> - xnm = e6;
> - ve3[i] = e6;
> - i += ink;
> - if ( i==j ) goto l62;
> -l61: e3 = xnm*vlr[i] + vlin[i];
> - xnei = vxne[i];
> - vxnd[i] = e6;
> - xnc = scale*e3;
> - if ( xnm > xnc ) goto l60;
> - if ( xnei > xnc ) goto l60;
> - ve3[i] = e3;
> - e6 = e3 + e3 - xnm;
> - vxne[i] = e3 + e3 - xnei;
> - xnm = e6;
> - i += ink;
> - if ( i != j ) goto l61;
> -l62:;
> - endloop (17);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 18 - 2-D explicit hydrodynamics fragment
> -
> *******************************************************************
> - */
> -
> - parameters (18);
> -
> - do
> - {
> - t = 0.0037;
> - s = 0.0041;
> - kn = 6;
> - jn = n;
> - for ( k=1 ; k<kn ; k++ )
> - {
> -
> - for ( j=1 ; j<jn ; j++ )
> - {
> - za[k][j] = ( zp[k+1][j-1] +zq[k+1][j-1] -zp[k][j-1] -
> zq[k][j-1] )*
> - ( zr[k][j] +zr[k][j-1] ) / ( zm[k][j-1] +zm[k
> +1][j-1]);
> - zb[k][j] = ( zp[k][j-1] +zq[k][j-1] -zp[k][j] -zq[k]
> [j] ) *
> - ( zr[k][j] +zr[k-1][j] ) / ( zm[k][j] +zm[k]
> [j-1]);
> - }
> - }
> - for ( k=1 ; k<kn ; k++ )
> - {
> -
> - for ( j=1 ; j<jn ; j++ )
> - {
> - zu[k][j] += s*( za[k][j] *( zz[k][j] - zz[k][j
> +1] ) -
> - za[k][j-1] *( zz[k][j] - zz[k]
> [j-1] ) -
> - zb[k][j] *( zz[k][j] - zz[k-1]
> [j] ) +
> - zb[k+1][j] *( zz[k][j] - zz[k+1]
> [j] ) );
> - zv[k][j] += s*( za[k][j] *( zr[k][j] - zr[k][j
> +1] ) -
> - za[k][j-1] *( zr[k][j] - zr[k]
> [j-1] ) -
> - zb[k][j] *( zr[k][j] - zr[k-1]
> [j] ) +
> - zb[k+1][j] *( zr[k][j] - zr[k+1]
> [j] ) );
> - }
> - }
> - for ( k=1 ; k<kn ; k++ )
> - {
> -
> - for ( j=1 ; j<jn ; j++ )
> - {
> - zr[k][j] = zr[k][j] + t*zu[k][j];
> - zz[k][j] = zz[k][j] + t*zv[k][j];
> - }
> - }
> - endloop (18);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 19 -- general linear recurrence equations
> -
> *******************************************************************
> - */
> -
> - parameters (19);
> -
> - kb5i = 0;
> -
> - do
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - b5[k+kb5i] = sa[k] + stb5*sb[k];
> - stb5 = b5[k+kb5i] - stb5;
> - }
> - for ( i=1 ; i<=n ; i++ )
> - {
> - k = n - i;
> - b5[k+kb5i] = sa[k] + stb5*sb[k];
> - stb5 = b5[k+kb5i] - stb5;
> - }
> - endloop (19);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 20 - Discrete ordinates transport, conditional
> recurrence on xx
> -
> *******************************************************************
> - */
> -
> - parameters (20);
> -
> - do
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - di = y[k] - g[k] / ( xx[k] + dk );
> - dn = 0.2;
> - if ( di )
> - {
> - dn = z[k]/di ;
> - if ( t < dn ) dn = t;
> - if ( s > dn ) dn = s;
> - }
> - x[k] = ( ( w[k] + v[k]*dn )* xx[k] + u[k] ) / ( vx[k] +
> v[k]*dn );
> - xx[k+1] = ( x[k] - xx[k] )* dn + xx[k];
> - }
> - endloop (20);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 21 -- matrix*matrix product
> -
> *******************************************************************
> - */
> -
> - parameters (21);
> -
> - do
> - {
> - for ( k=0 ; k<25 ; k++ )
> - {
> - for ( i=0 ; i<25 ; i++ )
> - {
> - for ( j=0 ; j<n ; j++ )
> - {
> - px[j][i] += vy[k][i] * cx[j][k];
> - }
> - }
> - }
> - endloop (21);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 22 -- Planckian distribution
> -
> *******************************************************************
> - */
> -
> - parameters (22);
> -
> - expmax = 20.0;
> - u[n-1] = 0.99*expmax*v[n-1];
> - do
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - y[k] = u[k] / v[k];
> - w[k] = x[k] / ( exp( y[k] ) -1.0 );
> - }
> - endloop (22);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 23 -- 2-D implicit hydrodynamics fragment
> -
> *******************************************************************
> - */
> -
> - parameters (23);
> -
> - do
> - {
> - for ( j=1 ; j<6 ; j++ )
> - {
> - for ( k=1 ; k<n ; k++ )
> - {
> - qa = za[j+1][k]*zr[j][k] + za[j-1][k]*zb[j][k] +
> - za[j][k+1]*zu[j][k] + za[j][k-1]*zv[j][k] +
> zz[j][k];
> - za[j][k] += 0.175*( qa - za[j][k] );
> - }
> - }
> - endloop (23);
> - }
> - while (count < loop);
> -
> - /*
> -
> *******************************************************************
> - * Kernel 24 -- find location of first minimum in array
> -
> *******************************************************************
> - */
> -
> - parameters (24);
> -
> - x[n/2] = -1.0e+10;
> - do
> - {
> - m24 = 0;
> - for ( k=1 ; k<n ; k++ )
> - {
> - if ( x[k] < x[m24] ) m24 = k;
> - }
> - endloop (24);
> - }
> - while (count < loop);
> -
> - return;
> - }
> -
> -/
> ************************************************************************
> - * endloop procedure - calculate checksums and
> MFLOPS *
> -
> ************************************************************************/
> -
> -long endloop(long which)
> -{
> - double now = 1.0, useflops;
> - long i, j, k, m;
> - double Scale = 1000000.0;
> -
> - count = count + 1;
> - if (count >= loop) /* else return */
> - {
> -
> -/
> ************************************************************************
> - * End of standard set of loops for one
> kernel *
> -
> ************************************************************************/
> -
> - count2 = count2 + 1;
> - if (count2 == extra_loops[section][which])
> - /* else re-initialise parameters if
> required */
> - {
> -
> -/
> ************************************************************************
> - * End of extra loops for 5 seconds execution
> time *
> -
> ************************************************************************/
> -
> - count2 = 0;
> - if (which == 1)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - Checksum[section][1] = Checksum[section][1] + x[k]
> - * (double)(k+1);
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 2)
> - {
> - for ( k=0 ; k<n*2 ; k++ )
> - {
> - Checksum[section][2] = Checksum[section][2] + x[k]
> - * (double)(k+1);
> - }
> - useflops = nflops * (double)((n-4) * loop);
> - }
> - if (which == 3)
> - {
> - Checksum[section][3] = q;
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 4)
> - {
> - for ( k=0 ; k<3 ; k++ )
> - {
> - Checksum[section][4] = Checksum[section][4] + v[k]
> - * (double)(k+1);
> - }
> - useflops = nflops * (double) ((((n-5)/5)+1) * 3 * loop);
> - }
> - if (which == 5)
> - {
> - for ( k=1 ; k<n ; k++ )
> - {
> - Checksum[section][5] = Checksum[section][5] + x[k]
> - * (double)(k);
> - }
> - useflops = nflops * (double)((n-1) * loop);
> - }
> - if (which == 6)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> -
> - Checksum[section][6] = Checksum[section][6] + w[k]
> - * (double)(k+1);
> -
> - }
> - useflops = nflops * (double)(n * ((n - 1) / 2) * loop);
> - }
> - if (which == 7)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - Checksum[section][7] = Checksum[section][7] + x[k]
> - * (double)(k+1);
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 8)
> - {
> - for ( i=0 ; i<2 ; i++ )
> - {
> - for ( j=0 ; j<101 ; j++ )
> - {
> - for ( k=0 ; k<5 ; k++ )
> - {
> - m = 101 * 5 * i + 5 * j + k + 1;
> - if (m < 10 * n + 1)
> - {
> - Checksum[section][8] = Checksum[section][8]
> - + u1[i][j][k] * m
> - + u2[i][j][k] * m + u3[i][j][k] *
> m;
> - }
> - }
> - }
> - }
> - useflops = nflops * (double)(2 * (n - 1) * loop);
> - }
> - if (which == 9)
> - {
> - for ( i=0 ; i<n ; i++ )
> - {
> - for ( j=0 ; j<25 ; j++ )
> - {
> - m = 25 * i + j + 1;
> - if (m < 15 * n + 1)
> - {
> - Checksum[section][9] = Checksum[section][9]
> - + px[i][j] * (double)
> (m);
> - }
> - }
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 10)
> - {
> - for ( i=0 ; i<n ; i++ )
> - {
> - for (j=0 ; j<25 ; j++ )
> - {
> - m = 25 * i + j + 1;
> - if (m < 15 * n + 1)
> - {
> - Checksum[section][10] = Checksum[section][10]
> - + px[i][j] * (double)
> (m);
> - }
> - }
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 11)
> - {
> - for ( k=1 ; k<n ; k++ )
> - {
> - Checksum[section][11] = Checksum[section][11]
> - + x[k] * (double)(k);
> - }
> - useflops = nflops * (double)((n - 1) * loop);
> - }
> - if (which == 12)
> - {
> - for ( k=0 ; k<n-1 ; k++ )
> - {
> - Checksum[section][12] = Checksum[section][12] + x[k]
> - * (double)(k+1);
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 13)
> - {
> - for ( k=0 ; k<2*n ; k++ )
> - {
> - for ( j=0 ; j<4 ; j++ )
> - {
> - m = 4 * k + j + 1;
> - Checksum[section][13] = Checksum[section][13]
> - + p[k][j]* (double)(m);
> - }
> - }
> - for ( i=0 ; i<8*n/64 ; i++ )
> - {
> - for ( j=0 ; j<64 ; j++ )
> - {
> - m = 64 * i + j + 1;
> - if (m < 8 * n + 1)
> - {
> - Checksum[section][13] = Checksum[section][13]
> - + h[i][j] *
> (double)(m);
> - }
> - }
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 14)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - Checksum[section][14] = Checksum[section][14]
> - + (xx[k] + vx[k]) *
> (double)(k+1);
> - }
> - for ( k=0 ; k<67 ; k++ )
> - {
> - Checksum[section][14] = Checksum[section][14] + rh[k]
> - * (double)(k+1);
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 15)
> - {
> - for ( j=0 ; j<7 ; j++ )
> - {
> - for ( k=0 ; k<101 ; k++ )
> - {
> - m = 101 * j + k + 1;
> - if (m < n * 7 + 1)
> - {
> - Checksum[section][15] = Checksum[section][15]
> - + (vs[j][k] + vy[j][k]) *
> (double)(m);
> - }
> - }
> - }
> - useflops = nflops * (double)((n - 1) * 5 * loop);
> - }
> - if (which == 16)
> - {
> - Checksum[section][16] = (double)(k3 + k2 + j5 + m16);
> - useflops = (k2 + k2 + 10 * k3);
> - }
> - if (which == 17)
> - {
> - Checksum[section][17] = xnm;
> - for ( k=0 ; k<n ; k++ )
> - {
> - Checksum[section][17] = Checksum[section][17]
> - + (vxne[k] + vxnd[k]) *
> (double)(k+1);
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 18)
> - {
> - for ( k=0 ; k<7 ; k++ )
> - {
> - for ( j=0 ; j<101 ; j++ )
> - {
> - m = 101 * k + j + 1;
> - if (m < 7 * n + 1)
> - {
> - Checksum[section][18] = Checksum[section][18]
> - + (zz[k][j] + zr[k][j]) *
> (double)(m);
> - }
> - }
> - }
> - useflops = nflops * (double)((n - 1) * 5 * loop);
> - }
> - if (which == 19)
> - {
> - Checksum[section][19] = stb5;
> - for ( k=0 ; k<n ; k++ )
> - {
> - Checksum[section][19] = Checksum[section][19] + b5[k]
> - * (double)(k+1);
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 20)
> - {
> - for ( k=1 ; k<n+1 ; k++ )
> - {
> - Checksum[section][20] = Checksum[section][20] + xx[k]
> - * (double)(k);
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 21)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - for ( i=0 ; i<25 ; i++ )
> - {
> - m = 25 * k + i + 1;
> - Checksum[section][21] = Checksum[section][21]
> - + px[k][i] * (double)
> (m);
> - }
> - }
> - useflops = nflops * (double)(n * 625 * loop);
> -
> - }
> - if (which == 22)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - Checksum[section][22] = Checksum[section][22] + w[k]
> - * (double)(k+1);
> - }
> - useflops = nflops * (double)(n * loop);
> - }
> - if (which == 23)
> - {
> - for ( j=0 ; j<7 ; j++ )
> - {
> - for ( k=0 ; k<101 ; k++ )
> - {
> - m = 101 * j + k + 1;
> - if (m < 7 * n + 1)
> - {
> - Checksum[section][23] = Checksum[section]
> [23]
> - + za[j][k] *
> (double)(m);
> - }
> - }
> - }
> - useflops = nflops * (double)((n-1) * 5 * loop);
> - }
> - if (which == 24)
> - {
> - Checksum[section][24] = (double)(m24);
> - useflops = nflops * (double)((n - 1) * loop);
> - }
> -
> -
> -/
> ************************************************************************
> - * Deduct overheads from time, calculate MFLOPS, display
> results *
> -
> ************************************************************************/
> -
> - RunTime[section][which] = RunTime[section][which]
> - - (loop * extra_loops[section][which]) *
> overhead_l;
> - FPops[section][which] = useflops * extra_loops[section]
> [which];
> - Mflops[section][which] = FPops[section][which] / Scale
> - / RunTime[section]
> [which];
> -
> -
> -/
> ************************************************************************
> - * Compare sumcheck with standard result, calculate
> accuracy *
> -
> ************************************************************************/
> -
> - printf("%10.3f\n", Checksum[section][which]);
> -
> - }
> - else
> - {
> -/
> ************************************************************************
> - * Re-initialise data if
> reqired *
> -
> ************************************************************************/
> -
> - count = 0;
> - if (which == 2)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - x[k] = x0[k];
> - }
> - }
> - if (which == 4)
> - {
> - m = (1001-7)/2;
> - for ( k=6 ; k<1001 ; k=k+m )
> - {
> - x[k] = x0[k];
> - }
> - }
> - if (which == 5)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - x[k] = x0[k];
> - }
> - }
> - if (which == 6)
> - {
> - for ( k=0 ; k<n ; k++ )
> - {
> - w[k] = w0[k];
> - }
> - }
> - if (which == 10)
> - {
> - for ( i=0 ; i<n ; i++ )
> - {
> - for (j=4 ; j<13 ; j++ )
> - {
> - px[i][j] = px0[i][j];
> - }
> - }
> - }
> - if (which == 13)
> - {
> - for ( i=0 ; i<n ; i++ )
> - {
> - for (j=0 ; j<4 ; j++ )
> - {
> - p[i][j] = p0[i][j];
> - }
> - }
> - for ( i=0 ; i<64 ; i++ )
> - {
> - for (j=0 ; j<64 ; j++ )
> - {
> - h[i][j] = h0[i][j];
> - }
> - }
> - }
> - if (which == 14)
> - {
> - for ( i=0; i<n ; i++ )
> - {
> - rh[ir[i] - 1] = rh0[ir[i] - 1];
> - rh[ir[i] ] = rh0[ir[i] ];
> - }
> - }
> - if (which == 17)
> - {
> - for ( i=0; i<n ; i++ )
> - {
> - vxne[i] = vxne0[i];
> - }
> - }
> - if (which == 18)
> - {
> - for ( i=1 ; i<6 ; i++ )
> - {
> - for (j=1 ; j<n ; j++ )
> - {
> - zr[i][j] = zr0[i][j];
> - zu[i][j] = zu0[i][j];
> - zv[i][j] = zv0[i][j];
> - zz[i][j] = zz0[i][j];
> - }
> - }
> - }
> - if (which == 21)
> - {
> - for ( i=0 ; i<n ; i++ )
> - {
> - for (j=0 ; j<25 ; j++ )
> - {
> - px[i][j] = px0[i][j];
> - }
> - }
> - }
> - if (which == 23)
> - {
> - for ( i=1 ; i<6 ; i++ )
> - {
> - for (j=1 ; j<n ; j++ )
> - {
> - za[i][j] = za0[i][j];
> - }
> - }
> - }
> - k3 = k2 = 0;
> - stb5 = stb50;
> - xx[0] = xx0;
> -
> - }
> - }
> - return 0;
> -}
> -
> -/
> ************************************************************************
> - * init procedure - initialises data for all
> loops *
> -
> ************************************************************************/
> -
> - void init(long which)
> - {
> - long i, j, k, l, m, nn;
> - double ds, dw, rr, ss;
> - double fuzz, fizz, buzz, scaled, one;
> -
> - scaled = (double)(10.0);
> - scaled = (double)(1.0) / scaled;
> - fuzz = (double)(0.0012345);
> - buzz = (double)(1.0) + fuzz;
> - fizz = (double)(1.1) * fuzz;
> - one = (double)(1.0);
> -
> - for ( k=0 ; k<19977 + 34132 ; k++)
> - {
> - if (k == 19977)
> - {
> - fuzz = (double)(0.0012345);
> - buzz = (double) (1.0) + fuzz;
> - fizz = (double) (1.1) * fuzz;
> - }
> - buzz = (one - fuzz) * buzz + fuzz;
> - fuzz = - fuzz;
> - u[k] = (buzz - fizz) * scaled;
> - }
> -
> - fuzz = (double)(0.0012345);
> - buzz = (double) (1.0) + fuzz;
> - fizz = (double) (1.1) * fuzz;
> -
> - for ( k=1 ; k<40 ; k++)
> - {
> - buzz = (one - fuzz) * buzz + fuzz;
> - fuzz = - fuzz;
> - xtra[k] = (buzz - fizz) * scaled;
> - }
> -
> - ds = 1.0;
> - dw = 0.5;
> - for ( l=0 ; l<4 ; l++ )
> - {
> - for ( i=0 ; i<512 ; i++ )
> - {
> - p[i][l] = ds;
> - ds = ds + dw;
> - }
> - }
> - for ( i=0 ; i<96 ; i++ )
> - {
> - e[i] = 1;
> - f[i] = 1;
> - }
> -
> -
> - iqranf();
> - dw = -100.0;
> - for ( i=0; i<1001 ; i++ )
> - {
> - dex[i] = dw * dex[i];
> - grd[i] = ix[i];
> - }
> - flx = 0.001;
> -
> -
> - d[0]= 1.01980486428764;
> - nn = n16;
> -
> - for ( l=1 ; l<300 ; l++ )
> - {
> - d[l] = d[l-1] + 1.000e-4 / d[l-1];
> - }
> - rr = d[nn-1];
> - for ( l=1 ; l<=2 ; l++ )
> - {
> - m = (nn+nn)*(l-1);
> - for ( j=1 ; j<=2 ; j++ )
> - {
> - for ( k=1 ; k<=nn ; k++ )
> - {
> - m = m + 1;
> - ss = (double)(k);
> - plan[m-1] = rr * ((ss + 1.0) / ss);
> - zone[m-1] = k + k;
> - }
> - }
> - }
> - k = nn + nn + 1;
> - zone[k-1] = nn;
> -
> - if (which == 16)
> - {
> - r = d[n-1];
> - s = d[n-2];
> - t = d[n-3];
> - k3 = k2 = 0;
> - }
> - expmax = 20.0;
> - if (which == 22)
> - {
> - u[n-1] = 0.99*expmax*v[n-1];
> - }
> - if (which == 24)
> - {
> - x[n/2] = -1.0e+10;
> - }
> -
> -/
> ************************************************************************
> - * Make copies of data for extra
> loops *
> -
> ************************************************************************/
> -
> - for ( i=0; i<1001 ; i++ )
> - {
> - x0[i] = x[i];
> - w0[i] = w[i];
> - }
> - for ( i=0 ; i<101 ; i++ )
> - {
> - for (j=0 ; j<25 ; j++ )
> - {
> - px0[i][j] = px[i][j];
> - }
> - }
> - for ( i=0 ; i<512 ; i++ )
> - {
> - for (j=0 ; j<4 ; j++ )
> - {
> - p0[i][j] = p[i][j];
> - }
> - }
> - for ( i=0 ; i<64 ; i++ )
> - {
> - for (j=0 ; j<64 ; j++ )
> - {
> - h0[i][j] = h[i][j];
> - }
> - }
> - for ( i=0; i<2048 ; i++ )
> - {
> - rh0[i] = rh[i];
> - }
> - for ( i=0; i<101 ; i++ )
> - {
> - vxne0[i] = vxne[i];
> - }
> - for ( i=0 ; i<7 ; i++ )
> - {
> - for (j=0 ; j<101 ; j++ )
> - {
> - zr0[i][j] = zr[i][j];
> - zu0[i][j] = zu[i][j];
> - zv0[i][j] = zv[i][j];
> - zz0[i][j] = zz[i][j];
> - za0[i][j] = za[i][j];
> - }
> - }
> - stb50 = stb5;
> - xx0 = xx[0];
> -
> - return;
> - }
> -
> -/
> ************************************************************************
> - * parameters procedure for loop counts, Do spans, sumchecks,
> FLOPS *
> -
> ************************************************************************/
> -
> - long parameters(long which)
> - {
> -
> - long nloops[3][25] =
> - { {0, 1001, 101, 1001, 1001, 1001, 64, 995, 100,
> - 101, 101, 1001, 1000, 64, 1001, 101, 75,
> - 101, 100, 101, 1000, 101, 101, 100, 1001 },
> - {0, 101, 101, 101, 101, 101, 32, 101, 100,
> - 101, 101, 101, 100, 32, 101, 101, 40,
> - 101, 100, 101, 100, 50, 101, 100, 101 },
> - {0, 27, 15, 27, 27, 27, 8, 21, 14,
> - 15, 15, 27, 26, 8, 27, 15, 15,
> - 15, 14, 15, 26, 20, 15, 14, 27 } };
> -
> -
> -
> - long lpass[3][25] =
> - { {0, 7, 67, 9, 14, 10, 3, 4, 10, 36, 34, 11, 12,
> - 36, 2, 1, 25, 35, 2, 39, 1, 1, 11, 8, 5 },
> - {0, 40, 40, 53, 70, 55, 7, 22, 6, 21, 19, 64, 68,
> - 41, 10, 1, 27, 20, 1, 23, 8, 1, 7, 5,
> 31 },
> - {0, 28, 46, 37, 38, 40, 21, 20, 9, 26, 25, 46, 48,
> - 31, 8, 1, 14, 26, 2, 28, 7, 1, 8, 7,
> 23 } };
> -
> - double sums[3][25] =
> - {
> - { 0.0,
> - 5.114652693224671e+04, 1.539721811668385e+03,
> 1.000742883066363e+01,
> - 5.999250595473891e-01, 4.548871642387267e+03,
> 4.375116344729986e+03,
> - 6.104251075174761e+04, 1.501268005625798e+05,
> 1.189443609974981e+05,
> - 7.310369784325296e+04, 3.342910972650109e+07,
> 2.907141294167248e-05,
> - 1.202533961842803e+11, 3.165553044000334e+09,
> 3.943816690352042e+04,
> - 5.650760000000000e+05, 1.114641772902486e+03,
> 1.015727037502300e+05,
> - 5.421816960147207e+02, 3.040644339351239e+07,
> 1.597308280710199e+08,
> - 2.938604376566697e+02, 3.549900501563623e+04,
> 5.000000000000000e+02
> - },
> -
> - { 0.0,
> - 5.253344778937972e+02, 1.539721811668385e+03,
> 1.009741436578952e+00,
> - 5.999250595473891e-01, 4.589031939600982e+01,
> 8.631675645333210e+01,
> - 6.345586315784055e+02, 1.501268005625798e+05,
> 1.189443609974981e+05,
> - 7.310369784325296e+04, 3.433560407475758e+04,
> 7.127569130821465e-06,
> - 9.816387810944345e+10, 3.039983465145393e+07,
> 3.943816690352042e+04,
> - 6.480410000000000e+05, 1.114641772902486e+03,
> 1.015727037502300e+05,
> - 5.421816960147207e+02, 3.126205178815431e+04,
> 7.824524877232093e+07,
> - 2.938604376566697e+02, 3.549900501563623e+04,
> 5.000000000000000e+01
> - },
> -
> - { 0.0,
> - 3.855104502494961e+01, 3.953296986903059e+01,
> 2.699309089320672e-01,
> - 5.999250595473891e-01, 3.182615248447483e+00,
> 1.120309393467088e+00,
> - 2.845720217644024e+01, 2.960543667875003e+03,
> 2.623968460874250e+03,
> - 1.651291227698265e+03, 6.551161335845770e+02,
> 1.943435981130448e-06,
> - 3.847124199949426e+10, 2.923540598672011e+06,
> 1.108997288134785e+03,
> - 5.152160000000000e+05, 2.947368618589360e+01,
> 9.700646212337040e+02,
> - 1.268230698051003e+01, 5.987713249475302e+02,
> 5.009945671204667e+07,
> - 6.109968728263972e+00, 4.850340602749970e+02,
> 1.300000000000000e+01
> - } };
> -
> -
> -
> - double number_flops[25] = {0, 5., 4., 2., 2., 2., 2., 16.,
> 36., 17.,
> - 9., 1., 1., 7., 11., 33.,10.,
> 9., 44.,
> - 6., 26., 2., 17., 11., 1.};
> - double now = 1.0;
> -
> -
> - n = nloops[section][which];
> - nspan[section][which] = n;
> - n16 = nloops[section][16];
> - nflops = number_flops[which];
> - xflops[which] = nflops;
> - loop = lpass[section][which];
> - xloops[section][which] = loop;
> - loop = loop * mult;
> - MasterSum = sums[section][which];
> - count = 0;
> -
> - init(which);
> -
> -
> - return 0;
> - }
> -
> -/
> ************************************************************************
> - * check procedure to check accuracy of
> calculations *
> -
> ************************************************************************/
> -
> - void check(long which)
> - {
> - long maxs = 16;
> - double xm, ym, re, min1, max1;
> -
> - xm = MasterSum;
> - ym = Checksum[section][which];
> -
> - if (xm * ym < 0.0)
> - {
> - accuracy[section][which] = 0;
> - }
> - else
> - {
> - if ( xm == ym)
> - {
> - accuracy[section][which] = maxs;
> - }
> - else
> - {
> - xm = fabs(xm);
> - ym = fabs(ym);
> - min1 = xm;
> - max1 = ym;
> - if (ym < xm)
> - {
> - min1 = ym;
> - max1 = xm;
> - }
> - re = 1.0 - min1 / max1;
> - accuracy[section][which] =
> - (long)
> ( fabs(log10(fabs(re))) + 0.5);
> - }
> - }
> -
> - return;
> - }
> -
> -/
> ************************************************************************
> - * iqranf procedure - random number generator for Kernel
> 14 *
> -
> ************************************************************************/
> -
> - void iqranf()
> - {
> -
> - long inset, Mmin, Mmax, nn, i, kk;
> - double span, spin, realn, per, scale1, qq, dkk, dp, dq;
> - long seed[3] = { 256, 12491249, 1499352848 };
> -
> - nn = 1001;
> - Mmin = 1;
> - Mmax = 1001;
> - kk = seed[section];
> -
> - inset= Mmin;
> - span= Mmax - Mmin;
> - spin= 16807;
> - per= 2147483647;
> - realn= nn;
> - scale1= 1.00001;
> - qq= scale1 * (span / realn);
> - dkk= kk;
> -
> - for ( i=0 ; i<nn ; i++)
> - {
> - dp= dkk*spin;
> - dkk= dp - (long)( dp/per)*per;
> - dq= dkk*span;
> - ix[i] = inset + ( dq/ per);
> - if (ix[i] < Mmin | ix[i] > Mmax)
> - {
> - ix[i] = inset + i + 1 * qq;
> - }
> - }
> -
> - return;
> - }
> -
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/pkcs7-signature
Size: 2624 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20081201/066fdba5/attachment.bin>
More information about the llvm-commits
mailing list