[LLVMdev] -msse3 can degrade performance
Jon Harrop
jon at ffconsultancy.com
Fri Jan 30 17:43:30 PST 2009
I just remembered an anomalous result that I stumbled upon whilst tweaking the
command-line options to llvm-gcc. Specifically, the -msse3 flag does a great
job of improving the performance of floating-point-intensive code on the
SciMark2 benchmark, but it also degrades the performance of the
integer-intensive Monte Carlo part of the test:
$ llvm-gcc -Wall -lm -O3 *.c -o scimark2
$ ./scimark2
Using 2.00 seconds min time per kenel.
Composite Score: 432.84
FFT Mflops: 358.90 (N=1024)
SOR Mflops: 473.45 (100 x 100)
MonteCarlo: Mflops: 210.54
Sparse matmult Mflops: 354.25 (N=1000, nz=5000)
LU Mflops: 767.04 (M=100, N=100)
$ llvm-gcc -Wall -lm -O3 -msse3 *.c -o scimark2
$ ./scimark2
Composite Score: 548.53
FFT Mflops: 609.87 (N=1024)
SOR Mflops: 497.92 (100 x 100)
MonteCarlo: Mflops: 126.62
Sparse matmult Mflops: 604.02 (N=1000, nz=5000)
LU Mflops: 904.19 (M=100, N=100)
The relevant code is:
double Random_nextDouble(Random R)
{
    int k;
    int I = R->i;
    int J = R->j;
    int *m = R->m;

    k = m[I] - m[J];
    if (k < 0) k += m1;
    R->m[J] = k;

    if (I == 0)
        I = 16;
    else I--;
    R->i = I;

    if (J == 0)
        J = 16;
    else J--;
    R->j = J;

    if (R->haveRange)
        return R->left + dm1 * (double) k * R->width;
    else
        return dm1 * (double) k;
}
double MonteCarlo_integrate(int Num_samples)
{
    Random R = new_Random_seed(SEED);
    int under_curve = 0;
    int count;

    for (count=0; count<Num_samples; count++)
    {
        double x= Random_nextDouble(R);
        double y= Random_nextDouble(R);

        if ( x*x + y*y <= 1.0)
            under_curve ++;
    }

    Random_delete(R);
    return ((double) under_curve / Num_samples) * 4.0;
}
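
For anyone who wants to look at this outside the full benchmark, here is a
minimal standalone sketch of the same kernel. It is not the SciMark2 source:
the names Rng, rng_seed, rng_next and mc_integrate are made up for
illustration, the seeding is a simple LCG fill that I am assuming is good
enough for timing purposes, and M1 and DM1 are assumed to correspond to the
benchmark's m1 and dm1 (taken here as 2^31 - 1 and its reciprocal). The
haveRange branch is left out so that the loop contains only the integer
update, the int-to-double conversion and the x*x + y*y test.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for SciMark2's m1 and dm1. */
#define M1  2147483647
#define DM1 (1.0 / (double) M1)

typedef struct {
    int m[17];  /* table of the subtractive lagged generator */
    int i, j;   /* the two positions walked backwards through m[] */
} Rng;

/* Hypothetical seeding: fill the table from a 32-bit LCG.
   This is NOT SciMark2's initialisation routine. */
static void rng_seed(Rng *r, uint32_t seed)
{
    int n;
    for (n = 0; n < 17; n++) {
        seed = seed * 1664525u + 1013904223u;
        r->m[n] = (int) ((seed >> 1) % (uint32_t) M1);  /* 0 .. M1-1 */
    }
    r->i = 4;
    r->j = 16;
}

/* Same integer update as Random_nextDouble, minus the haveRange branch. */
static double rng_next(Rng *r)
{
    int k = r->m[r->i] - r->m[r->j];
    if (k < 0) k += M1;
    r->m[r->j] = k;
    r->i = (r->i == 0) ? 16 : r->i - 1;
    r->j = (r->j == 0) ? 16 : r->j - 1;
    return DM1 * (double) k;  /* int-to-double conversion on every call */
}

/* Estimate pi by counting random points that fall inside the unit circle. */
double mc_integrate(int num_samples, uint32_t seed)
{
    Rng r;
    int under_curve = 0;
    int count;

    assert(num_samples > 0);
    rng_seed(&r, seed);
    for (count = 0; count < num_samples; count++) {
        double x = rng_next(&r);
        double y = rng_next(&r);
        if (x * x + y * y <= 1.0)
            under_curve++;
    }
    return ((double) under_curve / num_samples) * 4.0;
}
```

Calling mc_integrate with a large sample count from a small main, and timing
it once built with -O3 and once with -O3 -msse3, should show whether the
slowdown survives outside the benchmark harness. Comparing the assembly that
-S emits for rng_next in the two builds would be the obvious next step.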
--
Dr Jon Harrop, Flying Frog Consultancy Ltd.
http://www.ffconsultancy.com/?e