[Openmp-dev] [LLVMdev] libiomp, not libgomp as default library linked with -fopenmp

Jack Howarth howarth.mailing.lists at gmail.com
Wed May 6 16:48:31 PDT 2015


Andrey,
     An initial attempt at benchmarking the performance of GraphicsMagick
1.3.19 on x86_64-apple-darwin14, built at various optimization levels with
OpenMP support enabled using gcc 5.1.0 or clang svn at r236592 with...

http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150504/128555.html
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150504/128561.html
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20150504/128567.html

produced the following results.

gcc 5.1 -O3

% gm benchmark -stepthreads 1 -duration 10 convert -size 2048x1080 pattern:granite -operator all Noise-Gaussian 30% null:
Results: 1 threads 14 iter 10.76s user 10.76s total 1.301 iter/s 1.301 iter/cpu 1.00 speedup 1.000 karp-flatt
Results: 2 threads 25 iter 19.75s user 10.27s total 2.434 iter/s 1.266 iter/cpu 1.87 speedup 0.069 karp-flatt
Results: 3 threads 36 iter 28.74s user 10.04s total 3.586 iter/s 1.253 iter/cpu 2.76 speedup 0.044 karp-flatt
Results: 4 threads 48 iter 38.54s user 10.21s total 4.701 iter/s 1.245 iter/cpu 3.61 speedup 0.036 karp-flatt
Results: 5 threads 58 iter 46.71s user 10.04s total 5.777 iter/s 1.242 iter/cpu 4.44 speedup 0.032 karp-flatt
Results: 6 threads 69 iter 55.76s user 10.14s total 6.805 iter/s 1.237 iter/cpu 5.23 speedup 0.029 karp-flatt
Results: 7 threads 78 iter 63.16s user 10.01s total 7.792 iter/s 1.235 iter/cpu 5.99 speedup 0.028 karp-flatt
Results: 8 threads 88 iter 71.33s user 10.02s total 8.782 iter/s 1.234 iter/cpu 6.75 speedup 0.026 karp-flatt

clang 3.7svn -O3

% gm benchmark -stepthreads 1 -duration 10 convert -size 2048x1080 pattern:granite -operator all Noise-Gaussian 30% null:
Results: 1 threads 19 iter 10.42s user 10.41s total 1.825 iter/s 1.823 iter/cpu 1.00 speedup 1.000 karp-flatt
Results: 2 threads 36 iter 20.15s user 10.08s total 3.571 iter/s 1.787 iter/cpu 1.96 speedup 0.022 karp-flatt
Results: 3 threads 53 iter 30.45s user 10.15s total 5.222 iter/s 1.741 iter/cpu 2.86 speedup 0.024 karp-flatt
Results: 4 threads 68 iter 39.96s user 10.00s total 6.800 iter/s 1.702 iter/cpu 3.73 speedup 0.025 karp-flatt
Results: 5 threads 83 iter 50.18s user 10.04s total 8.267 iter/s 1.654 iter/cpu 4.53 speedup 0.026 karp-flatt
Results: 6 threads 97 iter 59.97s user 10.01s total 9.690 iter/s 1.617 iter/cpu 5.31 speedup 0.026 karp-flatt
Results: 7 threads 111 iter 70.37s user 10.06s total 11.034 iter/s 1.577 iter/cpu 6.05 speedup 0.026 karp-flatt
Results: 8 threads 124 iter 79.95s user 10.04s total 12.351 iter/s 1.551 iter/cpu 6.77 speedup 0.026 karp-flatt

gcc 5.1 -O2

% gm benchmark -stepthreads 1 -duration 10 convert -size 2048x1080 pattern:granite -operator all Noise-Gaussian 30% null:
Results: 1 threads 13 iter 10.04s user 10.04s total 1.295 iter/s 1.295 iter/cpu 1.00 speedup 1.000 karp-flatt
Results: 2 threads 25 iter 19.86s user 10.32s total 2.422 iter/s 1.259 iter/cpu 1.87 speedup 0.069 karp-flatt
Results: 3 threads 36 iter 28.87s user 10.08s total 3.571 iter/s 1.247 iter/cpu 2.76 speedup 0.044 karp-flatt
Results: 4 threads 47 iter 37.84s user 10.03s total 4.686 iter/s 1.242 iter/cpu 3.62 speedup 0.035 karp-flatt
Results: 5 threads 58 iter 46.84s user 10.09s total 5.748 iter/s 1.238 iter/cpu 4.44 speedup 0.032 karp-flatt
Results: 6 threads 68 iter 55.06s user 10.02s total 6.786 iter/s 1.235 iter/cpu 5.24 speedup 0.029 karp-flatt
Results: 7 threads 78 iter 63.28s user 10.05s total 7.761 iter/s 1.233 iter/cpu 5.99 speedup 0.028 karp-flatt
Results: 8 threads 88 iter 71.48s user 10.02s total 8.782 iter/s 1.231 iter/cpu 6.78 speedup 0.026 karp-flatt

clang 3.7svn -O2

% gm benchmark -stepthreads 1 -duration 10 convert -size 2048x1080 pattern:granite -operator all Noise-Gaussian 30% null:
Results: 1 threads 19 iter 10.36s user 10.35s total 1.836 iter/s 1.834 iter/cpu 1.00 speedup 1.000 karp-flatt
Results: 2 threads 32 iter 20.63s user 10.31s total 3.104 iter/s 1.551 iter/cpu 1.69 speedup 0.183 karp-flatt
Results: 3 threads 46 iter 30.29s user 10.10s total 4.554 iter/s 1.519 iter/cpu 2.48 speedup 0.105 karp-flatt
Results: 4 threads 60 iter 40.36s user 10.09s total 5.946 iter/s 1.487 iter/cpu 3.24 speedup 0.078 karp-flatt
Results: 5 threads 73 iter 50.25s user 10.05s total 7.264 iter/s 1.453 iter/cpu 3.96 speedup 0.066 karp-flatt
Results: 6 threads 86 iter 60.44s user 10.08s total 8.532 iter/s 1.423 iter/cpu 4.65 speedup 0.058 karp-flatt
Results: 7 threads 98 iter 70.47s user 10.08s total 9.722 iter/s 1.391 iter/cpu 5.30 speedup 0.054 karp-flatt
Results: 8 threads 109 iter 79.59s user 10.02s total 10.878 iter/s 1.370 iter/cpu 5.93 speedup 0.050 karp-flatt

gcc 5.1 -Os

% gm benchmark -stepthreads 1 -duration 10 convert -size 2048x1080 pattern:granite -operator all Noise-Gaussian 30% null:
Results: 1 threads 12 iter 10.29s user 10.29s total 1.166 iter/s 1.166 iter/cpu 1.00 speedup 1.000 karp-flatt
Results: 2 threads 23 iter 19.56s user 10.00s total 2.300 iter/s 1.176 iter/cpu 1.97 speedup 0.014 karp-flatt
Results: 3 threads 35 iter 29.68s user 10.27s total 3.408 iter/s 1.179 iter/cpu 2.92 speedup 0.013 karp-flatt
Results: 4 threads 45 iter 38.14s user 10.04s total 4.482 iter/s 1.180 iter/cpu 3.84 speedup 0.014 karp-flatt
Results: 5 threads 56 iter 47.43s user 10.11s total 5.539 iter/s 1.181 iter/cpu 4.75 speedup 0.013 karp-flatt
Results: 6 threads 66 iter 55.89s user 10.06s total 6.561 iter/s 1.181 iter/cpu 5.63 speedup 0.013 karp-flatt
Results: 7 threads 76 iter 64.39s user 10.11s total 7.517 iter/s 1.180 iter/cpu 6.45 speedup 0.014 karp-flatt
Results: 8 threads 86 iter 72.90s user 10.11s total 8.506 iter/s 1.180 iter/cpu 7.29 speedup 0.014 karp-flatt

clang 3.7svn -Os

% gm benchmark -stepthreads 1 -duration 10 convert -size 2048x1080 pattern:granite -operator all Noise-Gaussian 30% null:
Results: 1 threads 19 iter 10.36s user 10.36s total 1.834 iter/s 1.834 iter/cpu 1.00 speedup 1.000 karp-flatt
Results: 2 threads 36 iter 20.50s user 10.25s total 3.512 iter/s 1.756 iter/cpu 1.92 speedup 0.044 karp-flatt
Results: 3 threads 52 iter 30.30s user 10.11s total 5.143 iter/s 1.716 iter/cpu 2.80 speedup 0.035 karp-flatt
Results: 4 threads 67 iter 40.12s user 10.03s total 6.680 iter/s 1.670 iter/cpu 3.64 speedup 0.033 karp-flatt
Results: 5 threads 82 iter 50.25s user 10.06s total 8.151 iter/s 1.632 iter/cpu 4.44 speedup 0.031 karp-flatt
Results: 6 threads 96 iter 60.23s user 10.04s total 9.562 iter/s 1.594 iter/cpu 5.21 speedup 0.030 karp-flatt
Results: 7 threads 109 iter 70.12s user 10.03s total 10.867 iter/s 1.554 iter/cpu 5.93 speedup 0.030 karp-flatt
Results: 8 threads 122 iter 79.82s user 10.03s total 12.164 iter/s 1.528 iter/cpu 6.63 speedup 0.029 karp-flatt

The benchmark procedure is described at
http://www.graphicsmagick.org/OpenMP.html. Interpreting the results is not
straightforward, since the best outcome combines the highest iter/cpu with
the highest speedup. At -O3, clang 3.7svn beats gcc 5.1 on both metrics,
although the speedup margin shrinks to 6.77 vs. 6.75 at 8 threads. At -O2
and -Os, iter/cpu is always higher for clang 3.7svn, but speedup is higher
for gcc 5.1.
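
For anyone who wants to check the derived columns, here is a minimal
sketch in Python. It assumes that speedup is iter/s at N threads divided by
iter/s at one thread, and that karp-flatt is the usual Karp-Flatt serial
fraction, e = (1/s - 1/p) / (1 - 1/p). The function names are illustrative
only and are not part of GraphicsMagick. The 1-thread row, which gm reports
as 1.000, is skipped because the formula is undefined for one thread.

```python
def speedup(iter_per_s_n, iter_per_s_1):
    """Throughput at N threads relative to the single-thread run."""
    return iter_per_s_n / iter_per_s_1


def karp_flatt(s, threads):
    """Karp-Flatt serial fraction e = (1/s - 1/p) / (1 - 1/p), for p > 1."""
    if threads < 2:
        raise ValueError("Karp-Flatt metric needs at least 2 threads")
    return (1.0 / s - 1.0 / threads) / (1.0 - 1.0 / threads)


# iter/s values copied from the gcc 5.1 -O3 run above (subset of rows).
gcc_o3_iter_per_s = {1: 1.301, 2: 2.434, 4: 4.701, 8: 8.782}

base = gcc_o3_iter_per_s[1]
for threads, rate in gcc_o3_iter_per_s.items():
    s = speedup(rate, base)
    if threads == 1:
        print(f"{threads} threads: speedup {s:.2f}")
    else:
        e = karp_flatt(s, threads)
        print(f"{threads} threads: speedup {s:.2f}, karp-flatt {e:.3f}")
```

Under these assumptions the values agree with the speedup and karp-flatt
columns for the rows used here. A lower karp-flatt value means less of the
run behaves as serial work or parallel overhead.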
                Jack


On Wed, May 6, 2015 at 5:41 AM, Andrey Bokhanko <andreybokhanko at gmail.com>
wrote:

> Jack,
>
> Thank you for all the testing efforts! -- they are really appreciated
> and in my eyes one of the best contributions to the overall OMP
> development effort.
>
> Keep up the good work!
>
> Andrey
>
>
> On Mon, May 4, 2015 at 1:02 AM, Jack Howarth
> <howarth.mailing.lists at gmail.com> wrote:
> > A couple more data points. Current llvm 3.7svn with the two outstanding
> > OPENMP patches can build the openmp support in gdl 0.9.5 (which completely
> > passes its test suite) and apbs 1.4.1's limited openmp support.
> >
> > On Sat, May 2, 2015 at 11:11 PM, Jack Howarth
> > <howarth.mailing.lists at gmail.com> wrote:
> >>
> >>     On a positive note, current llvm 3.7svn with the two outstanding
> >> OPENMP patches applied builds the openmp support in gromacs 5.0.4 and the
> >> resulting build fully passes the gromacs regression test suite. Tested on
> >> x86_64-apple-darwin14.
> >
> >
>