[Openmp-dev] initial clang-omp/openmp benchmarking

Cownie, James H james.h.cownie at intel.com
Thu May 29 04:45:39 PDT 2014


I don’t really understand what problem you are complaining about.
Your numbers show clang-omp as the fastest implementation in all directly comparable cases. That doesn’t seem like something we want to change!

-- Jim

James Cownie <james.h.cownie at intel.com>
SSG/DPD/TCAR (Technical Computing, Analyzers and Runtimes)
Tel: +44 117 9071438

From: Jack Howarth [mailto:howarth.mailing.lists at gmail.com]
Sent: Wednesday, May 28, 2014 8:20 PM
To: Cownie, James H
Cc: openmp-dev at dcs-maillist2.engr.illinois.edu
Subject: Re: [Openmp-dev] initial clang-omp/openmp benchmarking



On Wed, May 28, 2014 at 10:24 AM, Cownie, James H <james.h.cownie at intel.com> wrote:
Sorry, I read your description of the problem
“While the results for iomp5 are far better on darwin than those for gomp on darwin, we still are lagging behind the performance of gomp using futex on linux.”
as being that libiomp5.so was underperforming on Linux because we’re not using futex there, so I was explaining how we could do that.

I now grok that what you’re saying is that you’d like to see performance on Darwin (without futexes) that is faster than on Linux (with futexes).
So I suggest trying the TAS (test-and-set) lock on Darwin, by setting KMP_LOCK_KIND=tas.
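For readers unfamiliar with the term, a test-and-set lock can be sketched in C11 roughly as below. This only illustrates the general technique, assuming C11 atomics are available; the tas_lock_* names are hypothetical and this is not the runtime's actual code. Unlike a futex-based lock, where a waiting thread can sleep in the kernel, a waiter here simply spins in user space.

```c
#include <stdatomic.h>

/* Illustrative test-and-set spin lock (hypothetical names; not the
   OpenMP runtime's implementation). */
typedef struct {
    atomic_flag held;   /* set while some thread owns the lock */
} tas_lock_t;

static void tas_lock_init(tas_lock_t *l) {
    atomic_flag_clear(&l->held);
}

/* Spin until the flag was previously clear, i.e. until we are the
   thread that changed it from clear to set. */
static void tas_lock_acquire(tas_lock_t *l) {
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* busy-wait: another thread holds the lock */
    }
}

/* Single attempt: returns 1 if the lock was taken, 0 if it was busy. */
static int tas_lock_try_acquire(tas_lock_t *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

static void tas_lock_release(tas_lock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Spinning like this is cheap when the lock is almost always free, and wasteful when many threads wait on it for long periods.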

Not necessarily faster on darwin but at least equal in performance to linux. FYI, I posted the raw timings for this test case on both darwin and linux…

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61333#c13
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61333#c14

A cursory examination suggests that the ratios of the one-process to four-process OMP timings are pretty much identical between the clang-omp and FSF gcc compilers on linux, but the darwin ratios for clang-omp are about 5% lower than those with futex on linux, and even worse for gomp on darwin.
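To make the comparison concrete, the ratio in question can be computed along these lines. This is a minimal sketch: the function names are hypothetical, and the inputs are whatever wall-clock times one takes from the raw timings linked above; no measured values are assumed here.

```c
/* Ratio of the one-process wall time to the n-process wall time
   (i.e. the speedup obtained from running with n processes). */
static double scaling_ratio(double t_one, double t_n) {
    return t_one / t_n;
}

/* Difference of ratio_a from ratio_b, as a percentage of ratio_b.
   A negative result means ratio_a is lower than ratio_b. */
static double percent_diff(double ratio_a, double ratio_b) {
    return 100.0 * (ratio_a - ratio_b) / ratio_b;
}
```

One would call scaling_ratio() once per compiler and platform with the one- and four-process times, then use percent_diff() to compare, say, the darwin clang-omp ratio against the linux futex ratio.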

Depending on what you think OpenMP is used for, though, locks may be irrelevant. If you look at the latest SPECOMP codes, there are none that use locks (down from the previous version that had a couple).

In HPC, locks should be rare, and heavily contended locks absent completely (because if there are heavily contended locks in a significant part of the code, it won’t perform well anyway, so it doesn’t qualify for the “HPC” name ☺).
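As an illustration of the point about contended locks, here is a small sketch of the same sum written two ways. It uses an OpenMP critical section as a stand-in for an explicit lock, and the function names are hypothetical. Built without OpenMP support the pragmas are ignored and both versions simply run serially.

```c
/* Sum 0..n-1 with every update serialized through a single critical
   section: all threads contend for the same lock on every iteration. */
static long sum_with_contended_lock(long n) {
    long sum = 0;
    #pragma omp parallel for
    for (long i = 0; i < n; i++) {
        #pragma omp critical
        sum += i;
    }
    return sum;
}

/* Same sum using a reduction: each thread accumulates into a private
   copy and the partial sums are combined at the end of the loop, so
   the loop body takes no lock at all. */
static long sum_with_reduction(long n) {
    long sum = 0;
    #pragma omp parallel for reduction(+:sum)
    for (long i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}
```

Both functions return the same result; the first is the pattern that would not be expected to scale, whichever lock implementation sits underneath it.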

-- Jim

