[Openmp-dev] Performance slowdown

Peyton, Jonathan L via Openmp-dev openmp-dev at lists.llvm.org
Wed Aug 19 09:37:20 PDT 2015


Currently to use libomp, the flag is: -fopenmp=libomp

-- Johnny

From: Openmp-dev [mailto:openmp-dev-bounces at lists.llvm.org] On Behalf Of César via Openmp-dev
Sent: Wednesday, August 19, 2015 11:30 AM
To: andreybokhanko
Cc: openmp-dev at lists.llvm.org
Subject: Re: [Openmp-dev] Performance slowdown

On Wed, Aug 19, 2015 at 12:59 PM, andreybokhanko <andreybokhanko at gmail.com> wrote:
Indeed, I meant the officially released 3.5. Did you get your compiler from clang-omp.github? It's probably outdated and can't be used for reliable performance measurements.

I will update clang-omp.github home page to avoid further confusion.

Yes, that was exactly what happened! Thanks Alexey/Andrey.

I have just downloaded clang-3.8 (trunk) and started some experiments. However, I see that clang is trying to link with "libgomp" (GNU) and not "libomp" (Intel). Is this really the intended default behavior? How can I tell clang to use the Intel OMP?




Yours,
Andrey

On Aug 19, 2015, at 8:17, Bataev, Alexey <a.bataev at hotmail.com> wrote:
If you're using 3.5, then this is an unofficial version with OpenMP. Use 3.7 instead, just like Andrey said.


Best regards,

Alexey Bataev

=============

Software Engineer

Intel Compiler Team
On 19.08.2015 1:25, César via Openmp-dev wrote:
Hi Andrey,

this is strange because when I compile with "clang-3.5 -fopenmp" the executable that is produced is parallel. I am sure of this because I'm able to see the threads and also because I can see the symbols used by the IOMP runtime in the binary.

$ clang -O3 -g -fopenmp toy13.cpp -o toy13 -lm

$ nm toy13 | grep kmpc
U __kmpc_cancel_barrier@@VERSION
U __kmpc_end_single@@VERSION
U __kmpc_fork_call@@VERSION
U __kmpc_omp_task_alloc@@VERSION
U __kmpc_omp_task_with_deps@@VERSION
U __kmpc_single@@VERSION

$ ldd toy13
linux-vdso.so.1 =>  (0x00007fff9805d000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fc00e3cc000)
libiomp5.so => /usr/lib/libiomp5.so (0x00007fc00e121000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fc00df03000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fc00db3e000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fc00d939000)
/lib64/ld-linux-x86-64.so.2 (0x00007fc00e6fc000)





César.

On Tue, Aug 18, 2015 at 6:15 PM, <andreybokhanko at gmail.com> wrote:
César,

- The version of Clang that I am using to compile the benchmarks: 3.5.0

Clang 3.5 doesn't support OpenMP -- it simply ignores the pragmas.

Please use the version from trunk or from the 3_7 release branch. Also, please supply the -fopenmp=libomp option.

Yours,
Andrey Bokhanko
=============
Software Engineer
Intel Compiler Team
Intel

Sent from iPad

On Aug 18, 2015, at 21:14, César via Openmp-dev <openmp-dev at lists.llvm.org> wrote:
Hello,

I don't know if this is the correct list to talk about this; I did not find a better place.

I am doing performance experiments with a few OpenMP implementations (IOMP, GOMP and our private impl.) and I am seeing a severe slowdown when I use IOMP (GOMP and others are performing well).

The benchmarks I am using are these ones: http://kastors.gforge.inria.fr/#!index.md

Really, the slowdown is huge. For one of the programs (plasma/dpotrf_taskdep -n 8192 -b 64 -i 1 -c) the serial version executes in ~28s and the parallel one executes in ~110s. I did some profiling and found that most of the time is being spent on synchronization barriers and dependence tracking (see attached image). Before digging deeper I would like to hear back from you if I am doing something wrong here:

- I tested with the last version of the repository:  http://llvm.org/svn/llvm-project/openmp/trunk
- I am using Ubuntu 14.10.
- I have tested on more than one machine; the results above are from an Intel i7-3770
- The runtime itself is compiled using: make compiler=gcc os_omp=linux arch=32e
- The version of GCC that I am using is: 4.9.1
- The version of Clang that I am using to compile the benchmarks: 3.5.0


César.
<pic1.png>
<pic2.png>
<pic3.png>
_______________________________________________
Openmp-dev mailing list
Openmp-dev at lists.llvm.org<mailto:Openmp-dev at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev




