<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Wed, Aug 19, 2015 at 3:07 PM, Jack Howarth <span dir="ltr"><<a href="mailto:howarth.mailing.lists@gmail.com" target="_blank">howarth.mailing.lists@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Tue, Aug 18, 2015 at 2:14 PM, César via Openmp-dev<br>
<<a href="mailto:openmp-dev@lists.llvm.org">openmp-dev@lists.llvm.org</a>> wrote:<br>
> Hello,<br>
><br>
> I don't know if this is the correct list to talk about this - I did not find<br>
> a better place.<br>
><br>
> I am doing performance experiments with a few OpenMP implementations (IOMP,<br>
> GOMP and our private impl.) and I am seeing a severe slowdown when I use<br>
> IOMP (GOMP and others are performing well).<br>
><br>
> The benchmarks I am using are these ones:<br>
> <a href="http://kastors.gforge.inria.fr/#!index.md" rel="noreferrer" target="_blank">http://kastors.gforge.inria.fr/#!index.md</a><br>
<br>
</span>That web page claims the benchmarks use parts of the OpenMP 4.0 specification.<br>
<br>
"The KaStORS benchmark suite has been designed to evaluate the implementation of<br>
the OpenMP dependent task paradigm, introduced as part of the OpenMP 4.0<br>
specification."<br>
<br>
Currently openmp is only complete for the OpenMP 3.1 specification.
<span class="im HOEnZb"><br></span></blockquote><div><br></div><div>I am able to compile a few benchmarks that use task dependence annotations (from OMP 4.0), but for those that specify the range of the memory dependence I get a syntax error. Should I assume that this part is not implemented? Is there a list of the OMP 4.0 items that are currently supported?<br></div><div><br></div><div>BTW, the Clang version from GitHub was able to parse these annotations. Was that support dropped from the current version?</div><div><br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="im HOEnZb">
><br>
> Really, the slowdown is huge. For one of the programs (plasma/dpotrf_taskdep<br>
> -n 8192 -b 64 -i 1 -c) the serial version executes in ~28s and the parallel<br>
> one executes in ~110s. I did some profiling and found that most of the time<br>
> is being spent on synchronization barriers and dependence tracking (see<br>
> attached image). Before digging deeper I would like to hear back from you if<br>
> I am doing something wrong here:<br>
><br>
> - I tested with the latest version of the repository:<br>
> <a href="http://llvm.org/svn/llvm-project/openmp/trunk" rel="noreferrer" target="_blank">http://llvm.org/svn/llvm-project/openmp/trunk</a><br>
> - I am using Ubuntu 14.10.<br>
> - I have tested on more than one machine; the results above are from an Intel<br>
> i7-3770.<br>
> - The runtime itself is compiled using: make compiler=gcc os_omp=linux<br>
> arch=32e<br>
> - The version of GCC that I am using is: 4.9.1<br>
> - The version of Clang that I am using to compile the benchmarks: 3.5.0<br>
><br>
><br>
> César.<br>
><br>
</span><div class="HOEnZb"><div class="h5">> _______________________________________________<br>
> Openmp-dev mailing list<br>
> <a href="mailto:Openmp-dev@lists.llvm.org">Openmp-dev@lists.llvm.org</a><br>
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-dev</a><br>
><br>
</div></div></blockquote></div><br></div></div>