<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Dec 8, 2017 at 2:34 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000"><div><div class="h5">

    <p><br>

    </p>

    <div class="m_-4467660861384652085moz-cite-prefix">On 12/08/2017 03:55 PM, Jeff Hammond

      wrote:<br>

    </div>

    <blockquote type="cite">

      <div dir="ltr"><br>

        <div class="gmail_extra"><br>

          <div class="gmail_quote">On Fri, Dec 8, 2017 at 1:13 PM, Hal

            Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span>

            wrote:<br>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

              <div bgcolor="#FFFFFF"><span class="m_-4467660861384652085gmail-">

                  <p><br>

                  </p>

                  <div class="m_-4467660861384652085gmail-m_-2288114881520531127moz-cite-prefix">On

                    12/07/2017 11:35 AM, Jeff Hammond wrote:<br>

                  </div>

                  <blockquote type="cite">

                    <div><br>

                      <div class="gmail_quote">

                        <div dir="auto">On Wed, Dec 6, 2017 at 8:57 PM

                          Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>

                          wrote:<br>

                        </div>

                        <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                          <div bgcolor="#FFFFFF">

                            <p><br>

                            </p>

                            <div class="m_-4467660861384652085gmail-m_-2288114881520531127m_8513274869410520852moz-cite-prefix">On

                              12/06/2017 10:23 PM, Jeff Hammond wrote:<br>

                            </div>

                            <blockquote type="cite">

                              <div><br>

                                <div class="gmail_quote">

                                  <div dir="auto">On Wed, Dec 6, 2017 at

                                    4:23 PM Hal Finkel <<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>>

                                    wrote:<br>

                                  </div>

                                  <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                                    <div bgcolor="#FFFFFF">

                                      <p><br>

                                      </p>

                                      <div class="m_-4467660861384652085gmail-m_-2288114881520531127m_8513274869410520852m_2065056468622040889moz-cite-prefix">On

                                        12/04/2017 10:48 PM, Serge Preis

                                        via cfe-dev wrote:<br>

                                      </div>

                                      <blockquote type="cite">

                                        <div>I agree that guarantees

                                          provided by ICC may be

                                          stronger than with other

                                          compilers, so yes, under

                                          OpenMP terms vectorization is

                                          permitted and cannot be

                                          assumed. However, OpenMP

                                          clearly defines the semantics of

                                          variables used within an OpenMP

                                          region, some being

                                          shared(scalar), some

                                          private(vector) and some being

                                          inductions. This goes far

                                          beyond typical compiler

                                          specific pragmas about

                                          dependencies and cost

                                          modelling and makes

                                          vectorization a much simpler

                                          task with more predictable and

                                          robust results if properly

                                          implemented (admittedly, even

                                          ICC implementation is far from

                                          perfect). I hope Intel's

                                          efforts to standardize

                                          something like this in core C++

                                          will eventually come to

                                          fruition. Until then I as a

                                          regular application developer

                                          would appreciate OpenMP-simd

                                          based execution policy (hoping

                                          for good support for OpenMP

                                          SIMD in clang), but it

                                          shouldn't necessarily be part of

                                          libc++. Since 'unordered'

                                          execution policy is currently

                                          not part of C++ standard </div>

                                      </blockquote>

                                      <br>

                                    </div>

                                    <div bgcolor="#FFFFFF">

                                      std::execution::par_unseq is part

                                      of C++17, and that essentially

                                      maps to '#pragma omp parallel for

                                      simd'.</div>

                                    <div bgcolor="#FFFFFF"><br>

                                    </div>

                                  </blockquote>

                                  <div dir="auto"><br>

                                  </div>

                                  <div dir="auto">Do you expect

                                    par/par_unseq to nest?</div>

                                </div>

                              </div>

                            </blockquote>

                            <br>

                          </div>

                          <div bgcolor="#FFFFFF"> Yes.</div>

                          <div bgcolor="#FFFFFF"><br>

                            <br>

                            <blockquote type="cite">

                              <div>

                                <div class="gmail_quote">

                                  <div dir="auto"> Nesting omp-parallel

                                    is generally regarded as a Bad Idea.</div>

                                </div>

                              </div>

                            </blockquote>

                            <br>

                          </div>

                          <div bgcolor="#FFFFFF"> Agreed. I suspect

                            we'll want the mapping to be more like

                            '#pragma omp taskloop simd'.</div>

                          <div bgcolor="#FFFFFF"><br>

                          </div>

                        </blockquote>

                        <div dir="auto"><br>

                        </div>

                        <div dir="auto">That won’t run in parallel

                          unless in an omp-parallel-master region. </div>

                      </div>

                    </div>

                  </blockquote>

                  <br>

                </span> Yes.<span class="m_-4467660861384652085gmail-"><br>

                  <br>

                  <blockquote type="cite">

                    <div>

                      <div class="gmail_quote">

                        <div dir="auto">That means OpenMP-based PSTL

                          won’t be parallel unless the user knows to add

                          back-end specific code about the PSTL.</div>

                      </div>

                    </div>

                  </blockquote>

                  <br>

                </span> That obviously wouldn't be acceptable.<span class="m_-4467660861384652085gmail-"><br>

                  <br>

                  <blockquote type="cite">

                    <div>

                      <div class="gmail_quote">

                        <div dir="auto"><br>

                        </div>

                        <div dir="auto">What I’m trying to say is that

                          OpenMP is a poor target for PSTL in its

                          current form. Nesting parallel regions is the

                          only thing that works and it is likely to work

                          poorly.</div>

                      </div>

                    </div>

                  </blockquote>

                  <br>

                </span> I'm not sure that's true, but the technique may

                not be trivial. I believe that it is possible, however.

                For example, the mapping might be to something like:<br>

                <br>

                if (omp_in_parallel()) {<br>

                #pragma omp taskloop simd<br>

                  for (size_t i = 0; i < N; ++i)<br>

                    F(X[i]);<br>

                } else {<br>

                #pragma omp parallel<br>

                #pragma omp single<br>

                  {<br>

                #pragma omp taskloop simd<br>

                     for (size_t i = 0; i < N; ++i)<br>

                       F(X[i]);<br>

                  }<br>

                }<br>

                <br>

                The fact that we'd need to use this kind of pattern is a

                bit unfortunate, but it can be easily abstracted into a

                template function, so it just becomes some

                implementation detail of the library.<br>

                <br>

              </div>

            </blockquote>

            <div><br>

            </div>

            <div>You are right and that is probably the best way to do

              it with OpenMP.  I am concerned about the absolute

              performance, based upon my observations of omp-taskloop vs

              omp-for and tbb::parallel_for</div>

          </div>

        </div>

      </div>

    </blockquote>

    <br></div></div>

    Have you tried this recently? There was a recursive task-stealing

    strategy added to our OpenMP library in July of this year (r308338)

    which should have made the performance of taskloop better.<span class=""><br>

    <br></span></div></blockquote><div><br></div><div>I ran those benchmarks this summer with Intel 18 beta.  Tom from LLNL mentioned that a stealing-based implementation of OpenMP taskloop was feasible but I didn't investigate whether it was used.  Obviously, I know some people who can help me answer questions about the LLVM OpenMP runtime ;-)</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><span class="">

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div> in the PRK project, but at least it is sane from a

              semantic perspective.  Having motivating use cases like

              PSTL should lead to improvements in OpenMP runtime

              performance w.r.t. taskloop.</div>

          </div>

        </div>

      </div>

    </blockquote>

    <br></span>

    Indeed :-)<span class=""><br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div><a href="https://i.stack.imgur.com/MVd5j.png" target="_blank">https://i.stack.imgur.com/<wbr>MVd5j.png</a>

              is a snapshot of the performance of PRK stencil (<a href="https://github.com/ParRes/Kernels/tree/master/Cxx11" target="_blank">https://github.com/ParRes/<wbr>Kernels/tree/master/Cxx11</a>),

              which shows taskloop loses to TBB-based PSTL, OpenMP for,

              and tbb::parallel_for (pure TBB beats TBB-based PSTL

              because I use tbb::blocked_range2d, which improves cache

              utilization).  I think those results tuned taskloop

              grainsize as well, so they may be an optimistic

              representation of taskloop in a general usage.<br>

            </div>

          </div>

        </div>

      </div>

    </blockquote>

    <br></span>

    Interesting.<span class=""><br></span></div></blockquote><div> </div><div>I should try to figure out how to recreate what TBB does with PSTL since it's clearly beneficial, at least on KNL.  Obviously, I can block loops manually as I do with raw OpenMP code, but I'm sure there's a nicer way.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><span class="">

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div>I'll see if I can prototype this in RAJA or Intel

              PSTL.  It's not hard to get results directly from the PRK

              tests, if the former attempts fail.</div>

          </div>

        </div>

      </div>

    </blockquote>

    </span></div></blockquote><div>Correct: I'll see if I can prototype for_each.  The rest will be left as an exercise for the reader :-D</div><div><br>Jeff</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><span class=""></span>

    Thanks!<span class="HOEnZb"><font color="#888888"><br>

    <br>

     -Hal</font></span><div><div class="h5"><br>

    <br>

    <blockquote type="cite">

      <div dir="ltr">

        <div class="gmail_extra">

          <div class="gmail_quote">

            <div><br>

            </div>

            <div>Best,</div>

            <div><br>

            </div>

            <div>Jeff</div>

            <div> </div>

            <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

              <div bgcolor="#FFFFFF"> Thanks again,<br>

                Hal

                <div>

                  <div class="m_-4467660861384652085gmail-h5"><br>

                    <br>

                    <blockquote type="cite">

                      <div>

                        <div class="gmail_quote">

                          <div dir="auto"><br>

                          </div>

                          <div dir="auto">Jeff</div>

                          <div dir="auto"><br>

                          </div>

                          <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                            <div bgcolor="#FFFFFF"><br>

                               -Hal</div>

                            <div bgcolor="#FFFFFF"><br>

                              <br>

                              <blockquote type="cite">

                                <div>

                                  <div class="gmail_quote">

                                    <div dir="auto"><br>

                                    </div>

                                    <div dir="auto">Jeff</div>

                                    <div dir="auto"><br>

                                    </div>

                                    <blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                                      <div bgcolor="#FFFFFF"><br>

                                        <blockquote type="cite">

                                          <div>I don't care much about how

                                            it will be implemented in

                                            libc++ if it is. I just

                                            would like to ask Intel guys

                                            and community here to make

                                            implementation extensible in

                                            the sense that a custom

                                            OpenMP-SIMD-based execution

                                            policy along with algorithms

                                            implementations (as

                                            specializations for the

                                            policy) can be used with the

                                            libc++ library. And I

                                            additionally would like to

                                            ask Intel guys to provide

                                            complete and compatible

                                            extension on github for

                                            developers like me to use.</div>

                                        </blockquote>

                                        <br>

                                      </div>

                                      <div bgcolor="#FFFFFF"> In the

                                        end, I think we want the

                                        following:<br>

                                        <br>

                                         1. A design for libc++ that

                                        allows the thread-level

                                        parallelism to be implemented in

                                        terms of different underlying

                                        providers (i.e., OpenMP, GCD,

                                        Work Queues on Windows, whatever

                                        else).<br>

                                         2. To follow the same

                                        philosophy with respect to

                                        standards as we do everywhere

                                        else: Use standards where

                                        possible with

                                        compiler/system-specific

                                        extensions as necessary.<br>

                                        <br>

                                         -Hal</div>

                                      <div bgcolor="#FFFFFF"><br>

                                        <br>

                                        <blockquote type="cite">

                                          <div> </div>

                                          <div>Regards,</div>

                                          <div>Serge.</div>

                                          <div> </div>

                                          <div> </div>

                                          <div> </div>

                                          <div>04.12.2017, 12:07, "Jeff

                                            Hammond" <a class="m_-4467660861384652085gmail-m_-2288114881520531127m_8513274869410520852m_2065056468622040889moz-txt-link-rfc2396E" href="mailto:jeff.science@gmail.com" target="_blank"><jeff.science@gmail.com></a>:</div>

                                          <blockquote type="cite">

                                            <div>

                                              <div>ICC implements a very

                                                aggressive

                                                interpretation of the

                                                OpenMP standard, and

                                                this interpretation is

                                                not shared by everyone

                                                in the OpenMP

                                                community.  ICC is

                                                correct but other

                                                implementations may be

                                                far less aggressive, so

                                                _Pragma("omp simd")

                                                doesn't guarantee

                                                vectorization unless the

                                                compiler documentation

                                                says that is how it is

                                                implemented.  All the

                                                standard says is that

                                                vectorization is

                                                _permitted_.</div>

                                              <div> </div>

                                              <div>Given that the

                                                practical meaning of

                                                _Pragma("omp simd")

                                                isn't guaranteed to be

                                                consistent across

                                                different

                                                implementations, I don't

                                                really know how to

                                                compare it to

                                                compiler-specific

                                                pragmas unless we define

                                                everything explicitly.</div>

                                              <div> </div>

                                              <div>In any case, my

                                                fundamental point

                                                remains: do not use

                                                OpenMP pragmas here, but

                                                instead use whatever the

                                                appropriate

                                                compiler-specific pragma

                                                is, or create a new one

                                                that meets the need.</div>

                                              <div> </div>

                                              <div>Best,</div>

                                              <div> </div>

                                              <div>Jeff</div>

                                              <div title="Page 81">

                                                <div>

                                                  <div> </div>

                                                </div>

                                              </div>

                                              <div> 

                                                <div>On Sun, Dec 3, 2017

                                                  at 8:09 PM, Serge

                                                  Preis <span><<a href="mailto:spreis@yandex-team.ru" target="_blank">spreis@yandex-team.ru</a>></span>

                                                  wrote:

                                                  <blockquote style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

                                                    <div>Hello,</div>

                                                    <div> </div>

                                                    <div>_Pragma("omp

                                                      simd") is

                                                      semantically quite

                                                      different from

                                                      _Pragma("clang

                                                      loop

                                                      vectorize(assume_safety)"),

                                                      _Pragma("GCC

                                                      ivdep") and

                                                      _Pragma("vector

                                                      always"), so I am

                                                      not sure all

                                                      of the latter will work

                                                      as expected in all

                                                      cases. They

                                                      definitely won't

                                                      provide any

                                                      vectorization

                                                      guarantees which

                                                      slightly defeats

                                                      the purpose of

                                                      using

                                                      corresponding

                                                      execution policy.</div>

                                                    <div> </div>

                                                    <div>I support the

                                                      idea of having

                                                      OpenMP orthogonal

                                                      and definitely

                                                      having -fopenmp

                                                      enabled by default

                                                      is not an option.

                                                      Intel compiler has

                                                      separate

                                                      -qopenmp-simd

                                                      option which

                                                      doesn't affect

                                                      performance

                                                      outside explicitly

                                                      marked loops, but

                                                      even this is not

                                                      enabled by

                                                      default. I would

                                                      say that there

                                                      might exist

                                                      multiple

                                                      implementations of

                                                      unordered policy,

                                                      originally OpenMP

                                                      SIMD based

                                                      implementation may

                                                      be more powerful

                                                      and one based on

                                                      other pragmas

                                                      being default, but

                                                      hinting about

                                                      existence of

                                                      faster option.

                                                      Later on one may

                                                      be brave enough to

                                                      add some SIMD

                                                      template library

                                                      and implement

                                                      default unordered

                                                      policy using it

                                                      (such

                                                      implementation is

                                                      possible even now

                                                      using vector

                                                      types, but it will

                                                      be extremely

                                                      complex if it attempts

                                                      to target all base

                                                      data types, vector

                                                      widths and target

                                                      SIMD architectures

                                                      clang supports.

                                                      Even with the

                                                      library this may

                                                      be quite tedious).</div>

                                                    <div> </div>

                                                    <div>Without any

                                                      standard way of

                                                      expressing SIMD

                                                      parallelism in

                                                      pure C++ any

                                                      implementer of

                                                      SIMD execution

                                                      policy is to rely

                                                      on means available

                                                      for

                                                      platform/compiler

                                                      and so it is not

                                                      totally unnatural

                                                      to ask user to

                                                      enable OpenMP SIMD

                                                      for efficient

                                                      support of

                                                      corresponding

                                                      execution policy.</div>

                                                    <div> </div>

                                                    <div>Regards,</div>

                                                    <div>Serge Preis</div>

                                                    <div> </div>

                                                    <div>(Who was once
                                                      part of the Intel
                                                      Compiler
                                                      Vectorizer team
                                                      and drove OpenMP
                                                      SIMD efforts

                                                      within icc and

                                                      beyond, if anyone

                                                      is keeping track

                                                      of

                                                      conflicts-of-interest)</div>

                                                    <div> </div>

                                                    <div> </div>

                                                    <div>04.12.2017,

                                                      08:46, "Jeff

                                                      Hammond via

                                                      cfe-dev" <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>>:</div>

                                                    <blockquote type="cite">

                                                      <div>

                                                        <div>

                                                          <div>It would

                                                          be nice to

                                                          keep PSTL and

                                                          OpenMP

                                                          orthogonal,

                                                          even if

                                                          _Pragma("omp

                                                          simd") does

                                                          not require

                                                          runtime

                                                          support.  It

                                                          should be

                                                          trivial to use

                                                          _Pragma("clang

                                                          loop

                                                          vectorize(assume_safety)")

                                                          instead, by

                                                          wrapping all

                                                          of the

                                                          different

                                                          compiler

                                                          vectorization

                                                          pragmas in

                                                          preprocessor

                                                          logic.  I

                                                          similarly

                                                          recommend

                                                          _Pragma("GCC

                                                          ivdep") for

                                                          GCC and

                                                          _Pragma("vector

                                                          always") for

                                                          ICC.  This

                                                          this requires

                                                          O(n_compilers)

                                                          effort instead

                                                          of O(1), but

                                                          orthogonality

                                                          is worth it.

                                                          <div> </div>
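                                                          <div>A minimal sketch of the preprocessor wrapping described above. The macro name VEC_LOOP and the saxpy function are hypothetical, chosen only for illustration; the three pragmas are the ones named in this message, and the loop assumes the caller guarantees that x and y do not overlap.</div>

```cpp
// Hypothetical macro name; not taken from the Parallel STL sources.
// ICC and Clang also define __GNUC__, so they are checked before GCC.
#if defined(__INTEL_COMPILER)
#  define VEC_LOOP _Pragma("vector always")
#elif defined(__clang__)
#  define VEC_LOOP _Pragma("clang loop vectorize(assume_safety)")
#elif defined(__GNUC__)
#  define VEC_LOOP _Pragma("GCC ivdep")
#else
#  define VEC_LOOP  // unknown compiler: no hint is emitted
#endif

// Computes y[i] += a * x[i] for i in [0, n).
// Assumption: the caller guarantees that x and y do not overlap.
void saxpy(int n, float a, const float* x, float* y) {
    VEC_LOOP
    for (int i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

                                                          <div>Compilers that match none of the branches see an empty macro, so the loop is compiled without any vectorization hint.</div>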

                                                          <div>While

                                                          OpenMP is

                                                          vendor/compiler-agnostic,

                                                          users should

                                                          not be

                                                          required to

                                                          use -fopenmp

                                                          or similar to

                                                          enable

                                                          vectorization

                                                          from PSTL, nor

                                                          should the

                                                          compiler

                                                          enable any

                                                          OpenMP pragma

                                                          by default.  I

                                                          know of cases

                                                          where merely

                                                          using the

                                                          -fopenmp flag

                                                          alters code

                                                          generation in

                                                          a

                                                          performance-visible

                                                          manner, and

                                                          enabling the

                                                          OpenMP "simd"

                                                          pragma by

                                                          default may

                                                          surprise some

                                                          users,

                                                          particularly

                                                          if no other

                                                          OpenMP pragmas

                                                          are enabled by

                                                          default.

                                                          <div><br>

                                                          Best,</div>

                                                          <div> </div>

                                                          <div>Jeff</div>

                                                          <div>(who

                                                          works for

                                                          Intel but not

                                                          on any

                                                          software

                                                          products and

                                                          has been a

                                                          heavy user of

                                                          Intel PSTL

                                                          since it was

                                                          released, if

                                                          anyone is

                                                          keeping track

                                                          of

                                                          conflicts-of-interest)<br>

                                                          <br>

                                                          On Wed, Nov

                                                          29, 2017 at

                                                          4:21 AM,

                                                          Kukanov,

                                                          Alexey via

                                                          cfe-dev <<a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a>>

                                                          wrote:<br>

                                                          ><br>

                                                          > Hello

                                                          all,<br>

                                                          ><br>

                                                          > At Intel,

                                                          we have

                                                          developed an

                                                          implementation

                                                          of C++17

                                                          execution

                                                          policies<br>

                                                          > for

                                                          algorithms

                                                          (often

                                                          referred to as

                                                          Parallel STL).

                                                          We hope to

                                                          contribute it<br>

                                                          > to

                                                          libc++/LLVM,

                                                          so would like

                                                          to ask the

                                                          community for

                                                          comments on

                                                          this.<br>

                                                          ><br>

                                                          > The code

                                                          is already

                                                          published at

                                                          GitHub (<a href="https://github.com/intel/parallelstl" target="_blank">https://github.com/intel/para<wbr>llelstl</a>).<br>

                                                          > It

                                                          supports the

                                                          C++17 standard

                                                          execution

                                                          policies (seq,

                                                          par,

                                                          par_unseq) as

                                                          well as<br>

                                                          > the

                                                          experimental

                                                          unsequenced

                                                          policy (unseq)

                                                          for SIMD

                                                          execution. At

                                                          the moment,<br>

                                                          > about

                                                          half of the

                                                          C++17 standard

                                                          algorithms

                                                          that must

                                                          support

                                                          execution

                                                          policies<br>

                                                          > are

                                                          implemented; a

                                                          few more will

                                                          be ready soon,

                                                          and the work

                                                          continues.<br>

                                                          > The tests

                                                          that we use

                                                          are also

                                                          available at

                                                          GitHub;

                                                          needless to

                                                          say we will<br>

                                                          >

                                                          contribute

                                                          those as well.<br>

                                                          ><br>

                                                          > The

                                                          implementation

                                                          is not

                                                          specific to

                                                          Intel’s

                                                          hardware. For

                                                          thread-level

                                                          parallelism<br>

                                                          > it uses

                                                          TBB* (<a href="https://www.threadingbuildingblocks.org/" target="_blank">https://www.threadingbuilding<wbr>blocks.org/</a>)

                                                          but abstracts

                                                          it with<br>

                                                          > an

                                                          internal API

                                                          which can be

                                                          implemented on

                                                          top of other

                                                          threading/parallel

                                                          solutions –<br>

                                                          > so it is

                                                          for the

                                                          community to

                                                          decide which

                                                          ones to use.

                                                          For SIMD

                                                          parallelism<br>

                                                          > (unseq,

                                                          par_unseq) we

                                                          use #pragma

                                                          omp simd

                                                          directives; it

                                                          is

                                                          vendor-neutral

                                                          and<br>

                                                          > does not

                                                          require any

                                                          OpenMP runtime

                                                          support.<br>

                                                          ><br>

                                                          > The

                                                          current

                                                          implementation

                                                          meets the

                                                          spirit but not

                                                          always the

                                                          letter of<br>

                                                          > the

                                                          standard,

                                                          because it has

                                                          to be separate

                                                          from but also

                                                          coexist with<br>

                                                          >

                                                          implementations

                                                          of standard

                                                          C++ libraries.

                                                          While

                                                          preparing the

                                                          contribution,<br>

                                                          > we will

                                                          address

                                                          inconsistencies,

                                                          adjust the

                                                          code to meet

                                                          community

                                                          standards,<br>

                                                          > and

                                                          better

                                                          integrate it

                                                          into the

                                                          standard

                                                          library code.<br>

                                                          ><br>

                                                          > We are

                                                          also proposing

                                                          that our

                                                          implementation

                                                          be included
                                                          in

                                                          libstdc++/GCC.<br>

                                                          >

                                                          Compatibility

                                                          between the

                                                          implementations

                                                          seems useful

                                                          as it can

                                                          potentially<br>

                                                          > reduce

                                                          the amount of

                                                          work for

                                                          everyone. We

                                                          hope to keep

                                                          the code

                                                          mostly

                                                          identical,<br>

                                                          > and would

                                                          like to know

                                                          if you think

                                                          it’s too

                                                          optimistic to

                                                          expect.<br>

                                                          ><br>

                                                          > Obviously

                                                          we plan to use

                                                          appropriate

                                                          open source

                                                          licenses to

                                                          meet the

                                                          different<br>

                                                          > projects’

                                                          requirements.<br>

                                                          ><br>

                                                          > We expect

                                                          to keep

                                                          developing the

                                                          code and will

                                                          take the

                                                          responsibility

                                                          for<br>

                                                          >

                                                          maintaining it

                                                          (with

                                                          community

                                                          contributions,

                                                          of course). If

                                                          there are

                                                          other<br>

                                                          > community

                                                          efforts to

                                                          implement

                                                          parallel

                                                          algorithms, we

                                                          are willing to

                                                          collaborate.<br>

                                                          ><br>

                                                          > We look

                                                          forward to

                                                          your feedback,

                                                          both for the

                                                          overall idea

                                                          and – if

                                                          supported –<br>

                                                          > for the

                                                          next steps we

                                                          should take.<br>

                                                          ><br>

                                                          > Regards,<br>

                                                          > - Alexey

                                                          Kukanov<br>

                                                          ><br>

                                                          > * Note

                                                          that TBB

                                                          itself is

                                                          highly

                                                          portable (and

                                                          ported by

                                                          community to

                                                          Power and ARM<br>

                                                          >

                                                          architectures)

                                                          and

                                                          permissively

                                                          licensed, so

                                                          could be the

                                                          base for the

                                                          threading<br>

                                                          >

                                                          infrastructure.

                                                          But the

                                                          Parallel STL

                                                          implementation

                                                          itself does

                                                          not require

                                                          TBB.<br>

                                                          ><br>

                                                          >

                                                          ______________________________<wbr>_________________<br>

                                                          > cfe-dev

                                                          mailing list<br>

                                                          > <a href="mailto:cfe-dev@lists.llvm.org" target="_blank">cfe-dev@lists.llvm.org</a><br>

                                                          > <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-dev" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/cfe-dev</a><br>

                                                          <br>

                                                          <br>

                                                          <br>

                                                          <br>

                                                          --<br>

                                                          Jeff Hammond<br>

                                                          <a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br>

                                                          <a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a>

                                                          <div> </div>

                                                          </div>

                                                          </div>

                                                          </div>

                                                        </div>

                                                      </div>


                                                    </blockquote>

                                                  </blockquote>

                                                </div>

                                                <div> </div>


                                              </div>

                                            </div>

                                          </blockquote>

                                          <br>


                        </blockquote>

                      </div>

                      <div bgcolor="#FFFFFF">

                        <pre class="m_-4467660861384652085gmail-m_-2288114881520531127m_8513274869410520852m_2065056468622040889moz-signature" cols="72">-- 

Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory</pre>

                      </div>

                    </blockquote>

                  </div>

                </div>


              </blockquote>


            </div>

          </blockquote>

        </div>

      </div>


    </blockquote>


  </div></div></div>

</blockquote></div>

<div>

</div>-- 

<div class="m_-4467660861384652085gmail_signature">Jeff Hammond

<a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a>

<a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div>

</div></div>

</blockquote>

<pre class="m_-4467660861384652085moz-signature" cols="72">-- 

Hal Finkel

Lead, Compiler Technology and Programming Languages

Leadership Computing Facility

Argonne National Laboratory</pre></div></div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">Jeff Hammond<br><a href="mailto:jeff.science@gmail.com" target="_blank">jeff.science@gmail.com</a><br><a href="http://jeffhammond.github.io/" target="_blank">http://jeffhammond.github.io/</a></div>

</div></div>