[libcxx-dev] Parallel STL

Tue Sep 29 04:14:56 PDT 2020

Thank you for all the advice. I have made a second pass on this same
function. Does this seem to be a more idiomatic approach for OpenMP?

//------------------------------------------------------------------------
// parallel_invoke
//------------------------------------------------------------------------

template <typename _F1, typename _F2>
void
__parallel_invoke_body(_F1&& __f1, _F2&& __f2)
{
    _PSTL_PRAGMA(omp task) { std::forward<_F1>(__f1)(); }
    _PSTL_PRAGMA(omp task) { std::forward<_F2>(__f2)(); }
}

template <class _ExecutionPolicy, typename _F1, typename _F2>
void
__parallel_invoke(_ExecutionPolicy&&, _F1&& __f1, _F2&& __f2)
{
    if (omp_in_parallel())
    {
        __parallel_invoke_body(std::forward<_F1>(__f1),
                               std::forward<_F2>(__f2));
    }
    else
    {
        _PSTL_PRAGMA(omp parallel)
        _PSTL_PRAGMA(omp single)
        __parallel_invoke_body(std::forward<_F1>(__f1),
                               std::forward<_F2>(__f2));
    }
}
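
One thing that may be worth checking in the version above: nothing waits for the two tasks inside __parallel_invoke_body. When the caller is already in a parallel region, the function could return while the tasks are still pending. Below is a minimal sketch of one way to close that gap, assuming a taskgroup is acceptable here. The names invoke_body_sketch and parallel_invoke_sketch are made up for illustration, plain #pragma omp stands in for _PSTL_PRAGMA, and the omp_in_parallel fallback exists only so the sketch builds without OpenMP.

```cpp
#include <utility>

#if defined(_OPENMP)
#include <omp.h>
#else
// Assumption for this sketch only: a fallback so it still builds without
// OpenMP. The pragmas are then ignored and everything runs serially.
inline int omp_in_parallel() { return 0; }
#endif

// Hypothetical helper: run both callables as tasks and wait for them.
template <typename F1, typename F2>
void invoke_body_sketch(F1&& f1, F2&& f2)
{
    // The taskgroup does not end until the tasks created inside it have
    // finished, so f1 and f2 are still alive while the tasks run.
#pragma omp taskgroup
    {
#pragma omp task shared(f1)
        { std::forward<F1>(f1)(); }
#pragma omp task shared(f2)
        { std::forward<F2>(f2)(); }
    }
}

// Hypothetical entry point mirroring the structure of the code above.
template <typename F1, typename F2>
void parallel_invoke_sketch(F1&& f1, F2&& f2)
{
    if (omp_in_parallel())
    {
        // Already inside a parallel region: only create tasks.
        invoke_body_sketch(std::forward<F1>(f1), std::forward<F2>(f2));
    }
    else
    {
        // Open a region, then let a single thread create the tasks.
#pragma omp parallel
#pragma omp single
        invoke_body_sketch(std::forward<F1>(f1), std::forward<F2>(f2));
    }
}
```

A taskwait after the second task would probably work as well. The taskgroup form is used here only because it also waits for any tasks the two callables create themselves.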

On Mon, Sep 28, 2020 at 7:55 AM Kukanov, Alexey <Alexey.Kukanov at intel.com>
wrote:

> In the *“if (omp_in_parallel())”* part, OpenMP worksharing constructs
> such as sections cannot be used. The problem is that worksharing constructs
> cannot be nested, and it’s quite likely that the outer parallel region
> already runs one. Therefore, tasks are the only option.
>
>
>
> In the *“else”* part, the parallel region is created by PSTL, so
> worksharing can be used; tasks can be used as well.
>
>
>
> If the same approach (tasks) is used in both cases, repetition can be
> avoided if the relevant code is separated into a function.
>
>
>
> Regards,
>
> - Alexey
>
>
>
> *From:* Dvorskiy, Mikhail <mikhail.dvorskiy at intel.com>
> *Sent:* Monday, September 28, 2020 12:44 PM
> *To:* Christopher Nelson <nadiasvertex at gmail.com>; Kukanov, Alexey <
> Alexey.Kukanov at intel.com>; Pavlov, Evgeniy <evgeniy.pavlov at intel.com>
> *Cc:* Louis Dionne <ldionne at apple.com>; Thomas Rodgers <
> trodgers at redhat.com>; Libc++ Dev <libcxx-dev at lists.llvm.org>; Pavlov,
> Evgeniy <evgeniy.pavlov at intel.com>
> *Subject:* RE: [libcxx-dev] Parallel STL
>
>
>
> Hi Christopher,
>
>
>
> 1.
>
> Briefly, about the Parallel STL design, and execution policy handling in
> particular:
>
>
>
> The Parallel STL design is based on the *pattern of bricks* approach and has
> a compile-time dispatching mechanism based on overload resolution over a
> pair of type tags, *is_parallel* and *is_vector*. The combinations of
> these tags give four execution policies: *seq, par, unseq, par_unseq*.
> A parallel backend doesn’t handle the passed execution policy. That
> parameter may be useful for some special backends, but it doesn’t matter
> for the OpenMP backend. (See
> *include/pstl/internal/parallel_backend.h* for more details about OpenMP
> backend dispatching.)
>
>
>
> In other words, the implementation of each PSTL algorithm is based on two
> *patterns*: *parallel* (chosen by the *is_parallel* type tag) and *serial*
> (chosen by the *is_vector* type tag).
>
> Each *parallel pattern* may call a *serial brick* or a *vector (unsequenced)
> brick*. This gives the *par* and *par_unseq* policy implementations.
>
> Each *serial pattern* may also call a *serial brick* or a *vector
> (unsequenced) brick*. This gives the *seq* and *unseq* policy implementations.
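
To make the dispatch described above concrete, here is a small sketch of tag dispatch by overload resolution. It assumes patterns are selected by an is_parallel tag and bricks by an is_vector tag, which is one reading of the description. Every name in it (the policy structs, the traits, brick, pattern, run_algorithm) is a hypothetical stand-in and not the PSTL source.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-ins for the four execution policies.
struct seq_t {};
struct unseq_t {};
struct par_t {};
struct par_unseq_t {};

// Hypothetical traits that map a policy to the two type tags.
template <class P> struct is_parallel : std::false_type {};
template <> struct is_parallel<par_t> : std::true_type {};
template <> struct is_parallel<par_unseq_t> : std::true_type {};

template <class P> struct is_vector : std::false_type {};
template <> struct is_vector<unseq_t> : std::true_type {};
template <> struct is_vector<par_unseq_t> : std::true_type {};

// Bricks: the overload is picked by the is_vector tag.
inline std::string brick(std::false_type) { return "serial brick"; }
inline std::string brick(std::true_type) { return "vector brick"; }

// Patterns: the overload is picked by the is_parallel tag, and each pattern
// forwards the is_vector tag to choose its brick.
template <class IsVector>
std::string pattern(IsVector v, std::false_type)
{
    return "serial pattern + " + brick(v);
}

template <class IsVector>
std::string pattern(IsVector v, std::true_type)
{
    return "parallel pattern + " + brick(v);
}

// Entry point: turn the policy type into two tags at compile time.
template <class Policy>
std::string run_algorithm(Policy&&)
{
    using P = typename std::decay<Policy>::type;
    return pattern(typename is_vector<P>::type{},
                   typename is_parallel<P>::type{});
}
```

With these definitions, run_algorithm(par_unseq_t{}) selects the parallel pattern with the vector brick, and run_algorithm(seq_t{}) selects the serial pattern with the serial brick. Nothing is inspected at run time.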
>
>
>
> 2.
>
> Yes, we forgot to add the definition of “_Combiner” to this review. In
> that prototype it was moved to a utility file and another namespace, but
> that doesn’t matter. For now you can find “_Combiner” in
> https://github.com/llvm/llvm-project/blob/master/pstl/include/pstl/internal/unseq_backend_simd.h
>
>
>
> 3.
>
> In the *omp_in_parallel* case, to avoid oversubscription you should use the
> *task API* instead of *sections*. A task doesn’t create a new thread. A task
> is added to a task pool and may be executed by the first “free” thread from
> the thread pool.
>
> In the *else* branch, I think it would be preferable to use the task API as
> well, for better workload balance.
>
>
>
> P.S. + @Pavlov, Evgeniy <Evgeniy.Pavlov at intel.com> who wrote the OpenMP
> backend prototype.
>
>
>
> Best regards,
>
> Mikhail Dvorskiy
>
>
>
> *From:* Christopher Nelson <nadiasvertex at gmail.com>
> *Sent:* Sunday, September 27, 2020 10:24 PM
> *To:* Kukanov, Alexey <Alexey.Kukanov at intel.com>
> *Cc:* Dvorskiy, Mikhail <mikhail.dvorskiy at intel.com>; Louis Dionne <
> ldionne at apple.com>; Thomas Rodgers <trodgers at redhat.com>; Libc++ Dev <
> libcxx-dev at lists.llvm.org>
> *Subject:* Re: [libcxx-dev] Parallel STL
>
>
>
> Hello,
>
>
>
> I have followed the advice about taking over the review above, and have
> gotten to a place where I'm working on getting the existing code to compile
> cleanly. A few functions were not implemented, so I have forwarded them to
> the serial backend for now. Just to get compilation working.
>
>
>
> I have a few questions:
>
>
>
> 1. I notice that neither the TBB backend, nor the existing OpenMP backend
> code evaluates the execution policy to understand what to do. I may have
> misunderstood Louis Dionne, but it appears like the "sequential" mode is
> not handled at all if the user requests it. That seems wrong, so I must be
> missing something. I also notice that the execution modes are not enums,
> they are objects. However, when I try to overload on them in order to
> specialize for sequential, I get a compile error saying that the types are
> not fully defined. What is the design expectation for handling the
> different execution policies?
>
>
>
> 2. The existing code refers to a type:
>
>
>
> using _CombinerType = __pstl::__internal::_Combiner<_Value, _Reduction>;
>
> _CombinerType __result{__identity, &__reduction};
>
>
>
> However, this type does not exist in __pstl::__internal, at least so far
> as I can tell. Also, the D70530 code dump does not contain a definition of
> that object. Has this migrated? Should I provide my own implementation of
> it?
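
Going only by the two lines of usage quoted above, a type of this shape needs to hold a value and a pointer to the reduction. A minimal stand-in under that assumption might look like the sketch below. The names CombinerSketch and combine are made up here, and this is not the PSTL definition of _Combiner.

```cpp
#include <functional>

// Hypothetical stand-in, NOT the PSTL definition: a partial result plus a
// pointer to the reduction functor, so copies stay cheap.
template <typename Value, typename Reduction>
struct CombinerSketch
{
    Value value;          // running partial result
    Reduction* reduction; // non-owning pointer to the binary operation

    // Fold another partial result into this one.
    void combine(const CombinerSketch& other)
    {
        value = (*reduction)(value, other.value);
    }
};
```

It can be brace-initialized in the same way as the quoted code, for example CombinerSketch<int, std::plus<int>> total{0, &add}; where add is a std::plus<int> object.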
>
>
>
> 3. I have tried to implement a very, very simple function:
>
>
>
> template <class _ExecutionPolicy, typename _F1, typename _F2>
> void __parallel_invoke(_ExecutionPolicy &&, _F1 &&__f1, _F2 &&__f2) {
>     if (omp_in_parallel()) {
>         _PSTL_PRAGMA(omp sections) {
>             _PSTL_PRAGMA(omp section)
>             std::forward<_F1>(__f1)();
>             _PSTL_PRAGMA(omp section)
>             std::forward<_F2>(__f2)();
>         }
>     } else {
>         _PSTL_PRAGMA(omp parallel)
>         _PSTL_PRAGMA(omp sections) {
>             _PSTL_PRAGMA(omp section)
>             std::forward<_F1>(__f1)();
>             _PSTL_PRAGMA(omp section)
>             std::forward<_F2>(__f2)();
>         }
>     }
> }
>
> Does this look sane? I have just started reading through the OpenMP
> documentation. This looks like it could be correct, but there is also the
> "omp task" directive, and it's not clear which of these is superior in this
> case. Also, this seems awfully repetitive. Is this just OpenMP?
>
>
>
> Thanks!
>
>
>
> On Wed, Sep 16, 2020 at 9:28 AM Kukanov, Alexey <Alexey.Kukanov at intel.com>
> wrote:
>
> Hi Christopher,
>
>
>
> One good way to contribute, I think, is to develop an OpenMP-based
> parallel backend. LLVM already supports OpenMP, so it resolves the
> dependency problem Louis mentioned. While it’s arguably not the best
> default engine in the long term, there is certainly some demand for it. The
> GCC community is also interested in it. Moreover, Mikhail and the team at
> Intel in collaboration with Thomas (CC’d) from GCC already developed a
> basic prototype: https://reviews.llvm.org/D70530, but further work is
> postponed. If you are interested in continuing, you are more than welcome,
> and we will help with guidance and feedback.
>
>
>
> Regards,
>
> - Alexey
>
>
>
> *From:* libcxx-dev <libcxx-dev-bounces at lists.llvm.org> *On Behalf Of *Christopher
> Nelson via libcxx-dev
> *Sent:* Wednesday, September 16, 2020 2:43 PM
> *To:* Louis Dionne <ldionne at apple.com>
> *Cc:* Dvorskiy, Mikhail <mikhail.dvorskiy at intel.com>;
> *Subject:* Re: [libcxx-dev] Parallel STL
>
>
>
> Fantastic. I will study the serial backend and see what I can do!
>
>
>
> On Tue, Sep 15, 2020 at 5:27 PM Louis Dionne <ldionne at apple.com> wrote:
>
> + Mikhail, who wrote most of the PSTL
>
>
>
> On Sep 15, 2020, at 15:40, Christopher Nelson <nadiasvertex at gmail.com>
> wrote:
>
>
>
> Okay, that makes sense. I can see how you might want to use Grand Central
> Dispatch on macOS, and the Windows system thread pool on Windows. I'm not
> really sure what that means for Linux, though. Other than maybe pthreads,
> which is not great.
>
>
>
> Is there any documentation on what is needed to create a backend? Or are
> there perhaps already plans in motion? I don't want to step on any toes,
> but I would love to have a usable pstl on macOS and Linux for the next LLVM
> release.
>
> We use libc++ on Linux as well as macOS. Depending on what's involved, I
> might be able to contribute a backend for those two platforms.
>
>
>
> You're not stepping on any toes, far from that. If we have backends with
> satisfactory performance and we're confident about ABI stability, I don't
> see a reason why we wouldn't ship the PSTL as soon as we have those. One
> big obstacle to shipping it so far has been that, of the two existing
> backends, one is serial (not great to ship that) and the other relies on
> an external dependency (TBB).
>
>
>
> Mikhail might be able to provide documentation. We should check it into
> the PSTL repository. I meant to write such documentation when I wrote the
> serial backend, but never got around to writing something that was enough
> to check in. You can see the minimal API needed to implement a backend
> here: pstl/include/pstl/internal/parallel_backend_serial.h. It's the serial
> backend, which tries to be as trivial as possible.
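
As an illustration of how small a backend entry point can be, a serial version of the two-callable invoke discussed in this thread might look like the sketch below. It is written from the description only, not copied from parallel_backend_serial.h, and the name serial_invoke_sketch is hypothetical.

```cpp
#include <utility>

// Hypothetical serial entry point: the execution policy is accepted so the
// signature matches the parallel version, but it is not inspected.
template <class ExecutionPolicy, typename F1, typename F2>
void serial_invoke_sketch(ExecutionPolicy&&, F1&& f1, F2&& f2)
{
    // No parallelism at all: run the first callable, then the second.
    std::forward<F1>(f1)();
    std::forward<F2>(f2)();
}
```

Forwarding an unimplemented parallel function to something like this keeps the backend compiling while the real implementation is still being written.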
>
>
>
> Are you familiar with libc++ contribution? If so, contributing to PSTL
> works basically the same -- just send a Phabricator review and I'll review
> it. We can also chat on Slack in the Cpplang workspace and I can give some
> guidance -- look for "ldionne".
>
>
>
> Cheers,
>
> Louis
>
>
>
>
>
> On Tue, Sep 15, 2020 at 2:50 PM Louis Dionne <ldionne at apple.com> wrote:
>
> Hi,
>
> Long story short, the PSTL is pretty much ready to be shipped with LLVM. I
> did the integration between it and libc++, and it all worked last time I
> checked. I think the next step would be to change whatever LLVM scripts are
> used to create releases to also install the PSTL, which is the part I
> haven't had time to look into yet.
>
> That being said, the PSTL will then default to using the Serial backend,
> which isn't very useful. We could decide to ship a different backend if we
> wanted, however I think what makes sense is to use a backend specific to
> the platform we're running on instead of adding a dependency to LLVM.
>
> Louis
>
> > On Sep 8, 2020, at 08:25, Christopher Nelson via libcxx-dev <
> > libcxx-dev at lists.llvm.org> wrote:
> >
> > Hello friends,
> >
> > I have spent some time looking at the mailing list archives and git logs
> > for the parallel STL. I'm not clear what state it is in, since the
> > oneAPI/tbb seems to be production ready and comes with the parallel STL.
> > Also, it appears that GCC has shipped a PSTL based on the same code that
> > clang is using.
> >
> > I was wondering if someone could clarify for me what state the PSTL is
> > in, and if there is some work needed to help get it over the finish line I
> > may be able to help. I'm very interested in using it in our production
> > software, so I'm a motivated helper. :-)
> >
> > Thank you for your time,
> > -={C}=-