[libcxx-dev] Parallel STL
Kukanov, Alexey via libcxx-dev
libcxx-dev at lists.llvm.org
Mon Sep 28 04:54:51 PDT 2020

In the “if (omp_in_parallel())” part, OpenMP worksharing constructs such as sections cannot be used. The problem is that worksharing constructs cannot be nested, and it’s quite likely that the outer parallel region already runs one. Therefore, tasks are the only option.
In the “else” part, the parallel region is created by PSTL, so worksharing can be used; tasks can be used as well.
If the same approach (tasks) is used in both cases, repetition can be avoided if the relevant code is separated into a function.
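A minimal sketch of that idea, assuming tasks are used in both branches. The names sketch::in_parallel, sketch::invoke_as_tasks and sketch::parallel_invoke are hypothetical and are not taken from the D70530 review; the _OPENMP guard is added here only so that the example also builds (and runs serially) without OpenMP enabled.

```cpp
#include <cassert>
#include <utility>
#ifdef _OPENMP
#include <omp.h>
#endif

namespace sketch {

// Hypothetical helper: reports whether we are already inside an active
// parallel region. Without OpenMP the pragmas below are ignored and
// everything runs serially, so "false" is the right answer.
inline bool in_parallel() {
#ifdef _OPENMP
    return omp_in_parallel() != 0;
#else
    return false;
#endif
}

// Shared task-based body, used by both branches below.
// Each callable becomes one task; the taskgroup waits for both tasks
// to finish before the function returns.
template <typename F1, typename F2>
void invoke_as_tasks(F1&& f1, F2&& f2) {
    #pragma omp taskgroup
    {
        #pragma omp task
        { std::forward<F1>(f1)(); }
        #pragma omp task
        { std::forward<F2>(f2)(); }
    }
}

template <typename F1, typename F2>
void parallel_invoke(F1&& f1, F2&& f2) {
    if (in_parallel()) {
        // Already inside a parallel region: only spawn tasks,
        // no worksharing construct and no new threads.
        invoke_as_tasks(std::forward<F1>(f1), std::forward<F2>(f2));
    } else {
        // No enclosing region: create one, and let a single thread
        // spawn the tasks so that each callable runs exactly once.
        #pragma omp parallel
        #pragma omp single nowait
        invoke_as_tasks(std::forward<F1>(f1), std::forward<F2>(f2));
    }
}

} // namespace sketch
```

Calling sketch::parallel_invoke with two lambdas runs each of them exactly once, whether or not the caller is already inside a parallel region.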
Regards,
- Alexey
From: Dvorskiy, Mikhail <mikhail.dvorskiy at intel.com>
Sent: Monday, September 28, 2020 12:44 PM
To: Christopher Nelson <nadiasvertex at gmail.com>; Kukanov, Alexey <Alexey.Kukanov at intel.com>; Pavlov, Evgeniy <evgeniy.pavlov at intel.com>
Cc: Louis Dionne <ldionne at apple.com>; Thomas Rodgers <trodgers at redhat.com>; Libc++ Dev <libcxx-dev at lists.llvm.org>; Pavlov, Evgeniy <evgeniy.pavlov at intel.com>
Subject: RE: [libcxx-dev] Parallel STL
Hi Christopher,
1.
Briefly, about the Parallel STL design, and execution policy handling in particular:
The Parallel STL design is based on a “pattern of bricks” approach and has a compile-time dispatching mechanism based on overload resolution over a pair of type tags: is_parallel and is_vector. The combinations of these tags give the four execution policies: seq, par, unseq and par_unseq. A parallel backend doesn’t handle the passed execution policy; that parameter may be useful for some special backends, but it doesn’t matter for the OpenMP backend. (See include/pstl/internal/parallel_backend.h for more details about OpenMP backend dispatching.)
In other words, the implementation of each PSTL algorithm is based on two patterns: parallel (chosen by the is_parallel type tag) and serial (chosen by the is_vector type tag).
Each parallel pattern may call a serial brick or a vector (unsequenced) brick. That “gives” the par and par_unseq policy implementations.
Each serial pattern may also call a serial brick or a vector (unsequenced) brick. That “gives” the seq and unseq policy implementations.
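The dispatching described above could be sketched roughly as follows. This is an illustration only, not the actual PSTL code: the policy structs, brick(), pattern() and algorithm() are hypothetical stand-ins, and the strings exist only to make the selected overloads visible.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace sketch {

// Hypothetical stand-ins for the policy objects; each one carries
// the two compile-time tags mentioned in the text.
struct seq_policy       { using is_parallel = std::false_type; using is_vector = std::false_type; };
struct unseq_policy     { using is_parallel = std::false_type; using is_vector = std::true_type;  };
struct par_policy       { using is_parallel = std::true_type;  using is_vector = std::false_type; };
struct par_unseq_policy { using is_parallel = std::true_type;  using is_vector = std::true_type;  };

// Bricks: the overload is picked by the is_vector tag.
inline std::string brick(std::false_type) { return "serial brick"; }
inline std::string brick(std::true_type)  { return "vector brick"; }

// Patterns: the overload is picked by the is_parallel tag, and each
// pattern forwards the is_vector tag to choose its brick.
template <class IsVector>
std::string pattern(IsVector is_vector, std::false_type) {
    return "serial pattern + " + brick(is_vector);
}
template <class IsVector>
std::string pattern(IsVector is_vector, std::true_type) {
    return "parallel pattern + " + brick(is_vector);
}

// Algorithm entry point: overload resolution on the two tags selects
// one of the four combinations at compile time.
template <class Policy>
std::string algorithm(const Policy&) {
    return pattern(typename Policy::is_vector{}, typename Policy::is_parallel{});
}

} // namespace sketch
```

Note that the policy is consumed entirely at this level; nothing in the sketch passes it on to a backend, which matches the point that a backend does not need to inspect it.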
2.
Yes, we missed adding the definition of “_Combiner” to this review. In that prototype it was moved to a utility file and another namespace, but that doesn’t matter. For now you can find “_Combiner” in https://github.com/llvm/llvm-project/blob/master/pstl/include/pstl/internal/unseq_backend_simd.h
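For orientation only, a type with the shape implied by the usage in the question (a value plus a pointer to the reduction functor) might look like the sketch below. The name sketch::combiner and its members are hypothetical; this is a simplified stand-in, not the definition from unseq_backend_simd.h, which should be consulted for the real thing.

```cpp
#include <cassert>
#include <functional>

namespace sketch {

// Hypothetical simplified combiner: holds a partial result together
// with a pointer to the reduction functor used to merge results.
template <typename Value, typename Reduction>
struct combiner {
    Value value;
    Reduction* reduction; // a pointer keeps per-thread copies cheap

    combiner(const Value& v, Reduction* r) : value(v), reduction(r) {}

    // Fold another partial result into this one.
    void operator()(const combiner& other) {
        value = (*reduction)(value, other.value);
    }
};

} // namespace sketch
```

It is constructed from an identity value and the address of the reduction, in the same way as __result in the question, and partial results are then merged by calling it with another combiner.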
3.
In the omp_in_parallel case, to avoid oversubscription you should use the task API instead of sections. A task doesn’t create a new thread; it is added to a task pool and may be executed by the first “free” thread from the thread pool.
In the else section, I think it would be preferable to use the task API as well, for better workload balance.
P.S. + @Pavlov, Evgeniy<mailto:Evgeniy.Pavlov at intel.com>, who wrote the OpenMP backend prototype.
Best regards,
Mikhail Dvorskiy
From: Christopher Nelson <nadiasvertex at gmail.com<mailto:nadiasvertex at gmail.com>>
Sent: Sunday, September 27, 2020 10:24 PM
To: Kukanov, Alexey <Alexey.Kukanov at intel.com<mailto:Alexey.Kukanov at intel.com>>
Cc: Dvorskiy, Mikhail <mikhail.dvorskiy at intel.com<mailto:mikhail.dvorskiy at intel.com>>; Louis Dionne <ldionne at apple.com<mailto:ldionne at apple.com>>; Thomas Rodgers <trodgers at redhat.com<mailto:trodgers at redhat.com>>; Libc++ Dev <libcxx-dev at lists.llvm.org<mailto:libcxx-dev at lists.llvm.org>>
Subject: Re: [libcxx-dev] Parallel STL
Hello,
I have followed the advice above about taking over the review, and have gotten to a place where I'm working on getting the existing code to compile cleanly. A few functions were not implemented, so I have forwarded them to the serial backend for now, just to get compilation working.
I have a few questions:
1. I notice that neither the TBB backend nor the existing OpenMP backend code evaluates the execution policy to understand what to do. I may have misunderstood Louis Dionne, but it appears that the "sequential" mode is not handled at all if the user requests it. That seems wrong, so I must be missing something. I also notice that the execution modes are not enums; they are objects. However, when I try to overload on them in order to specialize for sequential, I get a compile error saying that the types are not fully defined. What is the design expectation for handling the different execution policies?
2. The existing code refers to a type:
using _CombinerType = __pstl::__internal::_Combiner<_Value, _Reduction>;
_CombinerType __result{__identity, &__reduction};
However, this type does not exist in __pstl::__internal, at least so far as I can tell. Also, the D70530 code dump does not contain a definition of that object. Has this migrated? Should I provide my own implementation of it?
3. I have tried to implement a very, very simple function:
template <class _ExecutionPolicy, typename _F1, typename _F2>
void __parallel_invoke(_ExecutionPolicy &&, _F1 &&__f1, _F2 &&__f2) {
    if (omp_in_parallel()) {
        _PSTL_PRAGMA(omp sections) {
            _PSTL_PRAGMA(omp section)
            std::forward<_F1>(__f1)();
            _PSTL_PRAGMA(omp section)
            std::forward<_F2>(__f2)();
        }
    } else {
        _PSTL_PRAGMA(omp parallel)
        _PSTL_PRAGMA(omp sections) {
            _PSTL_PRAGMA(omp section)
            std::forward<_F1>(__f1)();
            _PSTL_PRAGMA(omp section)
            std::forward<_F2>(__f2)();
        }
    }
}
Does this look sane? I have just started reading through the OpenMP documentation. This looks like it could be correct, but there is also the "omp task" directive, and it's not clear which of these is superior in this case. Also, this seems awfully repetitive. Is this just OpenMP?
Thanks!
On Wed, Sep 16, 2020 at 9:28 AM Kukanov, Alexey <Alexey.Kukanov at intel.com<mailto:Alexey.Kukanov at intel.com>> wrote:
Hi Christopher,
One good way to contribute, I think, is to develop an OpenMP-based parallel backend. LLVM already supports OpenMP, so it resolves the dependency problem Louis mentioned. While it’s arguably not the best default engine in the long term, there is certainly some demand for it. The GCC community is also interested in it. Moreover, Mikhail and the team at Intel, in collaboration with Thomas (CC’d) from GCC, already developed a basic prototype: https://reviews.llvm.org/D70530, but further work is postponed. If you are interested in continuing it, you are more than welcome, and we will help with guidance and feedback.
Regards,
- Alexey
From: libcxx-dev <libcxx-dev-bounces at lists.llvm.org<mailto:libcxx-dev-bounces at lists.llvm.org>> On Behalf Of Christopher Nelson via libcxx-dev
Sent: Wednesday, September 16, 2020 2:43 PM
To: Louis Dionne <ldionne at apple.com<mailto:ldionne at apple.com>>
Cc: Dvorskiy, Mikhail <mikhail.dvorskiy at intel.com<mailto:mikhail.dvorskiy at intel.com>>;
Subject: Re: [libcxx-dev] Parallel STL
Fantastic. I will study the serial backend and see what I can do!
On Tue, Sep 15, 2020 at 5:27 PM Louis Dionne <ldionne at apple.com<mailto:ldionne at apple.com>> wrote:
+ Mikhail, who wrote most of the PSTL
On Sep 15, 2020, at 15:40, Christopher Nelson <nadiasvertex at gmail.com<mailto:nadiasvertex at gmail.com>> wrote:
Okay, that makes sense. I can see how you might want to use Grand Central Dispatch on macOS, and the Windows system thread pool on Windows. I'm not really sure what that means for Linux, though. Other than maybe pthreads, which is not great.
Is there any documentation on what is needed to create a backend? Or are there perhaps already plans in motion? I don't want to step on any toes, but I would love to have a usable pstl on macOS and Linux for the next LLVM release.
We use libc++ on Linux as well as macOS. Depending on what's involved, I might be able to contribute a backend for those two platforms.
You're not stepping on any toes, far from that. If we have backends with satisfactory performance and we're confident about ABI stability, I don't see a reason why we wouldn't ship the PSTL as soon as we have those. One big obstacle to shipping it so far has been that there are only two backends: one is serial (not great to ship that), and the other relies on an external dependency (TBB).
Mikhail might be able to provide documentation. We should check it into the PSTL repository. I meant to write such documentation when I wrote the serial backend, but never got around to writing something that was enough to check in. You can see the minimal API needed to implement a backend here: pstl/include/pstl/internal/parallel_backend_serial.h. It's the serial backend, which tries to be as trivial as possible.
Are you familiar with libc++ contribution? If so, contributing to PSTL works basically the same -- just send a Phabricator review and I'll review it. We can also chat on Slack in the Cpplang workspace and I can give some guidance -- look for "ldionne".
Cheers,
Louis
On Tue, Sep 15, 2020 at 2:50 PM Louis Dionne <ldionne at apple.com<mailto:ldionne at apple.com>> wrote:
Hi,
Long story short, the PSTL is pretty much ready to be shipped with LLVM. I did the integration between it and libc++, and it all worked last time I checked. I think the next step would be to change whatever LLVM scripts are used to create releases to also install the PSTL, which is the part I haven't had time to look into yet.
That being said, the PSTL will then default to using the Serial backend, which isn't very useful. We could decide to ship a different backend if we wanted, however I think what makes sense is to use a backend specific to the platform we're running on instead of adding a dependency to LLVM.
Louis
> On Sep 8, 2020, at 08:25, Christopher Nelson via libcxx-dev <libcxx-dev at lists.llvm.org<mailto:libcxx-dev at lists.llvm.org>> wrote:
>
> Hello friends,
>
> I have spent some time looking at the mailing archives and git logs for the parallel STL. I'm not clear what state it is in, since oneAPI/TBB seems to be production ready and comes with the parallel STL. Also, it appears that GCC has shipped a PSTL based on the same code that clang is using.
>
> I was wondering if someone could clarify for me what state the PSTL is in, and if there is some work needed to help get it over the finish line I may be able to help. I'm very interested in using it in our production software, so I'm a motivated helper. :-)
>
> Thank you for your time,
> -={C}=-