[libcxx-dev] OpenMP parallel reduce bugs

Christopher Nelson via libcxx-dev libcxx-dev at lists.llvm.org
Wed Sep 30 16:46:08 PDT 2020


Hello friends,

I have been working on the OpenMP backend for the parallel STL, and most of
the tests now pass. One of the remaining failures is the "is_partitioned"
test. To narrow down what is failing, I have rewritten the __parallel_reduce
backend function in a simpler form (code is below).

I also rewrote it as a serial function that splits the iteration range in
two and then calls __reduction() on each half of the range being passed in.
The serial execution and the parallel execution give different results.
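
A minimal sketch of what such a serial split-in-two check could look like
is shown here. It is not the code used in the backend: the name
serial_split_reduce is hypothetical, and it assumes the body is run on each
half and the two partial results are then merged with the reduction, left
operand first.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of the PSTL): reduce [first, last) by
// splitting it in two, running the body on each half, then combining.
template <class RandomIt, class Value, class RealBody, class Reduction>
Value serial_split_reduce(RandomIt first, RandomIt last, Value identity,
                          RealBody real_body, Reduction reduction)
{
    RandomIt mid = first + (last - first) / 2;
    Value left = real_body(first, mid, identity);
    Value right = real_body(mid, last, identity);
    // Keep the left half as the first operand; this matters whenever the
    // reduction is not commutative.
    return reduction(left, right);
}

// Usage: sum the integers 1..7 by reducing each half separately.
int example_split_sum()
{
    using It = std::vector<int>::iterator;
    std::vector<int> v{1, 2, 3, 4, 5, 6, 7};
    auto body = [](It b, It e, int init) { return std::accumulate(b, e, init); };
    auto plus = [](int a, int b) { return a + b; };
    int total = serial_split_reduce(v.begin(), v.end(), 0, body, plus);
    assert(total == std::accumulate(v.begin(), v.end(), 0));
    return total;
}
```

Comparing the output of a helper like this against a plain single-pass call
of the body gives a reference value that does not depend on OpenMP at all.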

I have verified that the parallel tasks do run, and that each task's result
matches what the corresponding serial call produces.

Is there something wrong with the way OpenMP is running the reducer here?
Perhaps it is injecting a value into the computation that is unexpected for
this algorithm? Does anything jump out at anyone as being suspicious?
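
One property that can be checked independently of OpenMP is whether the
combiner depends on operand order. The OpenMP specification leaves the order
in which reduction values are combined unspecified, so an order-sensitive
combiner may behave differently there than in a serial left-to-right run.
The sketch below is a hypothetical, simplified model of an
is_partitioned-style reduction; PartState and combine are illustrative
names and are not the PSTL's own types.

```cpp
#include <cassert>

// Hypothetical, simplified summary of a sub-range for an
// is_partitioned-style check (not the PSTL's own type).
enum class PartState
{
    AllTrue,   // predicate held for every element
    AllFalse,  // predicate failed for every element
    TrueFalse, // a run of true elements followed by a run of false ones
    Broken     // a true element appeared after a false one
};

// Combine the summary of a left sub-range with the summary of the
// sub-range immediately to its right.
PartState combine(PartState left, PartState right)
{
    if (left == PartState::Broken || right == PartState::Broken)
        return PartState::Broken;
    if (left == PartState::AllTrue)
        return right == PartState::AllTrue ? PartState::AllTrue : PartState::TrueFalse;
    // left is AllFalse or TrueFalse: any true element on the right
    // breaks the partition.
    if (right != PartState::AllFalse)
        return PartState::Broken;
    return left;
}

// Swapping the operands changes the answer, so the combiner is not
// commutative.
bool combine_is_order_sensitive()
{
    return combine(PartState::AllTrue, PartState::AllFalse) !=
           combine(PartState::AllFalse, PartState::AllTrue);
}
```

If the real reduction has the same property, the serial and parallel
results can differ even when every individual task result is correct.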

Thank you again for your time and assistance!

template <class _RandomAccessIterator, class _Value, typename _RealBody, typename _Reduction>
_Value
__parallel_reduce_body(_RandomAccessIterator __first, _RandomAccessIterator __last, _Value __identity,
                       _RealBody __real_body, _Reduction __reduction)
{
    std::size_t __item_count = __last - __first;
    std::size_t __head_items = (__item_count / __default_chunk_size) * __default_chunk_size;

    // We should encapsulate a result value and a reduction operator since we
    // cannot use a lambda in OpenMP UDR.
    using _CombinerType = __pstl::__internal::_Combiner<_Value, _Reduction>;
    _CombinerType __result{__identity, &__reduction};
    _PSTL_PRAGMA_DECLARE_REDUCTION(__combiner, _CombinerType)

    // To avoid over-subscription we use taskloop for the nested parallelism
    //_PSTL_PRAGMA(omp taskloop reduction(__combiner : __result))
    for (std::size_t __i = 0; __i < __item_count; __i += __default_chunk_size)
    {
        auto __begin = __first + __i;
        auto __end = __i < __head_items ? __begin + __default_chunk_size : __last;
        __result.__value = __real_body(__begin, __end, __identity);
    }

    return __result.__value;
}
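
For comparison, with the taskloop pragma commented out as posted, each
iteration of the loop above assigns to __result.__value, so only the last
chunk's value is returned. A serial chunked loop that instead folds every
chunk's partial result into an accumulator might look like the sketch below.
The name chunked_reduce and the chunk_size parameter are hypothetical
stand-ins (the value of __default_chunk_size is not shown in this message),
and this is only a reference for comparison, not the backend's
implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical serial reference: reduce [first, last) chunk by chunk,
// merging each chunk's partial result into the accumulator in order.
template <class RandomIt, class Value, class RealBody, class Reduction>
Value chunked_reduce(RandomIt first, RandomIt last, Value identity,
                     RealBody real_body, Reduction reduction, std::size_t chunk_size)
{
    using Diff = typename std::iterator_traits<RandomIt>::difference_type;
    assert(chunk_size > 0); // a zero chunk size would never advance
    std::size_t item_count = static_cast<std::size_t>(last - first);
    Value result = identity;
    for (std::size_t i = 0; i < item_count; i += chunk_size)
    {
        // The final chunk may be shorter than chunk_size.
        std::size_t len = std::min(chunk_size, item_count - i);
        RandomIt begin = first + static_cast<Diff>(i);
        RandomIt end = begin + static_cast<Diff>(len);
        // Combine rather than overwrite, keeping earlier chunks on the left.
        result = reduction(result, real_body(begin, end, identity));
    }
    return result;
}

// Usage: sum 1..10 in chunks of 3 (the last chunk holds one element).
int example_chunked_sum()
{
    using It = std::vector<int>::iterator;
    std::vector<int> v(10);
    std::iota(v.begin(), v.end(), 1);
    auto body = [](It b, It e, int init) { return std::accumulate(b, e, init); };
    auto plus = [](int a, int b) { return a + b; };
    int total = chunked_reduce(v.begin(), v.end(), 0, body, plus, std::size_t{3});
    assert(total == std::accumulate(v.begin(), v.end(), 0));
    return total;
}
```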