[libcxx-dev] OpenMP parallel reduce bugs
Christopher Nelson via libcxx-dev
libcxx-dev at lists.llvm.org
Wed Sep 30 16:46:08 PDT 2020
Hello friends,
I have been working on the OpenMP backend for the parallel STL, and most of
the tests are passing. However, among the failures is the "is_partitioned"
test. I have rewritten the __parallel_reduce backend function to be simpler
to understand in an attempt to understand what is failing (code is below.)
I also rewrote it as a serial function that splits the iteration range in
two and then calls __reduction() on each half of the range being passed in.
The result I get from the serial execution as compared to the result I get
from the parallel execution is different.
I have verified that the parallel execution tasks are run, and that their
results match what each serial execution would be if I ran them that way.
I am wondering if there is something wrong with the way OpenMP is running
the reducer here? Perhaps it is injecting a value into the computation that
is unexpected for this algorithm? Does anything jump out at anyone as being
suspicious?
Thank you again for your time and assistance!
template <class _RandomAccessIterator, class _Value, typename
_RealBody, typename _Reduction>
_Value
__parallel_reduce_body(_RandomAccessIterator __first,
_RandomAccessIterator __last, _Value __identity,
_RealBody __real_body, _Reduction __reduction)
{
std::size_t __item_count = __last - __first;
std::size_t __head_items = (__item_count / __default_chunk_size) *
__default_chunk_size;
// We should encapsulate a result value and a reduction operator since we
// cannot use a lambda in OpenMP UDR.
using _CombinerType = __pstl::__internal::_Combiner<_Value, _Reduction>;
_CombinerType __result{__identity, &__reduction};
_PSTL_PRAGMA_DECLARE_REDUCTION(__combiner, _CombinerType)
// To avoid over-subscription we use taskloop for the nested parallelism
//_PSTL_PRAGMA(omp taskloop reduction(__combiner : __result))
for (std::size_t __i = 0; __i < __item_count; __i += __default_chunk_size)
{
auto __begin = __first + __i;
auto __end = __i < __head_items ? __begin +
__default_chunk_size : __last;
__result.__value = __real_body(__begin, __end, __identity);
}
return __result.__value;
}
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/libcxx-dev/attachments/20200930/d30bdd22/attachment.html>
More information about the libcxx-dev
mailing list