[libcxx-commits] [PATCH] D99836: [pstl] Implement OpenMP backend

Fri Oct 1 08:27:41 PDT 2021

nadiasvertex marked an inline comment as done.
nadiasvertex added inline comments.

================
Comment at: pstl/include/pstl/internal/omp/parallel_merge.h:74
+{
+    std::size_t __size_x = std::distance(__xs, __xe);
+    std::size_t __size_y = std::distance(__ys, __ye);
----------------
rarutyun wrote:
> MikeDvorskiy wrote:
> > The "parallel merge" pattern accepts  just random access iterator types. So, we can write here just difference operator, everywhere in the patch.
> > 
> > Probably, it makes sense to add an assert (in the begin of the function) on random "access iterator type".
> I would even say we **should** write `__xe - __xs` because `std::distance` works with any iterator type. Since we expect random access iterator only to be there we might hide the error with the code as it is. In case of using `operator-` explicitly we get the compile-time error immediately if this very operator is not defined. And one more thing. Using additional abstraction layers in places where we don't need them (in this case slightly, but) increases compilation time.
Fixed.

================
Comment at: pstl/include/pstl/internal/omp/parallel_stable_sort.h:25
+{
+template <typename Iterator1, typename Iterator2>
+struct __move_value
----------------
rarutyun wrote:
> Why did you prefer to make the whole class template rather than making `operator()` with template parameters?
No reason. Happy to change it.

================
Comment at: pstl/include/pstl/internal/omp/parallel_stable_sort.h:101
+        _VecType __output(__size);
+        _MoveValueType __move_value;
+        _MoveRangeType __move_range;
----------------
rarutyun wrote:
> `__move_value` is named as the type (not the alias) that decreases readability. Wouldn't it be better to rename the variable (probably `__move_value_func`)? The same is true for ``__move_range``
I'm not sure why you think adding a _func suffix makes this clearer. I'm not opposed to it, but most functors in the STL aren't suffixed with "_func" and that's not confusing.

================
Comment at: pstl/include/pstl/internal/omp/parallel_stable_sort.h:118
+void
+__parallel_stable_sort(_ExecutionPolicy&& __exec, _RandomAccessIterator __xs, _RandomAccessIterator __xe,
+                       _Compare __comp, _LeafSort __leaf_sort, std::size_t __nsort = 0)
----------------
rarutyun wrote:
> I don't need `__exec` as a name, do you? I personally like the style with comments (`/*__exec */`) but it's up to you
I agree. This was an accident.

================
Comment at: pstl/include/pstl/internal/omp/parallel_transform_reduce.h:37
+{
+    using _Size = std::size_t;
+    const _Size __num_threads = omp_get_num_threads();
----------------
rarutyun wrote:
> Don't see the reason why we need to add such an alias. In some places you use that in others - you don't
Isn't calling omp_get_num_threads() slower than reading a variable?

================
Comment at: pstl/include/pstl/internal/omp/parallel_transform_reduce.h:50
+
+    // Here, we cannot use OpenMP UDR because we must store the init value in
+    // the combiner and it will be used several times. Although there should be
----------------
MikeDvorskiy wrote:
> Actually, we cannot use OpenMP UDR because it requires an identical value to initialize each accumulator. "__init"  parameter is initial value of reduction, not an "identity".
> So, the comment should be corrected.
> 
This is fixed.

================
Comment at: pstl/include/pstl/internal/omp/util.h:60
+template <typename _Tp>
+class __buffer
+{
----------------
rarutyun wrote:
> MikeDvorskiy wrote:
> > I believe the call class __buffer can be shared between the all host backend.
> > But, I think, for now is OK. It may be done later, in the one of next  patch.
> I think it should not be a big deal to create the header with the `__buffer` class within and include it to both TBB backend and OpenMP backend. Otherwise we will commit relatively big code duplication 
I made this change for the oneDPL commit, but it was deferred. I can make this change here if that is the consensus.

================
Comment at: pstl/include/pstl/internal/omp/util.h:83
+// Preliminary size of each chunk: requires further discussion
+constexpr std::size_t __default_chunk_size = 512;
+
----------------
rarutyun wrote:
> I am not sure if we need this variable basing on Mikhail's comments (probably, it's still useful for some algorithms) but if it remains in the code please mark it `inline`.
Many algorithms chunk up the work. If we don't have a variable we don't know how big to make the chunks. I'm happy to make it inline. Also, my testing shows that it should probably be 2048.

================
Comment at: pstl/include/pstl/internal/omp/util.h:88
+constexpr auto
+__should_run_serial(_Iterator __first, _Iterator __last) -> bool
+{
----------------
rarutyun wrote:
> Is there any reason to use trailing return types? As I remember correctly you use those only in this file.
I prefer them. If the preference in PSTL is to not, I can change it.

================
Comment at: pstl/include/pstl/internal/omp/util.h:99
+{
+    using _difference_type = ::std::intptr_t;
+    auto __size = __last - __first;
----------------
rarutyun wrote:
> Why do you use `std::intptr_t`? it's a type to represent the pointer as the signed value. If you already check (with `enable_if`) that your type is integral and you have two same types (by function signature) you may case directly to `Index`, may not you?
Good point! I will fix that!

================
Comment at: pstl/include/pstl/internal/omp/util.h:104
+
+struct __chunk_policy
+{
----------------
rarutyun wrote:
> `chuck_policy` (a least to me) is a very confusing name in the context of Parallel STL. I had an impression that use have some special execution policy or you wrap `ExecutionPolicy` object with this function or something like that. Cannot propose better from the top of my head. Need to think a little bit about that.
__chunk_policy is intended to refer to the information needed to apply the chunking algorithm. We could call it __chunk_boundaries or __chunk_partition_details.

================
Comment at: pstl/include/pstl/internal/omp/util.h:106
+{
+    std::size_t __n_chunks{0}, __chunk_size{0}, __first_chunk_size{0};
+};
----------------
rarutyun wrote:
> rarutyun wrote:
> > I would recommend to split those variable to make each in its line for the sake of readability.
> If I don't miss something you always construct `chunk_policy` object with 3 arguments. Why do you need field initialization? I would prefer to leave this struct `trivially_default_constructible` unless we have strong reasons not to do that.
I think that the compiler complains if I don't initialize the values. I can remove it and see. It might have been an artifact from other refactoring.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99836/new/

https://reviews.llvm.org/D99836